Performance comparison of “patternnet” and “newff” for binary classification in MATLAB R2014a

classification, MATLAB, neural networks, sensitivity-specificity

I have a binary classification problem involving financial ratios and variables. When I use newff (with trainlm, mse, and an output threshold of 0.5) I get high classification accuracy (5-fold cross-validation, roughly 89-92%), but when I use patternnet (trainscg with crossentropy) my accuracy is 10% lower than with newff. (I normalized the data before feeding it to the network, using mapminmax or mapstd.)

When I apply these models to out-of-sample data (the current year; the models were built on data sets from previous years), patternnet gives better classification accuracy, with better sensitivity and specificity. For example, I get these results for my problem:

Newff:

Accuracy: 92.8% sensitivity: 94.08% specificity: 91.62%

Out-of-sample results: accuracy: 60% sensitivity: 48% specificity: 65.57%

Patternnet:

Accuracy: 73.31% sensitivity: 69.85% specificity: 76.77%

Out-of-sample results: accuracy: 70% sensitivity: 62.79% specificity: 73.77%

Why are there such differences between newff and patternnet? Which model should I use?

Thanks.

Best Answer

At face value I would recommend patternnet, as it gives you better out-of-sample performance; the in-sample results from newff seem suspiciously good, which leads me to believe that some over-fitting is occurring. On that matter, check the following link: Improve Neural Network Generalization and Avoid Overfitting.
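To make the over-fitting suspicion concrete, the drop from in-sample to out-of-sample performance can be computed directly from the figures reported in the question. The following is a minimal MATLAB sketch; the variable names are made up for illustration, and it assumes the first line of each result set is the in-sample figure.

```matlab
% Accuracy, sensitivity, specificity (in percent) as reported in the question.
% Row 1: in-sample (assumed), row 2: out-of-sample.
newffRes      = [92.80 94.08 91.62;
                 60.00 48.00 65.57];
patternnetRes = [73.31 69.85 76.77;
                 70.00 62.79 73.77];

% Drop from in-sample to out-of-sample, in percentage points.
newffGap      = newffRes(1, :)      - newffRes(2, :);
patternnetGap = patternnetRes(1, :) - patternnetRes(2, :);

fprintf('newff gap:      %6.2f %6.2f %6.2f\n', newffGap);
fprintf('patternnet gap: %6.2f %6.2f %6.2f\n', patternnetGap);
```

With these numbers newff loses about 33 points of accuracy (and about 46 points of sensitivity) out of sample, while patternnet loses about 3 points of accuracy. A gap that large is the usual signature of a model that has fit the training data too closely.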

To comment on the different results: newff uses Levenberg-Marquardt backpropagation, while patternnet uses scaled conjugate gradient backpropagation. In general, different optimization procedures are not guaranteed to arrive at the same result even if they have the same target function to optimize. In your case, though, you are also using different target functions (mse and crossentropy respectively). It would probably be alarming if you did get the same results, as you are fitting different criteria. :)

Having said that, using newff seems a bit odd. It has been considered obsolete since R2010b, and the docs recommend using feedforwardnet instead. Try feedforwardnet first and then decide which procedure you will ultimately use. As it stands, you are comparing the performance of a function (newff) that people have not worked on for at least 4 years (if not more) against the performance of a function (patternnet) that is actively developed. It is not really surprising that the latter does a better job.
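A like-for-like comparison along those lines might look roughly like the sketch below. It is only an illustration under stated assumptions: the data are random placeholders, the hidden layer size of 10 is arbitrary, and the shared train/validation/test split is my own choice so that both networks are scored on the same held-out samples. It needs the Neural Network Toolbox.

```matlab
% Hypothetical placeholder data: 10 features x 200 samples, binary labels.
rng(0);                                   % reproducible data, split and weights
x = randn(10, 200);
t = double(sum(x(1:3, :), 1) > 0);        % 1 x 200 row of 0/1 labels
tOneHot = [t; 1 - t];                     % 2-row targets for patternnet

% One shared split so both networks see the same held-out samples.
[trainInd, valInd, testInd] = dividerand(200, 0.7, 0.15, 0.15);

hiddenSize = 10;                          % assumed; tune for the real problem

% feedforwardnet (successor of newff) with the settings from the question.
netFF = feedforwardnet(hiddenSize, 'trainlm');
netFF.performFcn = 'mse';

% patternnet with the settings from the question.
netPN = patternnet(hiddenSize, 'trainscg', 'crossentropy');

% Apply the shared split and suppress the training window.
netFF.divideFcn = 'divideind';
netFF.divideParam.trainInd = trainInd;
netFF.divideParam.valInd   = valInd;
netFF.divideParam.testInd  = testInd;
netFF.trainParam.showWindow = false;

netPN.divideFcn = 'divideind';
netPN.divideParam.trainInd = trainInd;
netPN.divideParam.valInd   = valInd;
netPN.divideParam.testInd  = testInd;
netPN.trainParam.showWindow = false;

netFF = train(netFF, x, t);
netPN = train(netPN, x, tOneHot);

% Threshold at 0.5 and score only the held-out test samples.
predFF = netFF(x(:, testInd)) > 0.5;
outPN  = netPN(x(:, testInd));
predPN = outPN(1, :) > 0.5;               % row 1 is the score for label 1

accFF = mean(predFF == t(testInd));
accPN = mean(predPN == t(testInd));
fprintf('feedforwardnet test accuracy: %.3f\n', accFF);
fprintf('patternnet test accuracy:     %.3f\n', accPN);
```

For the real problem, the placeholder data would be replaced by the normalized financial ratios, and the test indices by the out-of-sample year, since that is the comparison that matters here.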
