Solved – How to choose a model when training accuracy is lower than validation accuracy while training a neural network

machine learning, modeling, neural networks, overfitting, predictive-models

Below is a specific case of a general situation I often run into at work. The aim of this question is to get ideas on how to pick the best model.

Dataset:

rows: 10,166, features: 1,692
Model used: feed-forward neural network

Logs for a particular fold:

epoch: 0, train loss : 0.585782928946,  train score: 0.497655085604, val score: 0.610834066686 
training  with 32 sized batches
best validation score till now: 0.0

epoch: 1, train loss : 0.350900779755,  train score: 0.509627315655, val score: 0.625354453897 
training  with 32 sized batches
best validation score till now: 0.0

epoch: 2, train loss : 0.249989495512,  train score: 0.528054031557, val score: 0.646059450474 
training  with 32 sized batches
best validation score till now: 0.0

epoch: 3, train loss : 0.210879948519,  train score: 0.553915714449, val score: 0.65603304977 
training  with 32 sized batches
best validation score till now: 0.0

epoch: 4, train loss : 0.196106336583,  train score: 0.578792288474, val score: 0.644788305466 
training  with 32 sized batches
best validation score till now: 0.0

epoch: 5, train loss : 0.189342178014,  train score: 0.601882261054, val score: 0.640632639093 
training  with 32 sized batches
best validation score till now: 0.0

epoch: 6, train loss : 0.18660737301,  train score: 0.616972833149, val score: 0.640021511685 
training  with 32 sized batches
new best validation score: 0.629363449692
saving model...

epoch: 7, train loss : 0.183136423458,  train score: 0.630294120457, val score: 0.629363449692 
training  with 32 sized batches
best validation score till now: 0.629363449692

epoch: 8, train loss : 0.180001894893,  train score: 0.641283331582, val score: 0.611542974479 
training  with 32 sized batches
best validation score till now: 0.629363449692

epoch: 9, train loss : 0.179555135817,  train score: 0.646623580034, val score: 0.606971741469 
training  with 32 sized batches
best validation score till now: 0.629363449692

epoch: 10, train loss : 0.17941892064,  train score: 0.650499675163, val score: 0.603036080962

The trend of increasing training score and decreasing validation score continues in later epochs. My question is which model (that is, the model after which epoch) should be selected. Currently I use early stopping: training stops once the validation score has not improved for more than 15 epochs, and an epoch only counts as a candidate if its training score is greater than its validation score. Here, that rule selects the model after epoch 7.
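For reference, the selection rule I describe can be sketched roughly as follows. This is a simplified illustration, not my actual training code: the function name `select_epoch` and its parameters are made up for this example, and the scores are copied from the logs above, rounded to four decimals.

```python
# Scores copied from the logs above, rounded to four decimals.
train_scores = [0.4977, 0.5096, 0.5281, 0.5539, 0.5788, 0.6019,
                0.6170, 0.6303, 0.6413, 0.6466, 0.6505]
val_scores = [0.6108, 0.6254, 0.6461, 0.6560, 0.6448, 0.6406,
              0.6400, 0.6294, 0.6115, 0.6070, 0.6030]


def select_epoch(train_scores, val_scores, patience=15,
                 require_train_above_val=True):
    """Return (best_epoch, best_val_score) under the rule described above.

    An epoch is a candidate only if its training score exceeds its
    validation score (when require_train_above_val is True). Scanning
    stops once more than `patience` epochs pass without a new best.
    """
    best_epoch, best_val = None, float("-inf")
    epochs_since_best = 0
    for epoch, (train, val) in enumerate(zip(train_scores, val_scores)):
        eligible = train > val or not require_train_above_val
        if eligible and val > best_val:
            # New best candidate: this is where a checkpoint would be saved.
            best_epoch, best_val = epoch, val
            epochs_since_best = 0
        elif best_epoch is not None:
            epochs_since_best += 1
            if epochs_since_best > patience:
                break  # no improvement for more than `patience` epochs
    return best_epoch, best_val


print(select_epoch(train_scores, val_scores))
print(select_epoch(train_scores, val_scores, require_train_above_val=False))
```

With the logged scores, the rule picks epoch 7. Dropping the "training score above validation score" condition picks epoch 3 instead, which is the epoch with the highest validation score in this fold.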

Best Answer

Note that a general problem with feedforward neural networks is overfitting: the score on the training set keeps improving while performance on the validation and test sets gets worse. Early stopping is one way to counter this. The approach I usually use comes from this chapter of a book. Several stopping criteria are possible; I recommend reading the chapter for a good overview.
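As one concrete example of a stopping criterion other than a fixed patience window, the sketch below stops as soon as the validation score has fallen by more than a chosen relative margin below the best value seen so far. It is an illustration only and not necessarily one of the criteria from the chapter mentioned above; the function name `stop_on_relative_drop` and the 2% threshold are assumptions, and the scores are the validation scores from the question's logs, rounded to four decimals.

```python
# Validation scores from the question's logs, rounded to four decimals.
val_scores = [0.6108, 0.6254, 0.6461, 0.6560, 0.6448, 0.6406,
              0.6400, 0.6294, 0.6115, 0.6070, 0.6030]


def stop_on_relative_drop(val_scores, threshold=0.02):
    """Return (stop_epoch, best_epoch).

    Training would stop at the first epoch whose validation score is more
    than `threshold` (as a fraction of the best score so far) below that
    best score. stop_epoch is None if this never happens.
    Assumes higher scores are better and all scores are positive.
    """
    best_epoch, best_val = 0, val_scores[0]
    for epoch, val in enumerate(val_scores):
        if val > best_val:
            best_epoch, best_val = epoch, val
        elif (best_val - val) / best_val > threshold:
            return epoch, best_epoch
    return None, best_epoch


stop_epoch, best_epoch = stop_on_relative_drop(val_scores)
print(f"stop at epoch {stop_epoch}, keep the model from epoch {best_epoch}")
```

On these scores the criterion stops at epoch 5 and keeps the model from epoch 3. A smaller threshold stops earlier but is more sensitive to noise in the validation score; a larger one trains longer before giving up.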