Solved – Neural network training without early stopping

cross-validation, neural-networks

I was researching k-fold cross-validation and read that one should train on k-1 of the k partitions, test on the remaining partition, and repeat for each partition, averaging the results to estimate model performance. This I understand. However, if there is no validation dataset, when should I stop training? Early stopping is not possible in this setting, so the model cannot simply be trained until generalisation performance starts to worsen. Should training be stopped after a set number of epochs, or when the gradient falls below a certain limit? Are there any tips for what these stopping parameters should be?

Best Answer

If there is no validation set, make one: hold out a few samples from the training fold and use them for early stopping.
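A minimal sketch of that idea, using a plain linear model trained by gradient descent so it stays self-contained (the hold-out logic is identical for a neural network; the split fraction, patience, and learning rate below are illustrative choices, not recommendations):

```python
import numpy as np

def train_with_early_stopping(X_train, y_train, val_frac=0.2, patience=10,
                              lr=0.1, max_epochs=1000, seed=0):
    """Carve an inner validation split out of the training fold and stop
    when validation loss stops improving for `patience` epochs.
    Linear model + MSE as a stand-in for a neural network."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_train))
    n_val = int(len(X_train) * val_frac)
    val_idx, tr_idx = idx[:n_val], idx[n_val:]
    X_tr, y_tr = X_train[tr_idx], y_train[tr_idx]
    X_val, y_val = X_train[val_idx], y_train[val_idx]

    w = np.zeros(X_train.shape[1])
    best_w, best_loss, bad_epochs = w.copy(), np.inf, 0
    for epoch in range(max_epochs):
        # one gradient-descent step on the inner training split
        grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(X_tr)
        w -= lr * grad
        # monitor the held-out split, keep the best weights seen so far
        val_loss = np.mean((X_val @ w - y_val) ** 2)
        if val_loss < best_loss - 1e-8:
            best_loss, best_w, bad_epochs = val_loss, w.copy(), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # early stopping
                break
    return best_w, best_loss
```

Note that this shrinks the effective training set of each fold slightly, which is the price paid for recovering an early-stopping signal inside cross-validation.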

Other options are:

  1. Train until the training error converges. If you have enough data and the model is regularised, overfitting is avoided and this becomes a reliable stopping criterion.
  2. Look up the "Optimized Approximation Algorithm" paper; it describes a method for deciding when to stop training by analysing the signal-to-noise ratio of the training error. I have no practical experience with the method, though, so unfortunately I can't tell you how effective it is.
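Option 1 above can be sketched as follows, again with a linear stand-in model: run gradient descent on the full training fold and stop once the per-epoch drop in training loss, or the gradient norm, falls below a tolerance (the `tol` and `grad_tol` values here are illustrative, not recommendations):

```python
import numpy as np

def train_until_converged(X, y, lr=0.1, tol=1e-6, grad_tol=1e-4,
                          max_epochs=10000):
    """Stop when the training loss stops decreasing meaningfully
    (change < tol) or the gradient norm drops below grad_tol."""
    w = np.zeros(X.shape[1])
    prev_loss = np.inf
    for epoch in range(max_epochs):
        resid = X @ w - y
        loss = np.mean(resid ** 2)
        grad = 2 * X.T @ resid / len(X)
        # convergence test: tiny loss improvement or near-zero gradient
        if prev_loss - loss < tol or np.linalg.norm(grad) < grad_tol:
            break
        w -= lr * grad
        prev_loss = loss
    return w, epoch
```

This is the gradient-threshold / convergence variant the question asks about; it is only trustworthy when regularisation (or ample data) keeps the converged model from overfitting.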