I was researching k-fold cross-validation and read that one should train on k-1 of the k partitions, test on the remaining partition, and repeat for each partition, averaging the results to get an estimate of model performance. This much I understand.

However, if there is no validation dataset, when should I stop training? Early stopping is not possible, so the model cannot simply be trained until its generalisation performance starts to worsen. Should training be stopped after a set number of epochs, or when the gradient falls below a certain threshold? Are there any tips for choosing these stopping parameters?
Solved – Neural network training without early stopping
cross-validation, neural-networks
Best Answer
If there is no validation set, make one: hold a few samples out of each training fold and use them for early stopping.
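As a minimal sketch of this idea (the toy data, model, learning rate, and patience value are illustrative assumptions, not from the answer): within each cross-validation fold, a small inner validation set is carved out of the training fold, and training stops once the inner validation loss stops improving.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data, assumed for illustration only
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)

k = 5
indices = rng.permutation(len(X))
folds = np.array_split(indices, k)

test_errors = []
for i in range(k):
    test_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])

    # Carve an inner validation set out of the training fold
    # so early stopping is still possible
    n_val = len(train_idx) // 5
    val_idx, fit_idx = train_idx[:n_val], train_idx[n_val:]

    w = np.zeros(3)
    best_w, best_val, patience = w.copy(), np.inf, 0
    for epoch in range(500):
        # One gradient-descent step on the remaining training samples
        grad = 2 * X[fit_idx].T @ (X[fit_idx] @ w - y[fit_idx]) / len(fit_idx)
        w -= 0.05 * grad
        val_loss = mse(w, X[val_idx], y[val_idx])
        if val_loss < best_val - 1e-6:
            best_val, best_w, patience = val_loss, w.copy(), 0
        else:
            patience += 1
            if patience >= 10:  # stop once validation loss stops improving
                break
    # Evaluate the early-stopped model on the held-out test fold
    test_errors.append(mse(best_w, X[test_idx], y[test_idx]))

print(f"mean CV test MSE: {np.mean(test_errors):.4f}")
```

The k-fold test estimate then reflects models trained with early stopping, at the cost of fitting on slightly less data in each fold.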
Other options are: