Machine Learning – Comparing Early Stopping vs Cross Validation to Prevent Overfitting

Tags: cross-validation, overfitting

I'm currently using early stopping in my work to prevent overfitting, specifically the criteria taken from Early Stopping – But When?.

I now want to compare against other classification algorithms, where it appears that 10-fold cross validation is widely used.

However, I'm confused about whether cross validation is a method for preventing overfitting or for selecting good parameters (or maybe these are one and the same?). I'm also confused about whether early stopping and cross validation can be used in place of one another or in combination.

So the question is: what is the relationship between early stopping and cross validation?

Best Answer

Cross validation is a method for estimating the generalisation accuracy of a supervised learning algorithm.
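
As a minimal sketch of that idea, assuming scikit-learn (the dataset and classifier below are only illustrative), 10-fold cross validation splits the data into ten folds, trains on nine, scores on the held-out tenth, and averages the ten scores:

```python
# Estimating generalisation accuracy with 10-fold cross validation.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Each fold trains on 9/10 of the data and scores the held-out 1/10;
# the mean over folds estimates accuracy on unseen data.
scores = cross_val_score(clf, X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```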

Early stopping is a method for avoiding overfitting; it requires some way of assessing the relationship between the generalisation accuracy of the learned model and its training accuracy.
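
A minimal early-stopping sketch, assuming an incrementally trainable scikit-learn model and a single held-out validation set (the dataset, model, and patience value are illustrative, not taken from the paper):

```python
# Early stopping: halt training when validation accuracy stops improving.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Scale features so stochastic gradient descent behaves sensibly.
scaler = StandardScaler().fit(X_train)
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)

model = SGDClassifier(random_state=0)
classes = np.unique(y_train)

best_score, patience, strikes = -np.inf, 5, 0
for epoch in range(200):
    # One pass over the training data.
    model.partial_fit(X_train, y_train, classes=classes)
    # Validation accuracy stands in for generalisation accuracy.
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, strikes = score, 0
    else:
        strikes += 1
        if strikes >= patience:
            print(f"Stopping early at epoch {epoch}, "
                  f"best validation accuracy {best_score:.3f}")
            break
```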

So you could use cross validation in place of the single validation set, mentioned in the paper you cite, within an early stopping framework. Ten-fold cross validation, for instance, would normally give a more accurate estimate of generalisation error than a single validation set.
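
One way to combine the two, sketched below under illustrative choices (gradient boosting, where each boosting iteration plays the role of a training epoch): average the validation curve over the folds and stop at the iteration where the averaged estimate peaks, rather than trusting a single validation split.

```python
# Choosing the stopping point via 10-fold CV instead of one validation set.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import KFold

X, y = load_breast_cancer(return_X_y=True)
fold_curves = []

for train_idx, val_idx in KFold(n_splits=10, shuffle=True,
                                random_state=0).split(X):
    gbc = GradientBoostingClassifier(n_estimators=200, random_state=0)
    gbc.fit(X[train_idx], y[train_idx])
    # staged_predict yields predictions after each boosting iteration,
    # giving this fold's validation curve over training "time".
    curve = [np.mean(pred == y[val_idx])
             for pred in gbc.staged_predict(X[val_idx])]
    fold_curves.append(curve)

# Average the curves across folds and stop where the mean peaks.
mean_curve = np.mean(fold_curves, axis=0)
best_iter = int(np.argmax(mean_curve)) + 1
print(f"Best number of iterations by 10-fold CV: {best_iter}")
```

The price of the better estimate is cost: the model is trained once per fold, so ten-fold cross validation is roughly ten times as expensive as a single validation set.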

So to summarise, cross validation is a generalisation accuracy measure which could be used as part of an early stopping framework.