Regularization and cross-validation are two of the most important techniques for preventing overfitting, but it's not clear to me when one should be used over the other, or when both should be used together.

So my questions are:

(1) When would you use regularization over CV?

(2) When would you use CV over regularization?

(3) When would you use CV and regularization together?

(4) When would you use some other method to prevent overfitting?

## Best Answer

This was clarified in a comment:

Not exactly. For example, imagine that your data has $n$ samples and $p$ features with $n \ll p$. In such a case, fitting a multivariate regression model using all the features would overfit, so you need a model with fewer features. You could compare all the possible models, i.e. all the possible combinations of the $p$ features, and use cross-validation to pick the best one, but this would likely be time-consuming. Regularization does this in a single step by penalizing overly complex models. One example is LASSO regression, which pushes the regression coefficients of the "unnecessary" variables to zero (so, technically, removes them). With regularized multivariate regression you need to fit only one model instead of $2^p$ models, which is a *much* faster solution. Also, a regularized model can still use *all* the features (the parameters do not need to be shrunk exactly to zero) rather than selecting only the best ones; such a model is not possible when doing feature selection alone. Additionally, you can check this thread for learning why this is not that simple.

More generally, cross-validation and regularization serve different tasks. Cross-validation is about choosing the "best" model, where "best" is defined in terms of test set performance. Regularization is about *simplifying* the model. They could, but do not have to, result in similar solutions. Moreover, to check whether the regularized model works better than the unregularized one, you would still need cross-validation.
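To make this concrete, here is a minimal sketch using scikit-learn (the data, the `alpha=0.1` penalty, and the simulated coefficients are all assumptions for illustration). It fits one LASSO model on $n \ll p$ data, counts how many coefficients are pushed to zero, and then uses cross-validation to check whether the regularized model actually generalizes better than plain least squares:

```python
# Sketch: LASSO as one-step feature selection vs. 2^p subset search,
# with cross-validation as the judge of generalization (assumed setup).
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, p = 50, 200                    # far more features than samples
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]   # only 5 features actually matter
y = X @ beta + rng.normal(scale=0.5, size=n)

# Unregularized least squares with p > n interpolates the training data.
ols = LinearRegression().fit(X, y)

# LASSO zeroes out most coefficients in a single fit -- implicit feature
# selection without searching the 2^p subsets.
lasso = Lasso(alpha=0.1).fit(X, y)
n_nonzero = int(np.sum(lasso.coef_ != 0))

# Cross-validation is still what tells us which model works better.
ols_cv = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
lasso_cv = cross_val_score(Lasso(alpha=0.1), X, y, cv=5, scoring="r2").mean()
print(f"nonzero LASSO coefficients: {n_nonzero} / {p}")
print(f"mean CV R^2: OLS={ols_cv:.2f}, LASSO={lasso_cv:.2f}")
```

On data like this, most LASSO coefficients come out exactly zero while OLS overfits badly, and the cross-validated scores make that comparison visible; in practice you would also tune `alpha` itself by cross-validation (e.g. `LassoCV`).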