Solved – Cross Validation for Ridge Regression

cross-validation, regression, ridge regression

I'm using ridge regression to calculate optimal weights for a set of scores. These scores are correlated, so ridge regression is used to penalize large weight values. The purpose of ridge regression is therefore to find the beta that minimizes the following:

$$
\sum_i{(y_i - x^T_i\beta)^2} + \lambda \sum_j{\beta^2_j}
$$
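(For a fixed lambda, this objective has a closed-form minimizer; below is a minimal NumPy sketch of it, where `X`, `y`, and the function name are purely illustrative choices.)

```python
# A minimal sketch of the ridge minimizer for a fixed lambda, assuming a
# design matrix X of shape (n, p) and a target vector y of shape (n,).
# Names are illustrative only.
import numpy as np

def ridge_coefficients(X, y, lam):
    """Return beta minimizing ||y - X @ beta||^2 + lam * ||beta||^2."""
    p = X.shape[1]
    # Closed form: beta = (X^T X + lam * I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```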

My question is: how do I choose an optimal value for lambda using cross-validation?
I'm having trouble grasping this conceptually. In classification, cross-validation is straightforward: split the data into k folds, train on k-1 folds, predict on the held-out fold, and average the prediction error over all folds. How does this work for regression? I can measure the sum of squared errors on each held-out fold, but that measure is sensitive to noisy outliers, and the reason for using ridge regression instead of standard regression in the first place was not to minimize exactly this quantity.
I looked at the following article, but I still don't understand the general approach of using cross-validation to choose an optimal ridge regression model.

Best Answer

The concept is the same: you need a loss function that is minimized on the held-out folds.
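Here is a minimal sketch of that loop for ridge regression, assuming NumPy arrays X (n x p) and y (n,); the fold count, seed, and lambda grid are illustrative choices, not recommendations:

```python
# k-fold cross-validation for choosing lambda in ridge regression.
import numpy as np

def squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def cv_score(X, y, lam, loss=squared_error, k=5, seed=0):
    """Average held-out loss of a ridge fit over k folds."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    folds = np.array_split(rng.permutation(n), k)
    scores = []
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)
        # Closed-form ridge fit on the training folds only.
        beta = np.linalg.solve(
            X[train].T @ X[train] + lam * np.eye(p),
            X[train].T @ y[train],
        )
        scores.append(loss(y[fold], X[fold] @ beta))
    return np.mean(scores)

# Pick the lambda with the smallest cross-validated loss:
# lambdas = np.logspace(-3, 3, 13)
# best_lam = min(lambdas, key=lambda lam: cv_score(X, y, lam))
```

So the procedure mirrors the classification case exactly: fit on k-1 folds, score the held-out fold, average, and repeat for each candidate lambda.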

Yes, MSE reacts to outliers. But you can use any other loss function that suits your needs better: for example, mean absolute error puts less weight on grossly wrong predictions. There are many other possibilities; see e.g. The Elements of Statistical Learning.
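For instance, in the hypothetical cv_score sketch above, changing the criterion only means passing a different loss:

```python
# Mean absolute error puts less weight on grossly wrong predictions than
# squared error does; plug it into the CV sketch above as the fold loss.
import numpy as np

def mean_absolute_error(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

# best_lam = min(lambdas, key=lambda lam: cv_score(X, y, lam, loss=mean_absolute_error))
```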

As a side note, classifier optimization is not as straightforward as you state either: the average prediction error is not a proper scoring rule, so minimizing it does not guarantee that you end up with the optimal model.