Solved – When to use and when not to use ridge regression

multicollinearity, regression, ridge regression

What are the 'indications' (i.e. when to use) and 'contra-indications' (i.e. when not to use) of ridge regression? I tried to read up on the net, and it seems to be useful when there is multicollinearity among the predictor variables. But what about other situations / assumptions for ridge regression? Do the predictor variables need to be normally distributed? How does it fare with small data sets? Is there a problem if categorical predictor variables are included? How many predictor variables can be included relative to the number of rows, etc.? Thanks for your insight.

Best Answer

The predictor variables don't need to be normally distributed. Instead, it is the noise term $\epsilon$ that is assumed to be normally distributed in the model $$y=w^{T}x+\epsilon$$ If we fit this model by maximum likelihood, we get ordinary linear regression without regularization. If we additionally assume a zero-mean Gaussian prior on the weights $w$ and use the MAP estimate, we get ridge regression. In either case, the predictor variables need not be normally distributed.
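To make the MAP step concrete, here is a short sketch of the derivation. The noise variance $\sigma^2$, the prior variance $\tau^2$, the sample size $n$ and the penalty weight $\lambda$ are symbols introduced here for illustration only. Assuming $\epsilon \sim \mathcal{N}(0, \sigma^2)$, $w \sim \mathcal{N}(0, \tau^2 I)$ and $n$ independent observations $(x_i, y_i)$:

$$\hat{w}_{\text{MAP}} = \arg\max_{w} \left[ \sum_{i=1}^{n} \log p(y_i \mid x_i, w) + \log p(w) \right] = \arg\min_{w} \left[ \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(y_i - w^{T}x_i\right)^2 + \frac{1}{2\tau^2} \|w\|_2^2 \right]$$

Multiplying through by $2\sigma^2$ gives the usual ridge objective $$\hat{w}_{\text{ridge}} = \arg\min_{w} \left[ \sum_{i=1}^{n} \left(y_i - w^{T}x_i\right)^2 + \lambda \|w\|_2^2 \right], \qquad \lambda = \frac{\sigma^2}{\tau^2}$$ Dropping the prior term (equivalently, letting $\tau^2 \to \infty$ so that $\lambda \to 0$) recovers the maximum likelihood, i.e. least squares, solution. Nothing in this derivation places a distributional assumption on $x$.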

As for when to use ridge regression: in many practical cases it performs at least as well as other regularization methods (for example, the L1 norm). You can refer to the following entry in the LIBLINEAR FAQ by the LIBSVM author:

https://www.csie.ntu.edu.tw/~cjlin/liblinear/FAQ.html#l1_regularized_classification

Therefore I would say: try ridge regression first. You may also want to try L1 regularization (LASSO) or elastic nets when:

  1. You know that some of the features you are including in your model might be irrelevant (i.e., you know that some coefficients in the "true model" are zero)

  2. Your features are not highly correlated with each other

  3. You want to perform feature selection but don't want to use wrapper/filter approaches

The reasons above follow directly from the nature of L1 regularization. The LASSO yields sparse solutions, which can be viewed as built-in feature selection. However, it tends to select only one feature from a group of highly correlated features.
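The contrast between the two penalties can be illustrated with a small synthetic experiment. This is a minimal sketch, not a recommended implementation: the data, the helper names `ridge_fit`, `lasso_fit` and `soft_threshold`, and the penalty values are all made up for illustration, and the second feature is an exact copy of the first, which is the limiting case of "highly correlated". No intercept is fitted, and the two `lam` values are not on a comparable scale because the two objectives are scaled differently.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge: minimize ||y - Xw||^2 + lam * ||w||^2."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for 0.5 * ||y - Xw||^2 + lam * ||w||_1."""
    w = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            # Partial residual: leave feature j out of the current fit.
            r_j = y - X @ w + X[:, j] * w[j]
            w[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return w

# Synthetic data (an assumption for illustration, not from the answer above).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)            # irrelevant feature, unrelated to y
X = np.column_stack([x1, x1, x3])  # column 1 is an exact copy of column 0
y = 2.0 * x1 + 0.1 * rng.normal(size=n)

w_ridge = ridge_fit(X, y, lam=10.0)
w_lasso = lasso_fit(X, y, lam=10.0)
print("ridge:", np.round(w_ridge, 3))
print("lasso:", np.round(w_lasso, 3))
```

In this setup ridge should split the weight evenly between the two identical columns and leave a small nonzero weight on the irrelevant one, while the LASSO puts the weight on one copy and sets the irrelevant coefficient to exactly zero. Which copy the LASSO keeps is arbitrary here: with exact duplicates the solution is not unique, and the update order of the coordinate descent decides. With real, imperfectly correlated features the choice can likewise be unstable across samples.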
