References – Comprehensive Guide to Ridge, Lasso, and Elastic Net Regularization

elastic net, lasso, references, regularization, ridge regression

How do the ridge, LASSO, and elastic net regularization methods compare? What are their respective advantages and disadvantages? Pointers to good technical papers or lecture notes would be appreciated as well.

Best Answer

In The Elements of Statistical Learning, Hastie et al. give a thorough comparison of these shrinkage techniques. The book is available online (pdf). The comparison is in Section 3.4.3, page 69.

The main difference between Lasso and Ridge is the penalty term they use. Ridge uses a (squared) $L_2$ penalty, which shrinks the coefficient vector toward zero but generally leaves every coefficient nonzero. Lasso uses an $L_1$ penalty, which can set some coefficients exactly to zero; the resulting sparsity makes the fitted model more interpretable. Elastic net was introduced as a compromise between these two techniques: its penalty is a mix of the $L_1$ and $L_2$ norms.
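To make the difference concrete, here is a minimal sketch of how each penalty acts on a single coefficient in the simplest setting, an orthonormal design, where each method has a closed form in terms of the least-squares estimate. It assumes one particular scaling of the objective, $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda\left(\alpha\|\beta\|_1 + \frac{1-\alpha}{2}\|\beta\|_2^2\right)$, with $\alpha = 1$ giving lasso and $\alpha = 0$ giving ridge; other texts and libraries scale $\lambda$ differently. The function names and the numbers are made up for illustration and are not taken from the book.

```python
def soft_threshold(z, t):
    """Move z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def ridge_shrink(beta_ols, lam):
    # L2 penalty: every coefficient is scaled by the same factor 1 / (1 + lam).
    return [b / (1.0 + lam) for b in beta_ols]


def lasso_shrink(beta_ols, lam):
    # L1 penalty: every coefficient is soft-thresholded at lam.
    return [soft_threshold(b, lam) for b in beta_ols]


def elastic_net_shrink(beta_ols, lam, alpha):
    # Mixed penalty: soft-threshold at lam * alpha, then rescale by the L2 part.
    scale = 1.0 + lam * (1.0 - alpha)
    return [soft_threshold(b, lam * alpha) / scale for b in beta_ols]


# Hypothetical least-squares coefficients (assumed values, orthonormal X).
beta_ols = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

ridge = ridge_shrink(beta_ols, lam)
lasso = lasso_shrink(beta_ols, lam)
enet = elastic_net_shrink(beta_ols, lam, alpha=0.5)

print("ridge:", ridge)
print("lasso:", lasso)
print("enet: ", enet)
```

With these made-up inputs, ridge keeps all four coefficients nonzero and only rescales them, lasso sets the two smallest ones exactly to zero, and elastic net with $\alpha = 0.5$ zeroes only the smallest while also rescaling the rest. With correlated predictors there is no such closed form, and the fit is usually computed iteratively (for example by coordinate descent).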