Advantages of doing "double lasso" or performing lasso twice

Tags: lars, lasso, regression, regularization

I once heard a method of using the lasso twice (like a double-lasso) where you perform lasso on the original set of variables, say S1, obtain a sparse set called S2, and then perform lasso again on set S2 to obtain set S3. Is there a methodological term for this? Also, what are the advantages of doing lasso twice?

Best Answer

Yes, the procedure you are asking about (or thinking of) is called the relaxed lasso.

The general idea is that the first LASSO pass probably lets some "noise variables" into the model. Running the LASSO a second time, only on the variables that survived the first pass, means the remaining variables compete mostly against other "real" candidates for the model rather than against pure noise. Technically, what this method aims to do is overcome the (known) slow convergence of the LASSO in datasets with a large number of variables.
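To make the two-pass idea concrete, here is a minimal sketch in plain Python. It is not code from the paper: the function names (`lasso_cd`, `two_stage_lasso`), the simulated data, and the values of `lam` and `phi` are all assumptions chosen for illustration. The second pass uses a penalty `phi * lam` with `phi` between 0 and 1, which is how I understand the relaxed lasso to be parameterised; in practice you would pick both values by cross-validation.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    X is a list of rows, y a list of responses. No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def two_stage_lasso(X, y, lam, phi):
    """First pass selects variables; second pass refits on them with penalty phi*lam."""
    beta1 = lasso_cd(X, y, lam)
    selected = [j for j, b in enumerate(beta1) if b != 0.0]
    beta2 = [0.0] * len(beta1)
    if selected:
        X_sel = [[row[j] for j in selected] for row in X]
        for j, b in zip(selected, lasso_cd(X_sel, y, phi * lam)):
            beta2[j] = b
    return beta1, beta2, selected

# Simulated data (an assumption for illustration): 20 predictors,
# only the first two carry signal, the other 18 are pure noise.
random.seed(0)
n, p = 100, 20
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 1.0) for row in X]

beta1, beta2, selected = two_stage_lasso(X, y, lam=0.5, phi=0.1)
print("selected after first pass:", selected)
print("first-pass coefficients on signal vars:", beta1[0], beta1[1])
print("second-pass coefficients on signal vars:", beta2[0], beta2[1])
```

The point to look for is that the second pass can use a much lighter penalty, because the noise variables are already gone, so the surviving coefficients are shrunk less than in the first pass.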

You can read more about it in the original paper by Meinshausen (2007).

I also recommend section 3.8.5 of The Elements of Statistical Learning (Hastie, Tibshirani & Friedman, 2008), which gives an overview of other very interesting methods for performing variable selection using the LASSO.