Solved – Why is Lasso regression for high dimensional data better than Stepwise AIC

lasso, model selection, multivariable, stepwise regression

I know that the Lasso eventually sets some parameters to zero, so it acts like variable selection. I have also read in papers that automated variable selection methods such as stepwise AIC can be troublesome. So what are the advantages of using the Lasso for variable selection over automated procedures like stepwise AIC?

Best Answer

Selecting variables on the basis of the magnitude of their observed regression coefficients, or on the basis of their statistical "significance" (and when comparing models that differ by a single variable, AIC is a 1-1 function of the P-value), tends to pick variables precisely because their effects were measured with error in the direction of being too far from zero. The lasso, like other penalized methods, penalizes you for how hard you had to work to find good predictors. Penalization lowers the chance of overstating a regression coefficient.
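To make the "too far from zero" point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: every predictor has the same true effect (`true_beta`), estimates are the truth plus independent normal noise, "selection" means keeping the `k` largest estimates, and the penalty `lam` is fixed by hand rather than tuned. The shrinkage step uses soft-thresholding, which matches the lasso solution only for an orthonormal design, so treat this as a toy model rather than a full lasso fit.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sims = 2000        # simulated studies
n_predictors = 50    # candidate variables per study
true_beta = 0.5      # hypothetical: every predictor has the same true effect
se = 0.5             # hypothetical standard error of each estimate
k = 5                # keep the k largest estimates, as a stand-in for selection
lam = 0.5            # hypothetical penalty; in practice chosen by cross-validation


def soft_threshold(b, lam):
    """Lasso solution for an orthonormal design: shrink toward zero by lam."""
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)


# Unpenalized estimates: the truth plus estimation noise.
beta_hat = true_beta + se * rng.standard_normal((n_sims, n_predictors))

# For each study, indices of the k estimates largest in absolute value.
top = np.argsort(np.abs(beta_hat), axis=1)[:, -k:]
selected = np.take_along_axis(beta_hat, top, axis=1)

# Average magnitude of the selected estimates, before and after shrinkage.
mean_selected = np.abs(selected).mean()
mean_shrunk = np.abs(soft_threshold(selected, lam)).mean()

print(f"true effect:                     {true_beta:.2f}")
print(f"mean |estimate| among selected:  {mean_selected:.2f}")
print(f"mean after soft-thresholding:    {mean_shrunk:.2f}")
```

The selected estimates average well above the true effect even though no predictor is actually stronger than any other; they were chosen because their noise happened to point away from zero. Shrinking them pulls that average back toward the truth, which is the sense in which penalization lowers the chance of overstating a coefficient.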