Solved – GLMNET or LARS for computing LASSO solutions

lasso, machine-learning, r, regression, regularization

I would like to get the coefficients for the LASSO problem

$$||Y-X\beta||^2+\lambda ||\beta||_1.$$

The problem is that the glmnet and lars functions give different answers. For the glmnet function I ask for the coefficients at $\lambda/||Y||$ instead of just $\lambda$, but I still get different answers.

Is this expected? What is the relationship between the lars $\lambda$ and the glmnet $\lambda$? I understand that glmnet is faster for LASSO problems, but I would like to know which method gives the more accurate solution.


@deps_stats I am afraid my dataset is so large that LARS cannot handle it, whereas glmnet can.

@mpiktas I want to find the solution of $||Y-Xb||^2+L\sum_j|b_j|$, but when I ask the two algorithms (lars and glmnet) for their coefficients at that particular $L$, I get different answers. Is that correct/expected, or am I just using the wrong lambda for the two functions?
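For reference, the two packages scale the penalty differently: for the Gaussian family, glmnet minimizes $\frac{1}{2n}||Y-X\beta||^2+\lambda||\beta||_1$, while the $\lambda$ reported by lars corresponds to $\frac{1}{2}||Y-X\beta||^2+\lambda||\beta||_1$, so glmnet's $\lambda$ is the lars $\lambda$ divided by the sample size $n$. A minimal sketch of the check on simulated data (the sizes and seed are illustrative; standardization is switched off in both packages so the fits are comparable):

```r
## Sketch: lambda_glmnet = lambda_lars / n for the Gaussian LASSO,
## since glmnet minimizes (1/(2n))*RSS + lambda*||b||_1 while lars'
## "lambda" mode refers to (1/2)*RSS + lambda*||b||_1.
library(glmnet)
library(lars)

set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n, p)
Y <- drop(X[, 1:3] %*% c(3, -2, 1) + rnorm(n))

lam <- 5  # penalty on the lars scale

fit_lars  <- lars(X, Y, type = "lasso", normalize = FALSE)
beta_lars <- coef(fit_lars, s = lam, mode = "lambda")

fit_glmnet  <- glmnet(X, Y, lambda = lam / n, standardize = FALSE,
                      thresh = 1e-12)  # tight threshold for comparison
beta_glmnet <- as.matrix(coef(fit_glmnet))[-1, 1]  # drop the intercept

max(abs(beta_lars - beta_glmnet))  # should be close to 0
```

If the two still disagree after this rescaling, tightening glmnet's convergence threshold (as above) usually closes the remaining gap.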

Best Answer

In my experience, LARS is faster for small problems, very sparse problems, or very 'wide' problems (far more features than samples). Indeed, its computational cost is bounded by the number of features selected, if you don't compute the full regularization path. On the other hand, for big problems glmnet (coordinate descent optimization) is faster. Amongst other things, coordinate descent has a good data access pattern (it is memory-friendly), and on very large datasets it can benefit from redundancy in the data, since it converges from partial fits. In particular, it does not suffer on heavily correlated datasets.
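To make the two regimes concrete, here is a rough timing sketch on simulated data (the problem sizes are arbitrary assumptions, and the exact numbers will vary by machine and package version):

```r
## Rough timing sketch: a 'tall' problem (n >> p), where coordinate
## descent usually wins, versus a 'wide' one (p >> n), where LARS's
## path-following cost scales with the number of steps, not with p.
library(glmnet)
library(lars)

time_both <- function(n, p) {
  X <- matrix(rnorm(n * p), n, p)
  Y <- drop(X %*% rnorm(p, sd = 0.1) + rnorm(n))
  c(lars   = system.time(lars(X, Y, type = "lasso"))[["elapsed"]],
    glmnet = system.time(glmnet(X, Y))[["elapsed"]])
}

time_both(n = 5000, p = 200)   # tall: glmnet is typically much faster
time_both(n = 200,  p = 5000)  # wide: lars is often competitive
```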

The conclusion that we (the core developers of scikit-learn) have come to is that, if you do not have strong a priori knowledge of your data, you should rather use glmnet (or coordinate descent optimization, to speak of an algorithm rather than an implementation).

Interesting benchmarks can be found in Julien Mairal's thesis:

https://lear.inrialpes.fr/people/mairal/resources/pdf/phd_thesis.pdf

Section 1.4, in particular 1.4.5 (page 22).

Julien comes to slightly different conclusions, although his analysis of the problem is similar. I suspect this is because he was very much interested in very wide problems.