Solved – How to find the smallest $\lambda$ such that all Lasso / Elastic Net coefficients are zero

Tags: elastic net, glmnet, lasso, machine learning, regression

The documentation for R's glmnet package states that, when fitting an elastic net, the glmnet function uses a sequence of $\lambda$ values starting at the smallest $\lambda$ for which all coefficients are zero. How can I find this value of $\lambda$?

Best Answer

A lasso solution $\widehat{\beta}(\lambda)$ solves $$\min_\beta \frac{1}{2}\|y-X\beta\|_2^2 +\lambda\|\beta\|_1,$$ and it is well known that $\widehat{\beta}(\lambda)=0$ for all $\lambda \geq \lambda_1$, where $\lambda_1 = \max_j |X_j^Ty|$ and $X_j$ denotes the $j$-th column of $X$. This $\lambda_1$ is the desired value.

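Note that $\lambda_1$ needs a different scaling if the objective function is scaled differently. glmnet, for instance, uses $\frac{1}{2n}\|y-X\beta\|_2^2$ as its squared-error term, which is why the checks below carry a factor $1/n$ with $n=32$.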
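For completeness, here is a short sketch of where this threshold comes from and how a rescaled loss changes it. The constant $c>0$ is notation introduced only for this illustration. Writing the loss as $\frac{c}{2}\|y-X\beta\|_2^2$, the subgradient optimality condition reads

$$0 \in -c\,X^T(y-X\widehat{\beta}) + \lambda\,\partial\|\widehat{\beta}\|_1 .$$

At $\widehat{\beta}=0$ the subdifferential of the $\ell_1$ norm is the cube $[-1,1]^p$, so $\widehat{\beta}=0$ is a solution exactly when

$$c\,|X_j^Ty| \leq \lambda \quad \text{for all } j, \qquad \text{i.e.} \qquad \lambda \geq c\,\max_j |X_j^Ty| .$$

With $c=1$ this is the $\lambda_1$ above; with $c=1/n$ the threshold becomes $\frac{1}{n}\max_j |X_j^Ty|$.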


Using the cars example with GLMNET:

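    library(glmnet)
    fit <- glmnet(as.matrix(mtcars[, -1]), mtcars[, 1], intercept = FALSE, standardize = FALSE)
    # ratio of the hand-computed lambda_1 (scaled by 1/n, n = 32) to glmnet's largest lambda
    1/32 * max(abs(t(as.matrix(mtcars[, -1])) %*% mtcars[, 1])) / fit$lambda[1]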

This gives the value 1, as expected.

Note that both standardize and intercept are set to FALSE here. With standardize and intercept set to TRUE (the defaults), the value of $\lambda$ is calculated on the scaled regressors (see https://think-lab.github.io/d/205/#5 for how to perform the scaling properly to get the results you want):

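    xy <- scale(mtcars)  # centers and scales with the n - 1 standard deviation
    fit <- glmnet(as.matrix(mtcars[, -1]), mtcars[, 1])
    # sqrt(32/31) converts the n - 1 scaling of scale() to the 1/n scaling
    (1/32 * max(abs(t(xy[, -1]) %*% mtcars[, 1] * sqrt(32/31)))) / fit$lambda[1]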

This once again gives the value 1...

However I am not sure what glmnet is calculating if intercept = TRUE but standardize = FALSE.


We saw that glmnet with its standard options calculates $\lambda_{1}$ as $$\lambda_{1} = \max_j \left| \frac{1}{n} \sum_{i=1}^n x_{ij}^* y_i \right|,$$ where $x_{ij}^* = \frac{x_{ij}-\overline{x}_j}{\sqrt{\frac{1}{n}\sum_{k=1}^n (x_{kj}-\overline{x}_j)^2}}$ and $\overline{x}_j$ is the mean of the $j$-th regressor.

It turns out that for an elastic net problem ($\alpha \in (0,1]$ in glmnet, with $\alpha = 1$ being the lasso) the corresponding maximum value $\lambda_{1,\alpha}$ is calculated as

$$\lambda_{1,\alpha}= \lambda_{1}/\alpha.$$

Indeed, setting for example $\alpha=0.3$ we have:

    aa <- 0.3
    xy <- scale(mtcars)
    fit <- glmnet(as.matrix(mtcars[, -1]), mtcars[, 1], alpha = aa)
    # divide the lasso value by alpha, then compare with glmnet's largest lambda
    1/aa * (1/32 * max(abs(t(xy[, -1]) %*% mtcars[, 1] * sqrt(32/31)))) / fit$lambda[1]

which results once again in an output value of $1$.
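The calculations above can be wrapped into one small function. This is only a sketch in base R: `lambda_max` is a hypothetical helper name, not part of glmnet, and it assumes the default setting (standardize and intercept both TRUE) and a Gaussian response, exactly as in the checks above.

```r
# Hypothetical helper (not part of glmnet): largest lambda on the default
# path, following the formulas in this answer.
lambda_max <- function(x, y, alpha = 1) {
  stopifnot(alpha > 0, alpha <= 1)
  n <- nrow(x)
  # Center each column, then scale by the 1/n (not 1/(n - 1)) standard deviation.
  xc <- sweep(x, 2, colMeans(x))
  sd_n <- sqrt(colSums(xc^2) / n)
  xs <- sweep(xc, 2, sd_n, "/")
  # max_j |(1/n) * sum_i x*_ij y_i|, divided by alpha for the elastic net
  max(abs(crossprod(xs, y))) / (n * alpha)
}

x <- as.matrix(mtcars[, -1])
y <- mtcars[, 1]
lambda_max(x, y)               # lasso (alpha = 1)
lambda_max(x, y, alpha = 0.3)  # elastic net with alpha = 0.3
```

If glmnet is installed, these values can be compared with `glmnet(x, y)$lambda[1]` and `glmnet(x, y, alpha = 0.3)$lambda[1]`; by the checks above the ratios should be 1.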

So much for the calculations. Note, however, that the elastic net criterion can be rewritten as a standard lasso problem.
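One way to sketch this rewriting, which also explains the factor $1/\alpha$: assume glmnet's documented Gaussian objective, with the regressors already centered and scaled and no intercept term,

$$\min_\beta \frac{1}{2n}\|y-X\beta\|_2^2 + \lambda\left(\frac{1-\alpha}{2}\|\beta\|_2^2 + \alpha\|\beta\|_1\right).$$

Define the augmented data (the tilde notation is introduced only for this illustration)

$$\widetilde{X} = \begin{pmatrix} X \\ \sqrt{n\lambda(1-\alpha)}\, I_p \end{pmatrix}, \qquad \widetilde{y} = \begin{pmatrix} y \\ 0 \end{pmatrix}.$$

Then $\frac{1}{2n}\|\widetilde{y}-\widetilde{X}\beta\|_2^2 = \frac{1}{2n}\|y-X\beta\|_2^2 + \frac{\lambda(1-\alpha)}{2}\|\beta\|_2^2$, so the elastic net criterion equals

$$\frac{1}{2n}\|\widetilde{y}-\widetilde{X}\beta\|_2^2 + \lambda\alpha\|\beta\|_1,$$

which is a lasso problem with penalty weight $\lambda\alpha$. Since $\widetilde{X}^T\widetilde{y} = X^Ty$, the lasso result gives $\widehat{\beta}=0$ exactly when $\lambda\alpha \geq \frac{1}{n}\max_j|X_j^Ty| = \lambda_1$, that is, when $\lambda \geq \lambda_1/\alpha$.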