Solved – How is the intercept computed in GLMnet

elastic-net, glmnet, lasso, r, regularization

I've been implementing the GLMNET version of the elastic net for linear regression in software other than R. I compared my results with those of the R function glmnet in lasso mode on the diabetes data.

The variable selection is fine when the regularization parameter lambda varies, but I obtain slightly different coefficient values. For this and other reasons, I suspect the problem comes from the intercept in the update loop, when I compute the current fit: I never vary the intercept (which I take to be the mean of the target variable) during the whole algorithm, as explained in Trevor Hastie's article (Regularization Paths for Generalized Linear Models via Coordinate Descent, page 7, section 2.6):

the intercept is not regularized, […] for all values of […] lambda [the L1-constraint parameter]

But despite what the article says, the R function glmnet does provide different values for the intercept along the regularization path (i.e., for the different lambda values). Does anyone have a clue about how the intercept values are computed?

Best Answer

I found that the intercept in GLMnet is computed after the coefficient updates have converged. It is computed from the mean of the $y_i$'s and the means of the $x_{ij}$'s. The formula is similar to the one I gave previously, but uses the $\beta_j$'s obtained after the update loop: $\beta_0=\bar{y}-\sum_{j=1}^{p} \hat{\beta}_j \bar{x}_j$.

In Python this gives something like:

        # intercept = mean(y) - mean(X) @ coef, with coef on the original scale of X
        self.intercept_ = ymean - np.dot(Xmean, self.coef_.T)

which I found in the scikit-learn code base.
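
As a quick sanity check (a minimal sketch, not glmnet itself; the toy data and penalty value are my own), scikit-learn's Lasso applies exactly this formula when it fits the intercept on unscaled data:

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(0)
        X = rng.normal(size=(50, 3))
        y = X @ np.array([2.0, 0.0, -1.0]) + 4.0 + rng.normal(size=50)

        model = Lasso(alpha=0.05).fit(X, y)  # no scaling; sklearn centers internally

        # intercept_ matches mean(y) - mean(X) @ coef_
        assert np.isclose(model.intercept_, y.mean() - X.mean(axis=0) @ model.coef_)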

EDIT: the coefficients first have to be brought back to the original scale (undoing the standardization of the $x_j$'s):

        # undo the standardization: the coefficients were fitted on X / X_std
        self.coef_ = self.coef_ / X_std

which gives $\beta_0=\bar{y}-\sum_{j=1}^{p} \frac{\hat{\beta}_j \bar{x}_j}{s_j}$, where $s_j$ (the X_std above) is the scale used to standardize the $j$-th column of $X$.
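
Putting both steps together, here is a minimal end-to-end sketch (toy data and penalty value are my own, with scikit-learn's Lasso standing in for the inner coordinate-descent solver): fit on standardized columns, rescale the coefficients, then recover the intercept from the means. The rescaled model reproduces the standardized model's predictions exactly:

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 5))
        y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + 3.0 + rng.normal(size=100)

        x_mean, y_mean = X.mean(axis=0), y.mean()
        x_std = X.std(axis=0)                # per-column scale s_j
        Xs = (X - x_mean) / x_std            # standardized design matrix

        # solve the centered, standardized problem; no intercept needed there
        model = Lasso(alpha=0.1, fit_intercept=False).fit(Xs, y - y_mean)

        coef = model.coef_ / x_std           # back to the original scale of X
        intercept = y_mean - x_mean @ coef   # beta_0 = ybar - sum_j beta_j * xbar_j

        # same predictions as the standardized model, now on the raw X
        assert np.allclose(X @ coef + intercept, Xs @ model.coef_ + y_mean)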