Solved – glmnet cross-validation without intercept

Tags: glmnet, r, regression

I am using glmnet in R with leave-one-out cross-validation with this command:

cv.fit <- cv.glmnet(trainx, trainy, maxit=10000000, nfolds=NROW(trainy), grouped=FALSE, lambda.min.ratio=20.0^(-3.0), type.measure="mse")

It produces the information that I want, but it also provides a y-intercept. According to the documentation, I can omit the intercept by adding intercept=FALSE to the R command:

cv.fit <- cv.glmnet(trainx, trainy, maxit=10000000, nfolds=NROW(trainy), grouped=FALSE, lambda.min.ratio=20.0^(-3.0), type.measure="mse", intercept=FALSE)

However, when I do this, only one of the coefficients of trainx is active while all other coefficients are found to be inactive (with intercept=TRUE, close to 100 coefficients are active). I've also tried omitting the intercept by excluding it via the exclude argument, but that does not seem to apply to the intercept.

trainx is a matrix of positive integers (counts) with 314 rows and ~1500 columns. trainy is a single column of 314 rows containing the response for each row.

How can I perform the cross-validation without using an intercept?

Best Answer

glmnet optimizes the following loss function:

$\frac{1}{2n}\sum_{i=1}^n (\hat{Y}_i-Y_i)^2 + \lambda\left(\frac{1-\alpha}{2}\|\beta\|_2^2 + \alpha \|\beta\|_1 \right)$

The residual sum of squares (scaled by the number of observations) is on the left, as is typical in regression, and the penalty on the coefficients is on the right. $\alpha$ defaults to 1, which gives the LASSO penalty.
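For reference, the mixing parameter can be set explicitly when fitting; a minimal sketch using the question's trainx and trainy:

fit_lasso <- glmnet(trainx, trainy, alpha = 1)    # LASSO penalty (the default)
fit_ridge <- glmnet(trainx, trainy, alpha = 0)    # ridge penalty
fit_enet  <- glmnet(trainx, trainy, alpha = 0.5)  # elastic net mixture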

Now, if you don't fit an intercept, the term on the left will be very large (if $E(Y)$ is large). The model will try to account for that, but doing so requires larger coefficient values. It may be the case (and I'm guessing here) that $E(Y)$ is large and one of your variables is fairly constant. In that case, that variable will get a large coefficient (as it helps reduce the residual sum of squares), while the other variables would increase the penalty too much, and hence their coefficients are set to zero.
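If you want to check this explanation, here is a minimal sketch on simulated data (the dimensions, counts, and coefficients below are made up for illustration) that compares the number of active coefficients with and without an intercept:

library(glmnet)

set.seed(1)
n <- 300; p <- 1500
x <- matrix(rpois(n * p, lambda = 3), n, p)   # count predictors, as in the question
beta <- c(rep(0.5, 10), rep(0, p - 10))
y <- 1000 + x %*% beta + rnorm(n)             # response with a large mean

fit_int   <- glmnet(x, y)                     # with intercept (the default)
fit_noint <- glmnet(x, y, intercept = FALSE)  # without intercept

# df is the number of non-zero coefficients at each lambda on the path
fit_int$df
fit_noint$df

If the explanation above applies, fit_noint$df should show far fewer active coefficients than fit_int$df over most of the path.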

Maybe you could supply your own lambda sequence to the function, something like

lambda=10^seq(1,-4,-.5)

If $\lambda$ is small enough, you should get more non-zero coefficients in the model without an intercept as well.
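Applied to the original call, that would look something like this (the lambda argument replaces lambda.min.ratio, since a user-supplied sequence overrides the computed path):

lambdas <- 10^seq(1, -4, -0.5)
cv.fit <- cv.glmnet(trainx, trainy, maxit = 10000000, nfolds = NROW(trainy),
                    grouped = FALSE, type.measure = "mse",
                    intercept = FALSE, lambda = lambdas)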

Note: I don't think this problem has anything to do with the fact that you're using cv.glmnet; you should see the same thing with plain glmnet.