I understand the need to standardize variables when performing LASSO in R (I'm specifically using `cv.glmnet` and setting `standardize=TRUE`).
The resulting model, however, still fits an intercept. From what I understand, we center the data so that there's no intercept and the model has more freedom to fit coefficients (without the intercept taking up some of the $\sum_j |\beta_j|$). Am I doing this correctly? Do I need to set `intercept=FALSE`?
Best Answer
The intercept should generally only be omitted if all the predictors and the response have mean=0 (in which case the intercept must necessarily be 0).
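As a quick check of that claim, here is a minimal sketch in base R using ordinary least squares on simulated data (the data and every variable name are made up for illustration). After centering both the predictors and the response, the fitted intercept should come out as zero up to floating-point error, while the slopes should stay the same.

```r
# Simulated data; all names and values here are illustrative assumptions.
set.seed(1)
n <- 100
x <- matrix(rnorm(n * 3, mean = 5), nrow = n, ncol = 3)
y <- as.numeric(2 + x %*% c(1, -0.5, 0.25) + rnorm(n))

# Raw data: the intercept absorbs the non-zero means.
fit_raw <- lm(y ~ x)

# Center predictors and response so every column has mean 0.
x_c <- scale(x, center = TRUE, scale = FALSE)
y_c <- y - mean(y)
fit_centered <- lm(y_c ~ x_c)

intercept_raw <- unname(coef(fit_raw)[1])
intercept_centered <- unname(coef(fit_centered)[1])
slopes_raw <- unname(coef(fit_raw)[-1])
slopes_centered <- unname(coef(fit_centered)[-1])

# Compare the two intercepts, then check that centering left the slopes alone.
print(c(raw = intercept_raw, centered = intercept_centered))
print(isTRUE(all.equal(slopes_raw, slopes_centered)))
```

Centering only the predictors, or only the response, is not enough: the intercept is forced to zero only when both have mean 0.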
Setting `standardize=TRUE`, which is the default option for `glmnet::glmnet`, only standardizes the predictors. The function has another parameter to standardize the response, `standardize.response`, but by default this is set to `FALSE` (and, according to the glmnet documentation, it applies to `family="mgaussian"`). So you would want to estimate an intercept unless you have specified `standardize.response=TRUE`.
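To see this with `cv.glmnet` itself, here is a rough sketch on simulated data (the data and variable names are invented for illustration, and it assumes the `glmnet` package is installed). Since glmnet reports coefficients on the original scale, `standardize=TRUE` alone should leave a non-zero intercept; centering both the predictors and the response by hand should push it to zero within numerical error.

```r
library(glmnet)

# Simulated data; all names and values here are illustrative assumptions.
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5, mean = 3), nrow = n, ncol = 5)
y <- as.numeric(10 + x %*% c(2, -1, 0, 0, 0.5) + rnorm(n))

# LASSO (alpha = 1) on the raw data. standardize = TRUE rescales the
# predictors internally, but the response keeps its non-zero mean.
cv_raw <- cv.glmnet(x, y, alpha = 1, standardize = TRUE, intercept = TRUE)
intercept_raw <- coef(cv_raw, s = "lambda.min")[1, 1]

# Center predictors and response by hand, then refit the same model.
x_c <- scale(x, center = TRUE, scale = FALSE)
y_c <- y - mean(y)
cv_centered <- cv.glmnet(x_c, y_c, alpha = 1, standardize = TRUE, intercept = TRUE)
intercept_centered <- coef(cv_centered, s = "lambda.min")[1, 1]

print(c(raw = intercept_raw, centered = intercept_centered))
```

In practice it is usually simpler to leave `intercept=TRUE` and let the function estimate the intercept than to center everything manually and drop it.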