Solved – Glmnet: How to select Lambda and Alpha

Tags: elastic net, glmnet, lasso, r, ridge regression

I'd like to pick the optimal lambda and alpha using the glmnet package. I'm open to all models (ridge, lasso, elastic net). I'm assuming that an out-of-sample error measure, such as cross-validation error, is the best model selection criterion.

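library(glmnet)

Macro <- read.csv("P:/Earnest/Old/R/Input.csv")

# Rows 1-13, columns 3-21: predictor matrix used for fitting
x <- as.matrix(Macro[1:13, 3:21])

# Rows 1-13, column 2: response
y <- as.matrix(Macro[1:13, 2])

# Row 14, columns 3-21: new observation to predict
t <- as.matrix(Macro[14, 3:21])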

Right now, I'm using the following code. It assumes alpha = 0.5 (elastic net) and that lambda.min is the ideal lambda.

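# Fit the full regularization path (lambda sequence chosen by glmnet)
fit <- glmnet(x, y, alpha = 0.5)
# Cross-validate over the same kind of lambda sequence
cv.fit <- cv.glmnet(x, y, alpha = 0.5)
# Lambda with the lowest mean cross-validated error
# (named best_lambda rather than min to avoid reusing the name of base R's min())
best_lambda <- cv.fit$lambda.min
predict(fit, newx = t, s = best_lambda)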

Questions: How do I know what the ideal alpha and lambda are? What code can I use to test various lambda/alpha combinations to find the best out-of-sample error? Is this the right approach and use of glmnet? What questions am I not considering that I should be?

The data:

Best Answer

By default, glmnet does not fit a single lambda. It fits the model over a whole sequence of values running from lambda.max down to lambda.min, and cv.glmnet then picks the best value in that sequence by cross-validation. The default sequence is evenly spaced on the log scale. Its largest value, lambda.max, is the smallest lambda for which the model sets all coefficients to zero. Its smallest value is not 0 but a fixed fraction of lambda.max, given by lambda.min.ratio.
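As a rough check of this description, the sketch below fits a lasso path on simulated data (the data, seed, and variable names are assumptions made for illustration, not part of the question) and inspects the lambda sequence that glmnet generated.

```r
library(glmnet)

# Simulated data, purely for illustration
set.seed(42)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:2] %*% c(1.5, -1)) + rnorm(n)

# lambda is left at its default (NULL), so glmnet builds the sequence itself
fit <- glmnet(x, y, alpha = 1)

lam <- fit$lambda
head(lam)               # decreasing sequence, largest value first
range(diff(log(lam)))   # step between consecutive values on the log scale
fit$df[1]               # number of nonzero coefficients at the largest lambda
```

The two numbers returned by range() should agree, which is what "linear on the log scale" means in practice. Note that the path can contain fewer than nlambda values, because glmnet may stop early when the fit stops improving.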

From the glmnet documentation:

lambda can be provided, but is typically not and the program constructs a sequence. When automatically generated, the λ sequence is determined by lambda.max and lambda.min.ratio. The latter is the ratio of smallest value of the generated λ sequence (say lambda.min) to lambda.max. The program then generates nlambda values linear on the log scale from lambda.max down to lambda.min. lambda.max is not given, but easily computed from the input x and y; it is the smallest value for lambda such that all the coefficients are zero. For alpha=0 (ridge) lambda.max would be ∞; hence for this case we pick a value corresponding to a small value for alpha close to zero.
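The above covers lambda only. cv.glmnet does not search over alpha, so one common way to compare alpha values is to fix the fold assignment with foldid and run cv.glmnet once per candidate alpha. The following is a minimal sketch under assumptions: the data are simulated, and the alpha grid and the number of folds are arbitrary choices.

```r
library(glmnet)

# Simulated data, purely for illustration
set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:3] %*% c(2, -1, 0.5)) + rnorm(n)

# Fix the fold assignment so every alpha is scored on the same splits
foldid <- sample(rep(seq_len(10), length.out = n))

# Candidate mixing parameters: 0 = ridge, 1 = lasso, in between = elastic net
alphas <- seq(0, 1, by = 0.25)
cv_fits <- lapply(alphas, function(a) cv.glmnet(x, y, alpha = a, foldid = foldid))

# Lowest mean cross-validated error reached along each alpha's lambda path
cv_err <- sapply(cv_fits, function(f) min(f$cvm))

best <- which.min(cv_err)
best_alpha <- alphas[best]
best_lambda <- cv_fits[[best]]$lambda.min

data.frame(alpha = alphas, min_cvm = cv_err)
coef(cv_fits[[best]], s = "lambda.min")
```

Reusing the same foldid across calls keeps the comparison between alpha values from being driven by different random splits. With a very small sample such as the 13 rows in the question, the cross-validated errors will be noisy, so differences between alpha values should be read with caution.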

Related Solutions

Solved – How to interpret this glmnet() code and its output in R

This looks incorrect; you probably wanted:

# cv.glmnet takes the number of folds as nfolds (it has no k argument)
fit <- cv.glmnet(model, y, nfolds = k)
coef(fit, s = "lambda.min")

which will return the coefficients using the internal fit from the cross validation.

Unless ridge_model has the same predictors, weights, mixing parameter, etc., plugging a penalty parameter from one model into another seems odd; and if those were all the same, ridge_model would be identical to fit$glmnet.fit above and therefore redundant.
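To illustrate the point about fit$glmnet.fit, here is a small sketch on simulated data (the data and the names cvfit, b_cv, and b_path are assumptions for illustration). It compares the coefficients requested through the cross-validation object with those taken directly from the path fit stored inside it.

```r
library(glmnet)

# Simulated data, purely for illustration
set.seed(7)
n <- 80
p <- 8
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:2] %*% c(1, -2)) + rnorm(n)

cvfit <- cv.glmnet(x, y, nfolds = 5)

# Coefficients requested through the cross-validation object
b_cv <- coef(cvfit, s = "lambda.min")

# Coefficients taken from the full-data path fit stored in the same object
b_path <- coef(cvfit$glmnet.fit, s = cvfit$lambda.min)

# Compare the two coefficient vectors numerically
all.equal(as.vector(as.matrix(b_cv)), as.vector(as.matrix(b_path)))
```

Because the cross-validation object already carries the path fit, a separately fitted model is only needed when it differs in some setting, such as alpha or the weights.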

Solved – Building final model in glmnet after cross validation

You do not need to run a separate cross-validation by hand for each candidate penalty; the cv.glmnet function does this automatically:

library(glmnet)
data(QuickStartExample)
# Recent glmnet versions load a list named QuickStartExample;
# older versions load x and y directly
if (exists("QuickStartExample")) {
  x <- QuickStartExample$x
  y <- QuickStartExample$y
}

# your approach: use different lambdas and perform cross-validation manually
fit_1 <- glmnet(x, y, lambda = 1)

# glmnet's approach: automated cross-validation
cvfit <- cv.glmnet(x, y)
plot(cvfit)

# coefficients of the final model
coef_cv <- coef(cvfit, s = "lambda.min")
# prediction of the final model
predict(cvfit, newx = x[1:5, ], s = "lambda.min")

# extract optimal lambda
lambda_opt <- cvfit$lambda.min

# manually plugging lambda into glmnet
fit_2 <- glmnet(x, y, lambda = lambda_opt)

# compare coefficients - equal (up to convergence tolerance)
cbind(coef_cv, coef(fit_2))

# compare predictions - equal (up to convergence tolerance)
cbind(predict(cvfit, newx = x[1:5, ], s = "lambda.min"), predict(fit_2, newx = x[1:5, ]))

So for each lambda, a cross-validation is performed and a performance measure is calculated. You can see the result of the cross-validation via plot(cvfit). Note that calling glmnet() with a single, arbitrary lambda is generally not recommended. More details can be found in the excellent tutorial: https://web.stanford.edu/~hastie/Papers/Glmnet_Vignette.pdf
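The numbers behind that plot can also be read directly from the fitted object. The sketch below uses simulated data (an assumption for illustration) and collects the per-lambda cross-validation results into a table, then looks up the row with the lowest mean error.

```r
library(glmnet)

# Simulated data, purely for illustration
set.seed(3)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:2] %*% c(1, 1)) + rnorm(n)

cvfit <- cv.glmnet(x, y)

# One row per lambda: mean cross-validated error and its standard error
cv_table <- data.frame(
  lambda = cvfit$lambda,
  cvm    = cvfit$cvm,
  cvsd   = cvfit$cvsd
)
head(cv_table)

# The lambda in the row with the smallest mean error is lambda.min
cv_table$lambda[which.min(cv_table$cvm)]
cvfit$lambda.min
```

The last two lines should print the same value, since lambda.min is defined as the lambda that minimizes the mean cross-validated error.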
