I am confused about how to calculate r-squared for glmnet fits (LASSO, elastic net, etc.). One approach I have seen uses the cvm value corresponding to one of the lambdas:
cvfit2 <- glmnet::cv.glmnet(datam, fundm, alpha = 1, nfolds = 10)
cf <- coef(cvfit2, s = "lambda.1se")
i <- which(cvfit2$lambda == cvfit2$lambda.1se)
e <- cvfit2$cvm[i]
r2 <- 1 - e / var(fundm)
r2
#[1] 0.4571688
The classic way, via the variance of the residuals:
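datam2 <- as.matrix(datam)
cc2 <- as.matrix(cf[-1, ])  # removing the intercept row
pred <- datam2 %*% cc2      # named pred so it does not mask the predict() function
err <- pred - fundm
View(err)                   # interactive inspection of the residuals
r2b <- 1 - var(err) / var(fundm)
r2b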
#[1] 0.6100457
That is quite a large difference, and I am not sure whether the first way of calculating $R^2$ is correct.
My questions
- What is the correct way of calculating r-squared?
- A glmnet object has components dev.ratio and nulldev. From the glmnet docs:
  "The fraction of (null) deviance explained (for "elnet", this is the R-square)."
  Should we rather use dev.ratio for the purpose of $R^2$ calculations? If yes, how to extract it for the given lambda index? The dev.ratio array has 100 values, but cvfit2$lambda has only 88 values.
I am really confused and would appreciate your feedback.
Best Answer
I'm using
or if you have chosen the lambda.1se
If you cross-check against a traditional regression with lm() and summary()$r.squared, the results will match when the elastic-net coefficients are close to the least-squares ones.
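As a rough illustration of the dev.ratio approach mentioned in the question, here is a minimal sketch on simulated data. The names x, y, cvfit, i_fit and i_cv are made up for this example (x and y stand in for datam and fundm), and this is not necessarily the exact code the answer originally contained. It looks up dev.ratio in the underlying glmnet.fit object by matching on lambda.1se, recomputes the same in-sample $R^2$ by hand from predict() (which includes the intercept), and computes the cross-validated version from cvm for comparison.

```r
library(glmnet)

set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p)                     # simulated predictors (stand-in for datam)
y <- drop(x[, 1:3] %*% c(2, -1, 0.5)) + rnorm(n)    # simulated response (stand-in for fundm)

cvfit <- cv.glmnet(x, y, alpha = 1, nfolds = 10)

# Match lambda.1se against each lambda vector separately, since the
# cross-validation path and the full-data path can differ in length.
i_fit <- which(cvfit$glmnet.fit$lambda == cvfit$lambda.1se)
i_cv  <- which(cvfit$lambda == cvfit$lambda.1se)

# In-sample R^2: fraction of null deviance explained by the full-data fit
r2_dev <- cvfit$glmnet.fit$dev.ratio[i_fit]

# The same quantity by hand, using fitted values that include the intercept
pred <- drop(predict(cvfit, newx = x, s = "lambda.1se"))
r2_manual <- 1 - sum((y - pred)^2) / sum((y - mean(y))^2)

# Cross-validated R^2: cvm is the mean out-of-fold squared error
r2_cv <- 1 - cvfit$cvm[i_cv] / mean((y - mean(y))^2)

print(c(dev.ratio = r2_dev, manual = r2_manual, cross.validated = r2_cv))
```

In this sketch r2_dev and r2_manual should agree up to numerical tolerance, because both measure in-sample fit. r2_cv is based on out-of-fold errors, so it measures something different and will usually come out lower, which is one plausible reason the two numbers in the question disagree.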