Solved – Calculating $R^2$ for Elastic Net

elastic-net, model-evaluation, r-squared

I am trying to do variable selection using the elastic net (MATLAB's `lasso` function with `Alpha` of 0.5). I have 75 predictors in total (some are correlated with each other, hence elastic net instead of the lasso), and I would like to get a subset of them that are good predictors of my outcome.

So my question is: how can I calculate something like $R^2$ that shows how much of my outcome is explained by the selected variables?

  1. If I use the selected variables in a multiple linear regression model, is the $R^2$ still valid, given that my variables are correlated?

  2. Can I calculate a cross-validated $R^2$ (using leave-one-out) to get a more accurate estimate?

  3. Is there any other way, besides calculating $R^2$, to show that my variable selection method predicts well?
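For reference, here is a minimal Python sketch of the setup described above, using scikit-learn's `ElasticNetCV` in place of MATLAB's `lasso(..., 'Alpha', 0.5)`; the synthetic data stand in for the 75 correlated predictors:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
n, p = 200, 75
# Build correlated predictors: shared latent factors plus noise.
latent = rng.normal(size=(n, 5))
X = latent @ rng.normal(size=(5, p)) + rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:10] = rng.normal(size=10)          # only 10 truly active predictors
y = X @ beta + rng.normal(size=n)

# l1_ratio=0.5 corresponds to Alpha=0.5 in MATLAB's lasso.
model = ElasticNetCV(l1_ratio=0.5, cv=5, max_iter=10000).fit(X, y)
selected = np.flatnonzero(model.coef_)   # indices of retained predictors
print(f"{selected.size} of {p} predictors selected")
```

The predictors with nonzero coefficients form the selected subset.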

Best Answer

Just use the regular $R^2$, i.e. the squared correlation between the fitted and the actual values. Whether the model was fit by OLS or by penalized OLS (such as the elastic net), it will still reflect the proportion of variance explained.
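Concretely, a minimal sketch (in Python with synthetic data) of computing $R^2$ as the squared correlation between fitted and actual values:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.5, 0.0]) + rng.normal(size=100)

fit = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
fitted = fit.predict(X)

# R^2 as the squared correlation between fitted and actual values.
r2_corr = np.corrcoef(fitted, y)[0, 1] ** 2
print(f"R^2 (squared correlation) = {r2_corr:.3f}")
```

Note that for a penalized fit this squared-correlation $R^2$ need not exactly equal the $1 - \mathrm{SSE}/\mathrm{SST}$ version (what `fit.score` reports), since shrinkage breaks the residual orthogonality that makes the two coincide under OLS.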

Be aware, however, that model diagnostics and performance measures (such as $R^2$) applied after model selection may be (and typically will be) overly optimistic if the model is evaluated on the same data that were used for model building (e.g. variable selection).

  1. Apart from the warning above, correlated variables are not a problem for $R^2$.
  2. If you were to predict the left-out fold in $K$-fold cross-validation and base the $R^2$ on those predictions, it would not be a very useful measure, because $R^2$ ignores prediction bias and only accounts for prediction variance. But I am not sure I understand what you mean by cross-validated $R^2$.
  3. Split your data into training, validation and test subsets. Train your models on the training data and use the validation data to pick the best-performing model. Re-estimate the selected model on the training+validation data, then assess the performance of the re-estimated model on the test data. Use mean squared error instead of $R^2$, to properly account for any prediction bias on top of prediction variance.
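The recipe in point 3 can be sketched as follows in Python; the split sizes and the grid of penalties here are illustrative choices, not part of the original recommendation:

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 20))
y = X[:, :5] @ rng.normal(size=5) + rng.normal(size=300)

# Split: 60% train, 20% validation, 20% test.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

# Pick the best penalty on the validation set...
alphas = [0.01, 0.1, 1.0]
val_mse = {a: mean_squared_error(y_val,
                                 ElasticNet(alpha=a, l1_ratio=0.5)
                                 .fit(X_tr, y_tr).predict(X_val))
           for a in alphas}
best_alpha = min(val_mse, key=val_mse.get)

# ...re-estimate on training+validation, then assess once on the test set with MSE.
final = ElasticNet(alpha=best_alpha, l1_ratio=0.5).fit(X_tmp, y_tmp)
test_mse = mean_squared_error(y_test, final.predict(X_test))
print(f"best alpha = {best_alpha}, test MSE = {test_mse:.3f}")
```

The test set is touched exactly once, so the reported MSE is an honest estimate of out-of-sample error.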