Get R-Squared after doing stepwise model selection in regression in R

linear model, model selection, r, regression

I am using R Commander to do stepwise model selection in a linear model. When I run the stepwise selection, it drops some variables and finally reports a model together with its AIC. However, it does not show the R-squared. How can I obtain the R-squared of the selected model? Moreover, is it possible to do stepwise model selection based on adjusted R-squared rather than AIC or BIC?

Best Answer

A number of summary statistics of a linear model are not stored in the returned object of class lm, but can be extracted from it with the summary() method:

> fit <- lm(waiting ~ eruptions, faithful)
> names(summary(fit))
 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
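The same components can be read off a model returned by a stepwise search, since step() returns an ordinary lm object. The following is a minimal sketch, assuming the built-in mtcars data and a backward search started from the full model; both are illustrative choices, not taken from the question:

```r
# Illustrative data and starting model (assumptions, not from the question)
full <- lm(mpg ~ ., data = mtcars)

# AIC-based stepwise selection; trace = 0 suppresses the step-by-step log
sel <- step(full, trace = 0)

# summary() computes R squared and adjusted R squared for the selected model
s <- summary(sel)
r2 <- s$r.squared
adj_r2 <- s$adj.r.squared

print(formula(sel))
print(c(r.squared = r2, adj.r.squared = adj_r2))
```

If the selected model was created in R Commander, calling summary() on that model object by name should give the same components.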

Adjusted R squared is stored in adj.r.squared. Beware that adjusted R squared is still an in-sample estimate, i.e., the data used for fitting the model are also used for evaluating it, so it is still prone to overfitting when used as a model selection criterion. To compute the out-of-sample R squared, you can fit with glm instead of lm (family=gaussian gives the same fit as lm, although the fitting algorithm is different) and use the function cv.glm() from the package boot.
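A rough sketch of the cross-validation route follows. cv.glm() returns a cross-validated prediction error (mean squared error by default) in its delta component rather than an R squared, so the conversion below, one minus the ratio of that error to the mean squared deviation of the response from its mean, is an assumption about what "out-of-sample R squared" should mean here; other definitions exist. The choice of 10 folds and the seed are arbitrary.

```r
library(boot)

# Same model as above, fitted with glm so that cv.glm() can be used
fit_glm <- glm(waiting ~ eruptions, data = faithful, family = gaussian)

set.seed(1)  # fold assignment is random; the seed only makes the run repeatable
cv <- cv.glm(faithful, fit_glm, K = 10)

# delta[1] is the raw cross-validated mean squared prediction error
mse_cv <- cv$delta[1]

# Mean squared error of always predicting the overall mean of the response
mse_null <- mean((faithful$waiting - mean(faithful$waiting))^2)

# Assumed definition of out-of-sample R squared
r2_cv <- 1 - mse_cv / mse_null
print(r2_cv)
```

Note that cross-validating only the final model does not account for the selection step itself; for an honest estimate, the stepwise search would have to be repeated inside each fold.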

And now a comment for collecting downvotes ;-) There are problems with stepwise selection, and it is possible to construct cases where it fails. On the other hand, it is much better than its reputation: an extended study by Hastie, Tibshirani, and Tibshirani found it to be comparable to an exhaustive search for the optimal subset (best subset selection) "throughout", which is quite amazing considering the strong advice against stepwise selection here on CV.

Hastie, Trevor, Robert Tibshirani, and Ryan J. Tibshirani. "Extended comparisons of best subset selection, forward stepwise selection, and the lasso." arXiv:1707.08692 (2017).