Why does the adjusted R-squared of this model improve with the addition of a statistically insignificant variable?

Tags: intercept, multiple regression, r-squared, regression, sas

I stumbled on this while doing multiple linear regression (MLR), and was curious as to why it happens. The adjusted R-squared is (if I understand correctly) supposed to be a way of comparing the predictive quality of models with different numbers of explanatory variables. In the second model, I've added a statistically insignificant variable (weight), and the adjusted R-squared has nevertheless gone up, from 0.0506 to 0.0525.

My only thought is that this is because the point estimate of the weight coefficient is not 0, so there may be a 'significant' effect at a less stringent significance level. Is that right?

Model 1:

    Model 1
                                         Sum of           Mean
     Source                   DF        Squares         Square    F Value    Pr > F

     Model                     2     6017.30007     3008.65004      12.36    <.0001
     Error                   424         103221      243.44647
     Corrected Total         426         109239


                  Root MSE             15.60277    R-Square     0.0551
                  Dependent Mean      120.03044    Adj R-Sq     0.0506
                  Coeff Var            12.99901


                                   Parameter Estimates

             Parameter     Standard                        Variance
Variable   DF     Estimate        Error  t Value  Pr > |t|    Inflation    95% Confidence Limits

Intercept   1     75.85363      8.91793     8.51    <.0001            0     58.32478     93.38249
age         1      0.66112      0.15314     4.32    <.0001      1.00204      0.36011      0.96212
chol        1      1.86495      0.82213     2.27    0.0238      1.00204      0.24900      3.4809

Model 2 (with addition of insignificant variable):

                  Root MSE             15.58705    R-Square     0.0592
                  Dependent Mean      120.03044    Adj R-Sq     0.0525
                  Coeff Var            12.98591


                                   Parameter Estimates

             Parameter     Standard                        Variance
Variable   DF     Estimate        Error  t Value  Pr > |t|    Inflation    95% Confidence Limits

Intercept   1     57.69446     16.03325     3.60    0.0004            0     26.17970     89.20922
age         1      0.66180      0.15299     4.33    <.0001      1.00205      0.36110      0.96251
chol        1      2.02756      0.82993     2.44    0.0150      1.02320      0.39626      3.65885
weight      1      0.09687      0.07111     1.36    0.1738      1.02122     -0.04290      0.2366

Best Answer

Citing Greene, Econometric Analysis, Theorem 3.7 (referring to the 5th edition here, though):

In a multiple regression, the adjusted $R^2$ will fall (rise) when the variable $x$ is deleted from the regression if the $t$-ratio associated with this variable is greater (less) than 1.

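Since the $t$-ratio of weight is $1.36 > 1$, adding it had to raise the adjusted $R^2$, even though the variable is not significant at conventional levels ($p = 0.1738$). The threshold for improving the adjusted $R^2$ ($|t| > 1$) is much weaker than the threshold for significance at the 5% level (roughly $|t| > 1.96$).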
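
As a quick sanity check, the figures in the question can be reproduced by hand. The sketch below is not part of the original answer; it assumes the usual definition $\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}$ for a model with an intercept and $k$ regressors, takes $n = 427$ from the Corrected Total DF of 426, and uses the standard identity that the squared $t$-ratio of a single added variable equals the drop in the error sum of squares divided by the larger model's mean squared error. The function name `adjusted_r2` is just an illustrative helper, and the inputs are the rounded values printed by SAS, so the results only match to the displayed precision.

```python
import math


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with an intercept and k regressors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


n = 427  # Corrected Total DF (426) + 1

# Adjusted R^2 recomputed from the reported (rounded) R-Square values.
adj1 = adjusted_r2(0.0551, n, k=2)  # Model 1: age, chol
adj2 = adjusted_r2(0.0592, n, k=3)  # Model 2: age, chol, weight

# Error sums of squares recovered from Root MSE and the error DF.
df1, df2 = n - 2 - 1, n - 3 - 1      # 424 and 423
sse1 = 15.60277 ** 2 * df1
sse2 = 15.58705 ** 2 * df2

# Squared t-ratio of the added variable: (SSE1 - SSE2) / MSE2.
t_sq = (sse1 - sse2) / (sse2 / df2)
t_abs = math.sqrt(t_sq)

print(f"Adj R-Sq, Model 1: {adj1:.4f}")
print(f"Adj R-Sq, Model 2: {adj2:.4f}")
print(f"|t| for weight recovered from the two Root MSEs: {t_abs:.2f}")
print(f"MSE fell (so Adj R-Sq rose): {sse2 / df2 < sse1 / df1}")
```

The adjusted $R^2$ rises exactly when the mean squared error falls, and with one added variable that happens exactly when $t^2 > 1$. This is why the Root MSE drops from 15.60277 to 15.58705 here despite the large p-value.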