ANOVA – Comparison with R-Squared (Explained Variance)

anova, linear model, r-squared

My goal is to determine, for a linear regression, whether some predictors uniquely improve the fit beyond what is already provided by all other predictors combined. I originally tried multi-way ANOVA and partial correlation for this purpose, but I recently learned that multi-way ANOVA performs poorly under high multicollinearity: the reported significances and explained variances of individual predictors are not robust, and may therefore misrepresent the true relations within the data.

Here is a solution that came to my mind:

  1. Fit the full model to the data and find the coefficient of determination $r^2_{full}$
  2. Exclude one of the predictors (e.g. $X$), fit a model with the remaining predictors, and find $r^2_{/X}$
  3. Then the gain in explained variance uniquely due to the excluded predictor $X$ is

$$G(X) = r^2_{full} - r^2_{/X}$$
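The three steps above can be sketched as follows; the toy data and all variable names here are illustrative (not from the question), with one predictor made deliberately collinear with the other:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=n)   # deliberately collinear with x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept column."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Step 1: full model; Step 2: model without x2; Step 3: the gain G(x2)
r2_full = r_squared(np.column_stack([x1, x2]), y)
r2_no_x2 = r_squared(x1[:, None], y)
gain = r2_full - r2_no_x2   # variance uniquely explained by x2
```

Because the models are nested and both include an intercept, the gain is never negative; under strong collinearity it can be much smaller than the marginal $r^2$ of the excluded predictor.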

Naively, this looks like a robust solution for detecting partial effects. My simulations show that it works better than partial correlation on some simple noisy model data; partial correlation is known to fail to discriminate between a true partial effect and multicollinearity in the presence of noise.

Questions:

  • Does this approach have a name?
  • Does it work in practice?
  • Is there a nice procedure to test $G(X)$ for significance (against the null hypothesis that $X$ is random and can only explain variance by chance)? Permutation testing seems to work for me; I'm just wondering if there is something similar to an F-test.

Note: I am only interested in applying the method to a low total number of predictors, such as 2 or 3. I am aware of the kitchen-sink regression effect, so I want to make it clear that I do not intend to stretch this design to the extreme.
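A naive version of the permutation test mentioned above can be sketched as follows: shuffle the candidate predictor (which breaks its relation to both $y$ and the other predictors), recompute $G(X)$, and compare with the observed gain. The helper functions, data, and names are illustrative assumptions, not part of the question:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept column."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def gain(X_other, x, y):
    """G(x): R^2 gain from adding predictor x to the other predictors."""
    return r_squared(np.column_stack([X_other, x]), y) - r_squared(X_other, y)

def permutation_pvalue(X_other, x, y, n_perm=999, seed=0):
    """Fraction of shuffled-x gains that reach the observed gain."""
    rng = np.random.default_rng(seed)
    observed = gain(X_other, x, y)
    exceed = sum(gain(X_other, rng.permutation(x), y) >= observed
                 for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)   # add-one correction

# Toy data: x2 has a genuine partial effect on y beyond x1.
rng = np.random.default_rng(42)
n = 150
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)
y = x1 + 0.8 * x2 + rng.normal(size=n)

p = permutation_pvalue(x1[:, None], x2, y)
```

Note that shuffling $x$ alone is the simplest possible scheme; more careful permutation strategies for regression (e.g. permuting residuals) exist and may behave better under strong collinearity.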

Best Answer

If you are willing to assume normally distributed errors, you could simply apply a likelihood ratio test as the models you want to compare are nested.

Edit: Assuming normal errors $\epsilon$ (with variance $= 1$), the likelihood function of the parameters in the model $$y_i = \theta_1x_{1i} + \theta_2x_{2i} + \theta_3x_{3i} + \epsilon_i$$ is given by $$\mathcal L(\theta) = (2\pi)^{-\frac n2}\exp\left(-\frac 12\Vert y - X\theta\Vert_2^2\right),$$ where $y$ is the $n\times 1$ vector of $y_i$'s, $X$ is the $n\times 3$ matrix with rows $(x_{1i}, x_{2i}, x_{3i})$ and $\theta = (\theta_1,\theta_2,\theta_3)'$. The likelihood ratio for testing, say, $\theta_3 = 0$ is given by $$\Lambda = \frac{\sup_{\theta\in\mathbb R^2\times\{0\}}\mathcal L(\theta)}{\sup_{\theta\in\mathbb R^3}\mathcal L(\theta)},$$ and the test statistic $-2\log\Lambda$ is approximately $\chi^2$-distributed with 1 degree of freedom.
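With known unit error variance, $-2\log$ of the likelihood ratio reduces to the difference of residual sums of squares between the reduced and full models, which makes the test easy to sketch. The simulated data below (with $\theta_3 = 0$, so the null holds) is an illustrative assumption:

```python
import numpy as np
from scipy import stats

def rss(X, y):
    """Residual sum of squares of an OLS fit (no intercept, as in the model above)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 3))
theta = np.array([1.0, 0.5, 0.0])       # theta_3 = 0: the null hypothesis is true
y = X @ theta + rng.normal(size=n)      # unit-variance normal errors

# -2 log Lambda = RSS(reduced) - RSS(full) when sigma^2 = 1
lrt = rss(X[:, :2], y) - rss(X, y)
p = stats.chi2.sf(lrt, df=1)            # approximate chi^2(1) p-value
```

When the error variance is unknown and must be estimated, the exact finite-sample analogue of this comparison is the standard partial F-test for nested linear models.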

Strictly speaking, this test does not assess whether the improvement in goodness of fit in terms of $R^2$ is significant; rather, it assesses whether the improvement in terms of the likelihood, which can itself be seen as a measure of goodness of fit (see, for instance, the Nagelkerke $R^2$, McFadden $R^2$, and Cox and Snell's $R^2$), is statistically significant.

The $R^2$, in the above example, has a Beta distribution with shape parameters $1$ and $(n - 3)/2$. However, the difference of two dependent (!) Beta-distributed random variables is, I guess, no standard distribution. I don't think that following this track will help you achieve what you want, which is why I suggested testing the difference in likelihoods using the likelihood ratio test.