Solved – Share of variance explained by one variable

Tags: linear regression, variance

I have two independent variables, x1 and x2.

If I fit a linear model using only x1, I get this result:

  • Model 1: R^2 = 0.66, p-value = 4.34E-10 (***)

If I fit a linear model using both x1 and x2, I get this result:

  • Model 2: R^2 = 0.66, p-value x1 = 9.68E-05 (***), p-value x2 = 0.56 ( )

Given that there is no increase in R^2 from Model 1 to Model 2, and considering that x2 is not significant, I would say that x2 explains 0% of the variance of the dependent variable.

However, if I fit a linear model using only x2, I get this result:

  • Model 3: R^2 = 0.48, p-value = 1.17E-06 (***)

In Model 3, x2 alone explains 48% of the variance and is highly significant.

So, what is the share of variance explained by x2: 0% or 48%?

Note: the two variables are also correlated with each other, with Pearson r = 0.75.
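
For concreteness, here is a minimal sketch that reproduces this kind of pattern with simulated data (not the actual data; the sample size, coefficients, and noise levels are placeholders chosen so that x1 and x2 correlate at roughly r = 0.75, and so that y depends on x1 only):

```python
# Sketch: simulate two correlated predictors and fit the three models with OLS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 60

x1 = rng.normal(size=n)
x2 = 0.75 * x1 + rng.normal(scale=0.66, size=n)   # correlated with x1 (r ~ 0.75)
y = 2.0 * x1 + rng.normal(scale=1.5, size=n)      # y driven by x1 only

def fit(X):
    """OLS of y on the given predictor(s) plus an intercept."""
    return sm.OLS(y, sm.add_constant(X)).fit()

m1 = fit(x1)                          # Model 1: x1 only
m2 = fit(np.column_stack([x1, x2]))   # Model 2: x1 and x2
m3 = fit(x2)                          # Model 3: x2 only

print("Model 1 R^2:", round(m1.rsquared, 3))
print("Model 2 R^2:", round(m2.rsquared, 3),
      "p-values (const, x1, x2):", m2.pvalues.round(4))
print("Model 3 R^2:", round(m3.rsquared, 3))
```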

Best Answer

You've actually stumbled on a pretty interesting issue that gets covered in linear regression courses. The answer to your question ("what is the share...") is that it depends on which model you're using.

Because your variables are highly correlated, they explain much of the same variance in the dependent variable. While each variable is useful for predicting y on its own, once you put them both into the model at the same time they're "explaining the same variance", so only one of them will appear to be "significant".
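
One way to see that overlap numerically is to compare x2's R^2 on its own with its incremental R^2 once x1 is already in the model. A rough sketch on simulated data (an assumption, not the asker's data): the marginal R^2 is sizeable, the incremental R^2 is near zero, and the difference is the variance shared with x1.

```python
# Sketch: marginal vs incremental R^2 for x2, using simulated correlated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 60
x1 = rng.normal(size=n)
x2 = 0.75 * x1 + rng.normal(scale=0.66, size=n)
y = 2.0 * x1 + rng.normal(scale=1.5, size=n)

r2_x1 = sm.OLS(y, sm.add_constant(x1)).fit().rsquared
r2_x2 = sm.OLS(y, sm.add_constant(x2)).fit().rsquared
r2_both = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().rsquared

print("marginal R^2 of x2 alone      :", round(r2_x2, 3))
print("incremental R^2 of x2 over x1 :", round(r2_both - r2_x1, 3))
print("variance shared with x1       :", round(r2_x1 + r2_x2 - r2_both, 3))
```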

As an extreme example, imagine fitting a model $y = \beta_{1}x_{1}+\beta_{2}x_{1}+\epsilon$, with the same variable entered twice (this is unrealistic, since the two copies are perfectly collinear and you would run into a slew of estimation problems, but it illustrates the idea), and asking which of the two terms is useful in predicting y. In each of the separate models $y = \beta_{1}x_{1}+\epsilon$ and $y = \beta_{2}x_{1}+\epsilon$ you would of course find that the predictor was useful. But in the combined model, since both terms explain exactly the same variance in y, you could set either $\beta_{1}$ or $\beta_{2}$ to 0 and predict y just as well.
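
To make the extreme example concrete, here is a small sketch (simulated data, purely illustrative): with two identical copies of $x_{1}$ in the design matrix, least squares splits the weight between them arbitrarily, and shifting all of it onto one copy leaves the fitted values unchanged.

```python
# Sketch of the extreme example: regress y on two identical copies of x1.
# The design matrix is rank deficient, so numpy's lstsq returns the
# minimum-norm solution; any split of the total weight between the two
# copies predicts y equally well.
import numpy as np

rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), x1, x1])       # intercept + x1 entered twice
beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # minimum-norm least-squares fit
print("fitted coefficients:", beta.round(3))    # the x1 weight is split ~evenly

# Moving all the weight onto the first copy leaves the fit unchanged:
beta_alt = beta.copy()
beta_alt[1] += beta_alt[2]
beta_alt[2] = 0.0
print("max |difference in fitted values|:",
      np.abs(X @ beta - X @ beta_alt).max())    # ~0 up to rounding error
```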