R-Squared – Relation Between $R^2$ of Simple Regression and Multiple Regression

least-squares, multiple-regression, r-squared, regression

A very basic question concerning the $R^2$ of OLS regressions:

  1. Run the OLS regression y ~ x1; it has an $R^2$ of, say, 0.3.
  2. Run the OLS regression y ~ x2; it has another $R^2$ of, say, 0.4.
  3. Now run the regression y ~ x1 + x2. What values can this regression's $R^2$ take?

I think it's clear the $R^2$ for the multiple regression should be no less than 0.4, but is it possible for it to be more than 0.7?
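
(One way to see the lower bound: each simple regression is nested in the multiple regression, so adding a regressor can never increase the residual sum of squares. In terms of $R^2$,

$$R^2_{y \sim x_1 + x_2} = 1 - \frac{\text{RSS}_{y \sim x_1 + x_2}}{\text{TSS}} \;\geq\; 1 - \frac{\text{RSS}_{y \sim x_2}}{\text{TSS}} = R^2_{y \sim x_2} = 0.4.)$$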

Best Answer

The second regressor can simply make up for what the first did not manage to explain in the dependent variable. Here is a numerical example:

Generate x1 as a standard normal regressor, with sample size $n = 20$. Without loss of generality, take $y_i = 0.5x_{1i} + u_i$, where the errors $u_i$ are also $N(0,1)$. Now take the second regressor x2 to be simply the difference between the dependent variable and the first regressor.

set.seed(1)  # optional: fix the seed for reproducibility (any seed works)
n  <- 20
x1 <- rnorm(n)             # first regressor: standard normal
y  <- 0.5 * x1 + rnorm(n)  # dependent variable, errors also N(0,1)
x2 <- y - x1               # second regressor: difference between y and x1

summary(lm(y ~ x1))$r.squared       # R^2 of y on x1 alone
summary(lm(y ~ x2))$r.squared       # R^2 of y on x2 alone
summary(lm(y ~ x1 + x2))$r.squared  # R^2 of the multiple regression: exactly 1
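
The first two $R^2$ values vary with the random draw, but the third is exactly 1 (up to floating point): by construction $y = x_1 + x_2$ holds identically, so the multiple regression fits the data perfectly with coefficients $(0, 1, 1)$,

$$y_i = 0 + 1 \cdot x_{1i} + 1 \cdot x_{2i},$$

giving zero residual sum of squares and hence $R^2 = 1$. So the multiple regression's $R^2$ can indeed exceed 0.7, and even the sum of the two individual $R^2$ values.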