Solved – Difference between RSQ function in Excel and Regression in the Excel Data Analysis Add-in

excelr-squaredregression

When I use the RSQ function between one x variable and y variable, the resulting $R^2$ differs from the $R^2$ values given by running regression analysis from excel's Data Analysis.

For example, here's what I get from regression:

Multiple R          0.738117539
R Square            0.544817502
Adjusted R Square   0.479791431

And RSQ gives me a value of 0.6594.

Here's the data:

Y: 
56,324.0
72,347.0
95,803.3
92,903.7
104,859.9
127,584.9
131,030.9
137,358.6
129,092.2
135,803.4

X:
54.4
65.2
72.5
96.9
61.5
79.5
111.3
111.7
108.6
107.2

Which one is the most accurate out of these 4 values? What's the difference? Also, why are there calculations for Multiple R and Adjusted R Square when I've only done a single variable regression (to my [very limited] understanding, those are for multi variable regressions); should those just be disregarded?

Thanks!

Best Answer

Running this regression in R:

lm(y ~ x, data = data.frame(x = x, y = y))

Provides the $R^2$ and $R^2_{\text{adj}}$ of the regression:

Multiple R-squared:  0.6589 
Adjusted R-squared:  0.6163

Note that $R^2$ and $R^2_{\text{adj}}$ are related by the formula,

$$ R^2_{\text{adj}} = 1 - \frac{n-1}{n-p-1}(1-R^2), $$

with $n$ being the number of observations ($10$ in this case), and $p$ the number of regressors ($1$ in this case). We see that $R^2_{\text{adj}}$ penalizes models with large $p$, in contrast to $R^2$ which never decreases with the addition of a regressor.


See also: