Variance – Is Coefficient of Determination Explained Variance or Ratio of Explained to Total Variance?

r-squared, variance

I have found in multiple texts that the coefficient of determination, or R-squared, is often referred to as the "variance explained". To be precise, however, it seems to be the ratio of the explained variance to the total variance.

Wouldn't the variance explained be the total variance minus the unexplained variance?

Best Answer

The total variance decomposes into the explained variance and the unexplained variance (it gets more complicated than this if you deviate from ordinary least squares, but I think you’re in that setting).

$$ \text{Total Variance}=\text{Unexplained Variance} + \text{Explained Variance} $$

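These often go by other names, respectively: total sum of squares (TSS), sum of squared residuals (SSRes), and sum of squares of the regression (SSReg). Strictly speaking, these are sums of squares rather than variances, but dividing each by the same sample size turns them into variances and leaves the ratios unchanged.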

$$ TSS = SSRes + SSReg $$

Then…

$$ R^2=\dfrac{SSReg}{TSS}=\dfrac{TSS-SSRes}{TSS}=1-\dfrac{SSRes}{TSS} $$

You’re right that the explained variance is the total variance minus the unexplained variance, and that is exactly what the numerator of the equation for $R^2$ contains. If you learned a different way to calculate $R^2$, it would be equivalent to what I gave above, even if the equivalence is not obvious.
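To make the decomposition concrete, here is a minimal numerical sketch, assuming simple ordinary least squares with an intercept. The data and the helper names `sums_of_squares` and `r_squared` are made up for illustration. It checks that $TSS = SSRes + SSReg$, that the two forms of $R^2$ agree, and that a different-looking formula (the squared correlation between $x$ and $y$) gives the same number in this setting.

```python
import numpy as np


def sums_of_squares(x, y):
    """Fit y = a + b*x by ordinary least squares; return (TSS, SSRes, SSReg)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept column; the decomposition relies on it.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    tss = np.sum((y - y.mean()) ** 2)          # total sum of squares
    ss_res = np.sum((y - fitted) ** 2)         # sum of squared residuals
    ss_reg = np.sum((fitted - y.mean()) ** 2)  # sum of squares of the regression
    return tss, ss_res, ss_reg


def r_squared(x, y):
    """R^2 computed as 1 - SSRes / TSS."""
    tss, ss_res, _ = sums_of_squares(x, y)
    return 1.0 - ss_res / tss


# Made-up data, purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

tss, ss_res, ss_reg = sums_of_squares(x, y)

# Each check should print True for OLS with an intercept.
print(np.isclose(tss, ss_res + ss_reg))
print(np.isclose(ss_reg / tss, 1.0 - ss_res / tss))
print(np.isclose(r_squared(x, y), np.corrcoef(x, y)[0, 1] ** 2))
```

The squared-correlation check only applies to the one-predictor case; with several predictors the analogous quantity would be the squared correlation between $y$ and the fitted values.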

If your reference says that $R^2$ is the explained variance, it is being a bit loose with its phrasing for my taste. $R^2$ is the proportion of the total variance that is explained, so while it is related to the explained variance, the two are not synonyms.
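One way to see that the two are different quantities is to change the units of the response. In the sketch below, which again assumes simple OLS with an intercept and uses made-up data and a hypothetical helper `explained_variance_and_r2`, the explained variance changes with the scale of $y$ while $R^2$, being a ratio, does not.

```python
import numpy as np


def explained_variance_and_r2(x, y):
    """Return (explained variance, R^2) for a simple OLS fit of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(x, y, 1)
    fitted = intercept + slope * x
    explained_var = np.var(fitted)  # SSReg / n, since mean(fitted) == mean(y)
    total_var = np.var(y)           # TSS / n
    return explained_var, explained_var / total_var


# Made-up data: the same response expressed in two different units.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_metres = np.array([1.2, 1.9, 3.1, 3.8, 5.2])
y_centimetres = 100.0 * y_metres

ev_m, r2_m = explained_variance_and_r2(x, y_metres)
ev_cm, r2_cm = explained_variance_and_r2(x, y_centimetres)

# The explained variance is in squared units of y, so it scales by 100**2.
print(np.isclose(ev_cm, 100.0 ** 2 * ev_m))
# R^2 is unitless and unchanged by the rescaling.
print(np.isclose(r2_m, r2_cm))
```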