I have been reading about the Coefficient of Determination and am wondering why it is necessarily less than or equal to 1.
I understand that RSS is the sum of the squared differences between each observed value of the dependent variable and its prediction.
So it makes sense that RSS will be zero if the independent variables perfectly predict the dependent ones.
I understand that TSS is the sum of the squared differences between each observed value of the dependent variable and its mean.
But why is RSS/TSS necessarily less than or equal to 1?
Best Answer
A simple thought experiment will help answer your question.
TSS is the sum of (Yi - meanY)^2.
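For reference, the three quantities involved can be written out as below. This is only a sketch in standard notation: the symbols \bar{Y} (the mean of the Yi) and \hat{Y}_i (the fitted value for observation i) are my own shorthand and do not appear in the question.

```latex
\mathrm{TSS} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2, \qquad
\mathrm{RSS} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2, \qquad
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}
```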
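Let us assume that we have a regression line with the value of the mean. If the mean is 5, it will simply be a horizontal line with the value of 5 throughout. The squared deviations from this line sum to exactly TSS, because each one is (Yi - meanY)^2. Least squares picks the line with the smallest RSS, and this horizontal line is one of the candidates whenever the model includes an intercept, so the fitted line can never do worse than it. Now, we have 2 scenarios here:
1. The independent variables do not help at all, so the fitted line is the horizontal line at the mean and RSS = TSS.
2. The fitted line lies closer to the data than the horizontal line at the mean, so RSS < TSS.
So, we only have 2 possible scenarios: either RSS = TSS or RSS < TSS. Since RSS is a sum of squares, it cannot be negative, so RSS/TSS lies between 0 and 1. This implies that R-square = 1 - RSS/TSS will always be between 0 and 1 for a least-squares fit that includes an intercept.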
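If you want to check this numerically, here is a minimal sketch in plain Python. It assumes a simple one-variable least-squares fit with an intercept; the function names (fit_line, rss_tss, r_squared) and the sample data are made up for illustration and are not from any library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (intercept included)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    return a, b


def rss_tss(xs, ys):
    """Return (RSS, TSS) for the least-squares line through the data."""
    a, b = fit_line(xs, ys)
    y_bar = mean(ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - y_bar) ** 2 for y in ys)
    return rss, tss


def r_squared(xs, ys):
    """R-square = 1 - RSS/TSS (not defined when every y is identical, since TSS = 0)."""
    rss, tss = rss_tss(xs, ys)
    return 1 - rss / tss


xs = [1, 2, 3, 4, 5]

# Scenario 1: x carries no linear information about y, so the fitted
# line is the horizontal line at the mean and RSS equals TSS.
ys_flat = [5, 3, 4, 3, 5]
print(rss_tss(xs, ys_flat), r_squared(xs, ys_flat))

# Scenario 2: y follows x closely, so RSS is much smaller than TSS.
ys_trend = [2.1, 3.9, 6.2, 7.8, 10.1]
print(rss_tss(xs, ys_trend), r_squared(xs, ys_trend))
```

In the first data set the slope comes out as zero, so RSS and TSS coincide and R-square is 0. In the second, RSS is a small fraction of TSS and R-square is close to 1. In neither case can RSS exceed TSS, because the horizontal line at the mean is always available to the fit.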