Regression – Comprehensive Understanding of Residual Standard Error (RSE)

Tags: linear model, regression, residuals, standard deviation, standard error

I am reading the book "An Introduction to Statistical Learning" and I have trouble understanding its explanation of the RSE (Residual Standard Error). This is what the book says:

"The RSE is an estimate of the standard deviation of \epsilon . Roughly speaking, it is the average amount that the response will deviate from the true regression line. It is computed using the formula :"

$$\mathrm{RSE} = \sqrt{\frac{1}{n-2}\,\mathrm{RSS}} = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

What I don't understand here is the definition of the RSE as the "standard deviation of $\epsilon$".

If Residual $= y_i - \hat{y}_i$,

and, going by my assumption, $\epsilon_i$ is also $= y_i - \hat{y}_i$,

isn't the RSE formula just computing the root mean square of $\epsilon$?

If $\epsilon_i = y_i - \hat{y}_i$,

then the standard deviation of $\epsilon$ would be

$$\sqrt{\frac{\sum\left[(y_i - \hat{y}_i) - \overline{(y - \hat{y})}\right]^2}{n-2}},$$

which is not what the formula above does.

So, technically, I think the RSE is just the root mean square of the residuals ($\epsilon$), and it would be wrong to call it the "standard deviation of $\epsilon$" if we go by the actual formula for a standard deviation.

Or, in my opinion, it should mean the standard deviation of the predicted $\hat{y}$ values rather than of $\epsilon$.

So, is this a misuse of the term "standard deviation" in the book, or am I missing something?

Please correct me or help me understand.

Best Answer

I believe this is because in OLS (with an intercept) the average of the residuals is exactly $0$. Hence the formula for the standard deviation of the residuals ($\epsilon$) can be written as:

$$\sqrt{\frac{\sum (\epsilon_i - \mu_{\epsilon})^2}{n-2}}$$

but since in OLS $\mu_{\epsilon}=0$ by construction (see the explanation of why this is so in this Mathematics.SE answer, and the short sketch below), you are left with:

$$\sqrt{\frac{\sum (\epsilon_i - 0)^2}{n-2}} = \sqrt{\frac{\sum \epsilon_i^2}{n-2}} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}},$$

since $\epsilon_i = y_i - \hat{y}_i$. (Strictly speaking, the $\epsilon_i$ in the book are the unobservable errors around the true regression line, while $y_i - \hat{y}_i$ are the residuals that estimate them; the RSE is computed from the residuals and is an estimate of the standard deviation of $\epsilon$, which is exactly what the book says.)
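As for why $\mu_{\epsilon}=0$ whenever the model includes an intercept, a one-line sketch for simple linear regression: the first normal equation of least squares (the derivative of the residual sum of squares with respect to the intercept, set to zero) forces the residuals to sum to zero:

$$\frac{\partial}{\partial \hat{\beta}_0}\sum_{i}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2 = -2\sum_{i}\epsilon_i = 0 \quad\Rightarrow\quad \mu_{\epsilon} = 0.$$

And a quick numeric check, in case it helps: a minimal Python sketch (assuming numpy is available; the simulated data, seed, and coefficients are made up for illustration) that fits a line by OLS and computes the book's RSE formula next to the mean-subtracted standard-deviation formula. The two agree because the residual mean vanishes.

```python
import numpy as np

# Made-up data for illustration: y = 2 + 3x + noise with true sigma = 1.5.
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 3.0 * x + rng.normal(scale=1.5, size=n)

# OLS fit with an intercept (column of ones), via least squares.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

print(resid.mean())  # ~0 (up to floating-point error), by construction

# Book's RSE formula: sqrt(RSS / (n - 2)).
rse = np.sqrt(np.sum(resid**2) / (n - 2))
# "Textbook" standard-deviation formula, subtracting the residual mean.
sd = np.sqrt(np.sum((resid - resid.mean()) ** 2) / (n - 2))
print(rse, sd)  # essentially identical, since the residual mean is zero
```

Both numbers should also land near the true error standard deviation of $1.5$, which is the sense in which the RSE estimates the standard deviation of $\epsilon$.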