[Math] How to prove the variance of residuals in simple linear regression

statistics

How can I prove the following formula for the variance of the residuals in simple linear regression?

Please help me.

$ \operatorname{var}(r_i)=\sigma^2\left[1-\frac{1}{n}-\dfrac{(x_i-\bar{x})^2}{\sum_{l=1}^{n}(x_l-\bar{x})^2}\right]$

Here is what I tried.

Using $r_i=y_i-\hat{y}_i$ with $\hat{y}_i=\bar{y}+\hat{\beta}_1(x_i-\bar{x})$:

$\operatorname{var}(r_i)=\operatorname{var}(y_i-\hat{y}_i)=\operatorname{var}(y_i-\bar{y})+\operatorname{var}(\hat{\beta}_1(x_i-\bar{x}))-2\operatorname{Cov}((y_i-\bar{y}),\hat{\beta}_1(x_i-\bar{x}))$

How can I go further?

If there's more information needed, please ask me to provide it.

Best Answer

I believe the previous answer posted is incorrect, since $y_i$ and $\hat y_i$ are not uncorrelated. I would prove this as follows:

$\begin{align} \text{Cov}(r) &= \text{Cov}(y - Py), \quad P = X(X^TX)^{-1}X^T \\ & = \text{Cov}((I_n-P)y) \\ & = (I_n - P)\ \text{Cov}(y)\ (I_n - P)^T \\ & = (I_n-P)\ \sigma^2 I_n\ (I_n - P)^T \\ & = \sigma^2 (I_n - P) \end{align}$

where the last step uses the fact that $I_n - P$ is symmetric and idempotent. Reading off the diagonal, we conclude that $\text{var}(r_i)=\sigma^2 (1 - P_{ii})$. It should be quite simple to confirm that your equation is recovered when you let $X$ be the matrix with a column of $1$'s (to represent the intercept) and a second column of the $x_i$'s, since in that case $P_{ii} = \frac{1}{n} + \frac{(x_i-\bar{x})^2}{\sum_{l=1}^{n}(x_l-\bar{x})^2}$.
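If it helps, the identity can also be checked numerically. The following is a minimal sketch using NumPy, not part of the proof itself: the function name `residual_variances`, the sample `x` values, and the choice `sigma2 = 2.5` are all made up for illustration. It builds the hat matrix for a design with an intercept column and compares the diagonal of $(I_n-P)\,\sigma^2 I_n\,(I_n-P)^T$ with the closed-form expression.

```python
import numpy as np

def residual_variances(x, sigma2=1.0):
    """Return var(r_i) computed two ways: via the hat matrix and via the closed form.

    Hypothetical helper for illustration; assumes Cov(y) = sigma2 * I.
    """
    x = np.asarray(x, dtype=float)
    n = x.size

    # Design matrix: a column of ones (intercept) and a column of the x_i.
    X = np.column_stack([np.ones(n), x])

    # Hat matrix P = X (X^T X)^{-1} X^T, computed with solve instead of an explicit inverse.
    P = X @ np.linalg.solve(X.T @ X, X.T)

    # Cov(r) = (I - P) (sigma2 * I) (I - P)^T; its diagonal holds var(r_i).
    M = np.eye(n) - P
    via_hat = np.diag(sigma2 * M @ M.T)

    # Closed form: sigma2 * [1 - 1/n - (x_i - xbar)^2 / sum_l (x_l - xbar)^2].
    dev = x - x.mean()
    closed = sigma2 * (1.0 - 1.0 / n - dev**2 / np.sum(dev**2))

    return via_hat, closed

# Arbitrary example values, chosen only to exercise the formula.
x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
via_hat, closed = residual_variances(x, sigma2=2.5)
agree = np.allclose(via_hat, closed)
print(agree)
```

The two arrays should agree up to floating-point error. As a further sanity check, the variances sum to $\sigma^2(n-2)$, because $\operatorname{tr}(I_n-P)=n-2$ when $X$ has two columns of full rank.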
