Solved – SLR: Variance of a residual

covariance, regression, residuals

I am having trouble calculating the variance of a residual in a simple linear regression (SLR) setting,
i.e. $\text{var}(y_i - \hat{y}_i)$. Here is what I have so far.

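If $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ and

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

(where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the ordinary least squares estimates of $\beta_0$ and $\beta_1$, and the errors $\varepsilon_i$ are uncorrelated with mean $0$ and variance $\sigma^2$), then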

$\text{var}(e_i) = \text{var}(y_i - \hat{y}_i)$
$= \text{var}(y_i) + \text{var}(\hat{y}_i) - 2\,\text{cov}(y_i, \hat{y}_i)$
$= \sigma^2 + \sigma^2\left(\frac{1}{n} + \frac{(x_i-\bar{x})^2}{\sum_j (x_j-\bar{x})^2}\right) - \,?$

Now what is screwing me up is the covariance term. When I was studying prediction intervals, we assumed that $\hat{y}_i$ has no bearing on $y_i$. A classmate told me that this is not the case here and that $\text{cov}(y_i, \hat{y}_i) = \text{var}(y_i)$. Which option is correct, if any? And what is the difference between the variance we find for prediction intervals and this one?

Thanks in advance!

Best Answer

With prediction intervals, you're doing that calculation for a new observation that was not used in the estimation, so (by the regression assumption of uncorrelated errors) $\hat{y}_i$ is uncorrelated with that observation and the covariance term is zero.

With residuals the observation is used in the estimation, so the two are dependent.
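To put numbers on that contrast, here is a minimal Python sketch (not part of the original answer). It relies on the standard SLR results: writing $h_i = \frac{1}{n} + \frac{(x_i-\bar{x})^2}{\sum_j (x_j-\bar{x})^2}$, so that $\text{var}(\hat{y}_i) = \sigma^2 h_i$, the residual has variance $\sigma^2(1 - h_i)$, while the error in predicting a new observation at the same $x_i$ has variance $\sigma^2(1 + h_i)$. The function names, the design points `x`, and the value of `sigma2` are made up for illustration.

```python
import numpy as np

def leverage(x):
    """h_i = 1/n + (x_i - xbar)^2 / sum_j (x_j - xbar)^2 for simple linear regression."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    return 1.0 / x.size + dev**2 / np.sum(dev**2)

def residual_variance(x, sigma2):
    """var(y_i - yhat_i) when observation i WAS used in the fit."""
    return sigma2 * (1.0 - leverage(x))

def prediction_error_variance(x, sigma2):
    """var(y_new - yhat_i) for a NEW observation at x_i, not used in the fit."""
    return sigma2 * (1.0 + leverage(x))

# Hypothetical design points and error variance, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sigma2 = 4.0

# For these x the leverages are 0.6, 0.3, 0.2, 0.3, 0.6 (up to rounding).
print("leverage:            ", leverage(x))
print("residual variance:   ", residual_variance(x, sigma2))
print("prediction variance: ", prediction_error_variance(x, sigma2))
```

The only difference between the two is the sign in front of $h_i$: the covariance term is zero for a new observation, but it removes $2\,\text{var}(\hat{y}_i)$ for an observation that took part in the fit.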

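$\text{cov}(y_i, \hat{y}_i) = \text{cov}(\hat{y}_i + y_i - \hat{y}_i, \hat{y}_i) = \text{cov}(\hat{y}_i, \hat{y}_i) + \text{cov}(y_i - \hat{y}_i, \hat{y}_i) = \text{var}(\hat{y}_i) + 0$

The last covariance is zero because OLS residuals are uncorrelated with the fitted values. So $\text{cov}(y_i, \hat{y}_i) = \text{var}(\hat{y}_i)$, not $\text{var}(y_i)$, and your expansion becomes $\text{var}(e_i) = \sigma^2 + \text{var}(\hat{y}_i) - 2\,\text{var}(\hat{y}_i) = \sigma^2 - \text{var}(\hat{y}_i) = \sigma^2\left(1 - \frac{1}{n} - \frac{(x_i-\bar{x})^2}{\sum_j (x_j-\bar{x})^2}\right)$.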
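If you want to convince yourself numerically, a small Monte Carlo check along these lines should do it. This is only a sketch under assumed values: the helper `simulate_slr`, the design points, the true coefficients, and the normal errors are all hypothetical choices, not something from the question.

```python
import numpy as np

def simulate_slr(x, beta0, beta1, sigma, n_sims, seed=0):
    """Simulate n_sims datasets y = beta0 + beta1*x + eps and refit OLS on each.

    Returns (y, y_hat), both of shape (n_sims, n).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    eps = rng.normal(0.0, sigma, size=(n_sims, x.size))
    y = beta0 + beta1 * x + eps
    dev = x - x.mean()
    b1 = (y @ dev) / np.sum(dev**2)          # OLS slope for each simulated dataset
    b0 = y.mean(axis=1) - b1 * x.mean()      # OLS intercept for each simulated dataset
    y_hat = b0[:, None] + b1[:, None] * x
    return y, y_hat

# Hypothetical setup: sigma^2 = 4 and leverage h = 0.6 at the first design point.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y, y_hat = simulate_slr(x, beta0=1.0, beta1=2.0, sigma=2.0, n_sims=200_000)

i = 0                                        # look at the first observation
e = y[:, i] - y_hat[:, i]                    # residual e_i across simulations

cov_y_yhat = np.cov(y[:, i], y_hat[:, i])[0, 1]
var_yhat = y_hat[:, i].var(ddof=1)
cov_e_yhat = np.cov(e, y_hat[:, i])[0, 1]
var_e = e.var(ddof=1)

# Theory for this setup: var(yhat_i) = 4 * 0.6 = 2.4 and var(e_i) = 4 * 0.4 = 1.6.
print("cov(y_i, yhat_i):", cov_y_yhat)       # should be close to var(yhat_i)
print("var(yhat_i):     ", var_yhat)         # should be close to 2.4
print("cov(e_i, yhat_i):", cov_e_yhat)       # should be close to 0
print("var(e_i):        ", var_e)            # should be close to 1.6
```

The estimates will wobble a little from run to run if you change the seed, but $\text{cov}(y_i, \hat{y}_i)$ should track $\text{var}(\hat{y}_i)$ rather than $\text{var}(y_i) = \sigma^2$.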

For additional details, see here