Solved – Linear regression $y_i=\beta_0 + \beta_1x_i + \epsilon_i$ covariance between $\bar{y}$ and $\hat{\beta}_1$

Tags: covariance, linear-regression, sum

I am currently reading through slides from Georgia Tech on linear regression and came across a section that has confused me. It states that, for
$$
y_i=\beta_0+\beta_1x_i+\epsilon_i
$$

where $\epsilon_i \sim N(0,\sigma^2)$ and
$$
\hat{\beta}_1=\frac{\sum_{i=1}^n(x_i-\bar{x})y_i}{\sum_{i=1}^n(x_i-\bar{x})^2}
$$
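As a quick numerical sketch of this estimator (all names, sample sizes, and parameter values here are my own, not from the slides), one can simulate data from the model and apply the formula directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, n)                 # fixed design points
beta0, beta1, sigma = 2.0, 3.0, 1.5       # true parameters (chosen for illustration)
y = beta0 + beta1 * x + rng.normal(0, sigma, n)

# beta1_hat = sum((x_i - xbar) * y_i) / sum((x_i - xbar)^2)
xbar = x.mean()
beta1_hat = np.sum((x - xbar) * y) / np.sum((x - xbar) ** 2)
print(beta1_hat)  # close to the true slope 3
```

With this design the standard error of the slope is well under 0.1, so the estimate lands near 3.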

the covariance is
$$
\begin{align*}
Cov(\bar{y},\hat{\beta}_1)&=\frac{1}{\sum_{i=1}^n(x_i-\bar{x})^2}Cov\Big(\bar{y},\sum_{i=1}^n(x_i-\bar{x})y_i\Big) \\
&=\frac{\sum_{i=1}^n(x_i-\bar{x})}{\sum_{i=1}^n(x_i-\bar{x})^2}Cov\Big(\bar{y},y_i\Big) \\
&=\frac{\sum_{i=1}^n(x_i-\bar{x})}{\sum_{i=1}^n(x_i-\bar{x})^2}\frac{\sigma^2}{n} \\
&= 0
\end{align*}
$$

Now, I assume that it becomes 0 because of the $\sum_{i=1}^n(x_i-\bar{x})=0$ factor. What confuses me is how we can pull the $\sum_{i=1}^n(x_i-\bar{x})$ term out of $Cov\Big(\bar{y},\sum_{i=1}^n(x_i-\bar{x})y_i\Big)$, since $y_i$ is part of the summation and is not constant across $i$ (or so I thought).

Best Answer

You don't pull out $\sum_{i=1}^n (x_i - \bar{x})$. Instead, you pull out $(x_i - \bar{x})$ $n$ times, once per term of the sum. Throughout, the sequence $(x_i)_{1 \leq i \leq n}$ is taken to be non-random. The way the original argument is written up is a little confusing; the following may be easier to understand:
$$
\begin{align*}
Cov(\bar{y},\hat{\beta}_1)&=\frac{1}{\sum_{j=1}^n(x_j-\bar{x})^2}Cov\Big(\bar{y},\sum_{i=1}^n(x_i-\bar{x})y_i\Big) \\
&=\frac{1}{\sum_{j=1}^n(x_j-\bar{x})^2}Cov\Big(\bar{y},(x_1-\bar{x})y_1 + \dotsm + (x_n-\bar{x})y_n \Big) \\
&=\frac{(x_1-\bar{x})Cov(\bar{y}, y_1) + \dotsm + (x_n-\bar{x})Cov(\bar{y}, y_n)}{\sum_{j=1}^n(x_j-\bar{x})^2} \\
&=\frac{\sum_{i=1}^n(x_i-\bar{x})Cov\big(\bar{y},y_i\big)}{\sum_{j=1}^n(x_j-\bar{x})^2} \\
&=\frac{\sum_{i=1}^n(x_i-\bar{x})}{\sum_{j=1}^n(x_j-\bar{x})^2}\cdot\frac{\sigma^2}{n} \\
&= 0.
\end{align*}
$$

To evaluate $Cov(\bar{y}, y_i)$, note that $Cov(y_j, y_i)=0$ for $j \neq i$ (the errors $\epsilon_j$ are independent) and write
$$
\begin{align*}
Cov(\bar{y}, y_i) &= n^{-1} Cov(y_1 + \dotsm + y_n, y_i) \\
&= n^{-1} \Big(0 + \dotsm + 0 + Cov(y_i, y_i) + 0 + \dotsm + 0\Big) \\
&= n^{-1} Var(y_i) = n^{-1} Var(\beta_0 + \beta_1 x_i + \epsilon_i) = n^{-1} Var(\epsilon_i) = \sigma^2/n.
\end{align*}
$$

The final step then uses $\sum_{i=1}^n(x_i-\bar{x}) = \sum_{i=1}^n x_i - n\bar{x} = 0$, which is why the whole expression vanishes.
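The result can also be checked by simulation (a sketch with design, parameter values, and variable names of my own choosing): holding the $x_i$ fixed and regenerating the errors many times, the sample covariance between $\bar{y}$ and $\hat{\beta}_1$ should be close to 0.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 30, 20000
x = np.linspace(0, 5, n)                  # fixed (non-random) design
beta0, beta1, sigma = 1.0, 2.0, 1.0       # illustrative true parameters

# Weights so that beta1_hat = sum(w_i * y_i), matching the formula above.
w = (x - x.mean()) / np.sum((x - x.mean()) ** 2)

ybars = np.empty(reps)
b1s = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0, sigma, n)
    ybars[r] = y.mean()
    b1s[r] = np.sum(w * y)

print(np.cov(ybars, b1s)[0, 1])  # ≈ 0
```

With 20000 replications the Monte Carlo standard error of this covariance estimate is on the order of $10^{-4}$, so the printed value sits very near zero, as the derivation predicts.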
