Solved – The variance of the linear regression estimator $\beta_1$


Can we say

$$\text{Var}(\beta_1) = \text{Var}\left(\frac{\sum (x_i-\bar x)y_i}{\sum (x_i- \bar x)^2}\right) = \left(\frac{\sum (x_i-\bar x)}{\sum (x_i- \bar x)^2}\right)^2 \text{Var}(y_i) \;\;??$$

I am not sure whether I can separate the $x_i$'s from $\sum (x_i-\bar x)y_i$. The expression looks like a linear combination of the $y_i$'s. Is this legitimate because every $y_i$ follows the same distribution?

Best Answer

This appears to be simple linear regression. If the $x_i$'s are treated as deterministic, they carry no variance of their own, so a decomposition of this kind does hold, under the additional assumptions that the error term (and hence $y$ as well) has an identical distribution for all $i$, and that the error terms (and hence the $y_i$'s) are independent for all $i \neq j$. Note, though, that the square must be applied to each weight separately before summing, not to the sum of the weights, as the derivation below shows.
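For concreteness, these are the standard simple-linear-regression assumptions; the symbols $\beta_0$, $\varepsilon_i$ and $\sigma^2$ below are introduced here and do not appear in the question:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \text{E}(\varepsilon_i) = 0, \qquad \text{Var}(\varepsilon_i) = \sigma^2, \qquad \text{Cov}(\varepsilon_i,\varepsilon_j) = 0 \;\; (i \neq j)$$

with the $x_i$'s fixed, so that $\text{Var}(y_i) = \sigma^2$ for every $i$.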

For compactness, denote $$z_i = \frac{x_i-\bar x}{\sum (x_i- \bar x)^2}$$

Then

$$\text{Var}(\beta_1) = \text{Var}\left(\sum z_iy_i\right)$$

The assumption of deterministic $x$'s permits us to treat the $z_i$'s as constants, so the general formula for the variance of a linear combination applies:

$$\text{Var}\left(\sum z_iy_i\right) = \sum_i z_i^2\,\text{Var}(y_i) + \sum_{i\neq j} z_iz_j\,\text{Cov}(y_i,y_j)$$

The assumption of independence permits us to set the covariances between $y_i$ and $y_j$, $i\neq j$, equal to zero. These two give

$$\text{Var}(\beta_1) = \sum z_i^2\text{Var}(y_i)$$

Finally, the assumption of identically distributed $y$'s implies that $\text{Var}(y_i)= \text{Var}(y_j) \;\; \forall i,j$ and so permits us to write

$$\text{Var}(\beta_1) = \text{Var}(y_i)\sum z_i^2$$

Since $\sum z_i^2 = \dfrac{\sum (x_i-\bar x)^2}{\left(\sum (x_i- \bar x)^2\right)^2} = \dfrac{1}{\sum (x_i- \bar x)^2}$, this is the familiar

$$\text{Var}(\beta_1) = \frac{\text{Var}(y_i)}{\sum (x_i- \bar x)^2}$$
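As a sanity check, here is a minimal Monte Carlo sketch of this result, assuming NumPy; the design points, true coefficients, error standard deviation and simulation count are arbitrary illustrative choices, not anything taken from the question:

```python
import numpy as np

# Monte Carlo check of Var(beta_1) = Var(y_i) / sum((x_i - xbar)^2)
# with fixed (deterministic) x's. All numeric values are illustrative.
rng = np.random.default_rng(0)

x = np.linspace(0.0, 10.0, 25)        # fixed design points
beta0, beta1, sigma = 2.0, 0.5, 1.5   # true intercept, slope, error sd
n_sims = 200_000

# Weights z_i = (x_i - xbar) / sum((x_j - xbar)^2), as defined above
z = (x - x.mean()) / np.sum((x - x.mean()) ** 2)

# Each row of y is one simulated sample y = beta0 + beta1*x + eps,
# and each slope estimate is sum_i z_i * y_i.
eps = rng.normal(0.0, sigma, size=(n_sims, x.size))
y = beta0 + beta1 * x + eps
slope_estimates = y @ z

print("empirical  Var:", slope_estimates.var())
print("theoretical   :", sigma**2 / np.sum((x - x.mean()) ** 2))
```

The two printed numbers should agree closely, and the agreement tightens as `n_sims` grows.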