A question on linear regression of why Var($\hat{\beta_1}~|~X=x_i$) = $\frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}$

linear regression, statistics, variance


First, notice that $Var(e_i | X = x) = \sigma^2$.
Also, $E(y_i | X = x) = \beta_0 + \beta_1 x$.
Then, since $\beta_0 + \beta_1 x$ is a constant given $X = x$, $Var(y_i | X = x) = Var(e_i | X = x) = \sigma^2$.

Since $\sum_{i=1}^n (x_i - \bar{x}) = 0$, the term $\bar{y} \sum_{i=1}^{n} (x_i - \bar{x})$ vanishes.
Then, $\hat{\beta_1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$.

So, Var($\hat{\beta_1}~|~X=x_i$) = Var($\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} | X=x_i$) = Var($\frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} | X=x_i$).

Also, given $X$, each weight $\frac{x_i - \bar{x}}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$ is a constant, and the $y_i$ are independent.
Then, we have that
Var($\frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} | X=x_i$)
= $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 Var(y_i | X=x_i)}{[\sum_{i=1}^{n} (x_i - \bar{x})^2]^2}$

Recall that $Var(y_i | X=x_i) = Var(e_i | X=x_i) = \sigma^2$.
So, we have that $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 Var(y_i | X=x_i)}{[\sum_{i=1}^{n} (x_i - \bar{x})^2]^2}$
= $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sigma^2}{[\sum_{i=1}^{n} (x_i - \bar{x})^2]^2}$
= $\sigma^2 \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{[\sum_{i=1}^{n} (x_i - \bar{x})^2]^2}$
= $\frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$.

I am just really confused about why Var($\hat{\beta_1}~|~X=x_i$) = $\frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}$.

Thanks for helping me out!

Best Answer

Your calculations are correct. The issue with your answer is that you did not take into account that $\beta_0$ is known; instead, you used the result from ordinary least squares, in which neither $\beta_0$ nor $\beta_1$ is known. Since $\beta_0$ is known here, you should minimize only with respect to $\beta_1$, not $\beta_0$. In other words, you are effectively running the regression:

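$$\tilde{Y}_i = X_i \beta_1 + e_i,$$ where $\tilde{Y}_i = Y_i - \beta_0$. So least squares amounts to:

$$ \hat{\beta}_1 = \underset{\beta_1}{\text{argmin}} \sum_i (\tilde{Y}_i - X_i \beta_1)^2 $$

Setting the derivative with respect to $\beta_1$ to zero gives $\sum_i X_i (\tilde{Y}_i - X_i \hat{\beta}_1) = 0$, so

$$ \hat{\beta}_1 = \frac{\sum_i \tilde{Y}_i X_i}{\sum_i X_i^2}$$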
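
As a short worked step (assuming, as in the question, that the model $Y_i = \beta_0 + \beta_1 X_i + e_i$ holds), substituting $\tilde{Y}_i = X_i \beta_1 + e_i$ into this estimator separates it into the true slope plus a weighted sum of the errors:

$$ \hat{\beta}_1 = \frac{\sum_i X_i (X_i \beta_1 + e_i)}{\sum_i X_i^2} = \beta_1 + \frac{\sum_i X_i e_i}{\sum_i X_i^2} $$

Given $X$, the weights $X_i / \sum_j X_j^2$ are constants, so only the $e_i$ contribute to the variance.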

Now exactly the argument you used yields that:

$$ \text{Var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{\sum_{i=1}^n X_i^2}$$
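
If it helps to see the difference numerically, here is a minimal simulation sketch comparing the two estimators. The parameter values, the design points `x`, the normal errors, and all variable names are assumptions chosen purely for illustration; nothing here comes from the original problem. It holds $X$ fixed, redraws the errors many times, and compares the empirical variance of each slope estimate with its formula.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values, chosen only for illustration.
beta0, beta1, sigma = 2.0, 0.5, 1.0
x = np.linspace(1.0, 5.0, 20)  # fixed design, so we condition on X
n_sims = 100_000

# Each row is one simulated sample y_1..y_n for the same x.
e = rng.normal(0.0, sigma, size=(n_sims, x.size))
y = beta0 + beta1 * x + e

# Slope when beta0 is known: regress (y - beta0) on x through the origin.
slope_known = ((y - beta0) @ x) / np.sum(x**2)

# Usual OLS slope when beta0 is estimated as well.
xc = x - x.mean()
slope_ols = (y @ xc) / np.sum(xc**2)

# Theoretical conditional variances from the two formulas.
var_known_theory = sigma**2 / np.sum(x**2)
var_ols_theory = sigma**2 / np.sum(xc**2)

print("beta0 known:     empirical", slope_known.var(), "theory", var_known_theory)
print("beta0 estimated: empirical", slope_ols.var(), "theory", var_ols_theory)
```

Each empirical variance should land close to its own formula. Since $\sum_i X_i^2 = \sum_i (X_i - \bar{X})^2 + n\bar{X}^2$, the known-$\beta_0$ estimator never has the larger variance of the two.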