$Var[b]$ in multiple regression


Assume a linear regression model $y=X\beta + \epsilon$ with $\epsilon \sim N(0,\sigma^2I)$ and $\hat y=Xb$, where $b=(X'X)^{-1}X'y$. Also, $H=X(X'X)^{-1}X'$ is the projection matrix from the response space onto the column space of $X$, i.e., $\hat y=Hy$.

Now I want to calculate $Var[b]$, but what I get is a $k\times k$ matrix, not an $n \times n$ one. Here's my calculation:

\begin{align}
Var[b] &= Var[(X'X)^{-1}X'y]\\
&= (X'X)^{-1}X'\,\underbrace{Var[y]}_{=\,\sigma^2 I}\,X(X'X)^{-1}\\
&\quad\text{(here you can already see this will be $k \times k$)}\\
&= \sigma^2 \underbrace{(X'X)^{-1}X'X}_{I}(X'X)^{-1}\\
&= \sigma^2 (X'X)^{-1} \in \mathbb{R}^{k\times k}
\end{align}
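A quick numerical sketch (NumPy, with made-up sizes $n=50$, $k=3$ and $\sigma=1$, purely for illustration) reproduces the same $k\times k$ shape:

```python
import numpy as np

# Illustrative sizes only: n observations, k predictors.
rng = np.random.default_rng(0)
n, k, sigma = 50, 3, 1.0

X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(scale=sigma, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y          # b = (X'X)^{-1} X'y
var_b = sigma**2 * XtX_inv     # the formula derived above

print(b.shape)       # (3,)   -> b has k entries
print(var_b.shape)   # (3, 3) -> k x k, just like the algebra says
```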

What am I doing wrong?

Besides, are $E[b]=\beta$, $E[\hat y]=HX\beta$, $Var[\hat y]=\sigma^2H$, $E[y-\hat y]=(I-H)X\beta$, and $Var[y-\hat y]=(I-H)\sigma^2$ correct? (This is just a side note; my main question is the one above.)

Best Answer

The covariance matrix for $b$ (the estimator for $\beta$) should be $k\times k$. If the $X$ matrix is $n\times k$ then $\beta$ has to be $k\times 1$; otherwise the product $X\beta$ wouldn't be $n\times 1$.

So if $\beta$ is a constant vector of $k$ parameters, then its estimator $b$ is a random vector with $k$ elements. The covariance matrix of $b$ therefore collects the covariances between every pair of those $k$ elements, so it must be a $k\times k$ matrix.
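As a sanity check, here is a minimal simulation sketch (NumPy, with a hypothetical design of $n=200$, $k=3$, $\sigma=2$, chosen only for illustration): drawing $y$ repeatedly with $X$ fixed and recomputing $b$ each time gives an empirical covariance matrix that is $k\times k$ and close to $\sigma^2(X'X)^{-1}$.

```python
import numpy as np

# Hypothetical setup; n, k, sigma and beta are chosen only for illustration.
rng = np.random.default_rng(1)
n, k, sigma = 200, 3, 2.0
X = rng.normal(size=(n, k))
beta = np.array([1.0, 0.0, -1.5])
XtX_inv = np.linalg.inv(X.T @ X)

# Draw many samples of y (X held fixed) and compute b each time.
reps = 20000
bs = np.empty((reps, k))
for r in range(reps):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    bs[r] = XtX_inv @ X.T @ y

emp_cov = np.cov(bs, rowvar=False)   # empirical covariance of the k-vector b
theo_cov = sigma**2 * XtX_inv        # sigma^2 (X'X)^{-1}

print(emp_cov.shape)                         # (3, 3): k x k, as it must be
print(np.max(np.abs(emp_cov - theo_cov)))    # small, up to Monte Carlo error
```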

To answer your side note: all your calculations are correct, but some can be simplified further. Check that $HX=X$, so that $E[\hat y]=X\beta$ and $E[y-\hat y]=0$.
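If it helps, the $HX=X$ identity is also easy to verify numerically; this sketch uses an arbitrary random full-rank $X$ (any full-rank design would do):

```python
import numpy as np

# Arbitrary full-rank design, only to illustrate HX = X.
rng = np.random.default_rng(2)
n, k = 30, 4
X = rng.normal(size=(n, k))

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix

print(np.allclose(H @ X, X))                 # True: HX = X, so E[y_hat] = HXbeta = Xbeta
print(np.allclose((np.eye(n) - H) @ X, 0))   # True: (I - H)X = 0, so E[y - y_hat] = 0
```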
