[Math] If $Y=X\beta+\epsilon$, prove that the least squares estimator $\hat\beta$ is independent of $Y-X\hat{\beta}$

linear-algebra, linear-regression, normal-distribution, statistics

Let $Y=X\beta+\epsilon$, where $Y$ is an $n$ by $1$ vector, $X$ is an $n$ by $p$ matrix with full rank, and $\epsilon$ is an $n$ by $1$ vector of random errors, independently and normally distributed with mean vector $0$ and variance-covariance matrix $\Sigma=\sigma^2 I$, with $0$ being an $n$ by $1$ vector of zeros and $I$ being the $n$ by $n$ identity matrix. Prove that the least squares estimator $\hat{\beta}$ and $Y-X\hat{\beta}$ are independent vectors.

I have already obtained $\hat{\beta}=(X^TX)^{-1}X^{T}Y$, but when I try to prove the independence I get stuck.
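As a concrete sanity check of this formula, here is a minimal numerical sketch (not part of the question) using NumPy, with made-up values of $n$, $p$, $\beta$ and $\sigma$: it computes $\hat\beta=(X^TX)^{-1}X^TY$ via the normal equations and compares the result with NumPy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions and parameters (not specified in the question).
n, p, sigma = 200, 3, 1.5
X = rng.normal(size=(n, p))             # full-rank design matrix
beta = np.array([2.0, -1.0, 0.5])       # "true" coefficients
eps = rng.normal(scale=sigma, size=n)   # epsilon ~ N(0, sigma^2 I)
Y = X @ beta + eps

# Closed-form least squares estimator: beta_hat = (X^T X)^{-1} X^T Y,
# computed by solving the normal equations rather than forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))  # True
```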

Best Answer

With $\hat\beta:=(X^TX)^{-1}X^TY$ the vector of estimated parameters and $r:=Y-X\hat\beta$ the vector of residuals, introduce the hat matrix $H:=X(X^TX)^{-1}X^T$. Check that $X\hat\beta=HY$ and $HX=X$, so
$$r:=Y-X\hat\beta=Y-HY=(I-H)Y=(I-H)(X\beta +\epsilon)=(I-H)\epsilon.\tag1$$

In particular $E(r)=(I-H)E(\epsilon)=0$, so to show $r$ and $\hat\beta$ are uncorrelated it is enough to show $E(r\hat\beta^T)=0$. Indeed:
$$
\begin{align}
E(r\hat\beta^T)&=E\left(\left[(I-H)\epsilon\right]\left[(X^TX)^{-1}X^TY\right]^T\right)\\
&\stackrel{(a)}=E\left((I-H)\epsilon Y^TX(X^TX)^{-1}\right)\\
&\stackrel{(b)}=(I-H)E(\epsilon Y^T) X(X^TX)^{-1}.\tag2
\end{align}
$$
In (a) we use the fact that $(X^TX)^{-1}$ is symmetric, hence equal to its own transpose; in (b) we note that everything except $\epsilon Y^T$ is non-stochastic. The expectation of $\epsilon Y^T$ is
$$E(\epsilon Y^T)=E\bigl(\epsilon(\epsilon^T+\beta^TX^T)\bigr)=E(\epsilon\epsilon^T)=:\Sigma=\sigma^2I,\tag3$$
since $E(\epsilon)\beta^TX^T=0$, so the RHS of (2) equals $\sigma^2\underbrace{(I-H)X}_{0}(X^TX)^{-1} = 0$.

Finally, $r=(I-H)\epsilon$ and $\hat\beta=\beta+(X^TX)^{-1}X^T\epsilon$ are both affine functions of the single Gaussian vector $\epsilon$, hence jointly normal; for jointly normal vectors, zero cross-covariance implies independence. Therefore $\hat\beta$ and $Y-X\hat\beta$ are independent.
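Not part of the original answer, but a small simulation sketch (again with hypothetical $n$, $p$, $\beta$, $\sigma$) can illustrate the two facts the proof relies on: the identity $(I-H)X=0$ and the vanishing cross-covariance between $r$ and $\hat\beta$, which should come out numerically zero up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy setup (dimensions and parameters are made up).
n, p, sigma, reps = 50, 2, 1.0, 20_000
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0])

# Hat matrix H = X (X^T X)^{-1} X^T and the key identity (I - H) X = 0.
H = X @ np.linalg.solve(X.T @ X, X.T)
print(np.allclose((np.eye(n) - H) @ X, 0.0))  # True

# Monte Carlo estimate of Cov(r, beta_hat): every entry should be ~ 0.
beta_hats = np.empty((reps, p))
residuals = np.empty((reps, n))
for i in range(reps):
    eps = rng.normal(scale=sigma, size=n)
    Y = X @ beta + eps
    beta_hats[i] = np.linalg.solve(X.T @ X, X.T @ Y)
    residuals[i] = Y - H @ Y                  # r = (I - H) Y

# Sample cross-covariance between residuals and beta_hat coordinates.
cross_cov = (residuals - residuals.mean(0)).T @ (beta_hats - beta_hats.mean(0)) / (reps - 1)
print(np.abs(cross_cov).max())                # close to zero, up to Monte Carlo error
```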