[Math] How to prove the estimator of $\sigma^2$ in generalized least squares is unbiased

linear algebra, statistics

The model is as follows:

$$y = X\beta + \epsilon, \epsilon \sim (0,\sigma^2 V)$$

where $X$ is $n \times (p+1)$ and $V$ is a known positive definite matrix.

Using the spectral decomposition, it can be established that $V$ is positive definite if and only if there exists a nonsingular matrix $P$ such that $V = PP^T$. Left-multiplying both sides of the model by $P^{-1}$ yields

$$P^{-1}y = P^{-1} X \beta + P^{-1} \epsilon$$
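As a brief sketch of why this transformation helps (the standard whitening argument, spelled out here rather than taken from the original post), the transformed error still has mean zero and its covariance is

$$Var(P^{-1}\epsilon) = P^{-1}(\sigma^2 V)(P^{-1})^T = \sigma^2 P^{-1}PP^T(P^T)^{-1} = \sigma^2 I_n,$$

so the transformed model satisfies the ordinary least squares assumptions with response $P^{-1}y$ and design matrix $P^{-1}X$.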

I managed to derive that $\hat{\beta} = (X^T V^{-1}X)^{-1}X^TV^{-1}y$.
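A quick numerical sanity check of this formula can be sketched in Python with NumPy. Everything below is an illustrative assumption (the sizes `n` and `p`, the random `X`, `V`, and `y`, and the choice of the Cholesky factor as $P$); it only checks that ordinary least squares on the transformed model reproduces the closed-form expression.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 2                      # hypothetical sizes: n observations, p predictors plus an intercept

X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # n x (p+1) design matrix
B = rng.normal(size=(n, n))
V = B @ B.T + n * np.eye(n)      # an arbitrary positive definite V
y = rng.normal(size=n)           # any response vector works for this identity

# Closed-form GLS estimator: (X^T V^{-1} X)^{-1} X^T V^{-1} y
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

# Whitening route: V = P P^T with P the lower Cholesky factor,
# then ordinary least squares on P^{-1} y and P^{-1} X
P = np.linalg.cholesky(V)
X_w = np.linalg.solve(P, X)
y_w = np.linalg.solve(P, y)
beta_whitened, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)

same_estimate = np.allclose(beta_gls, beta_whitened)
print(same_estimate)
```

Any nonsingular $P$ with $V = PP^T$ would serve here; the Cholesky factor is simply a convenient one to compute.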

The estimator for $\sigma^2$ is as follows:
$$\hat{\sigma^2}= \frac{(y-X\hat{\beta})^T V^{-1}(y-X\hat{\beta})}{n-(p+1)} $$
$$= \frac{y^T(V^{-1}-V^{-1}X(X^TV^{-1}X)^{-1}X^TV^{-1})y}{n-(p+1)}$$

But I failed to show that the estimator of $\sigma^2$ is unbiased.

My attempt was to take the expectation of $\hat{\sigma^2}$ and then use the trace to show that

$$E(\hat{\sigma^2})= \frac{tr(V^{-1}-V^{-1}X(X^TV^{-1}X)^{-1}X^TV^{-1})\, E(yy^T)}{n-(p+1)}$$

Then I got stuck. I do not know how to continue.

Can someone kindly help me here please?

Best Answer

There is an error in your formula when using the trace. The trace of a scalar is the scalar itself, and because $tr(AB) = tr(BA)$ whenever both products are defined (even for non-square matrices), you have $E(y^{T}Ay) = E(tr(y^{T}Ay)) = E(tr(Ay\,y^{T}))$, which, if $A$ is non-random, equals $tr(A\,E(y\,y^{T}))$ by linearity of the trace and of the expectation. That is to say, the trace applies to the product $A\,E(y\,y^{T})$, not to $A$ alone.
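The trace identity is easy to check numerically. The following is a minimal sketch with an arbitrary matrix `A` and vector `y` (both made up for illustration); it also shows that taking the trace of $A$ alone and multiplying by $y\,y^{T}$ does not even produce a scalar.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))      # any fixed (non-random) n x n matrix
y = rng.normal(size=n)

quad_form = y @ A @ y                        # the scalar y^T A y
trace_form = np.trace(A @ np.outer(y, y))    # tr(A y y^T), trace taken over the product

# Trace on A alone: tr(A) * (y y^T) is an n x n matrix, not a scalar
wrong_form = np.trace(A) * np.outer(y, y)

print(np.isclose(quad_form, trace_form), wrong_form.shape)
```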

Correcting your formula we get:

$$E(\hat{\sigma}^2)= \frac{tr\left((V^{-1}-V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}V^{-1})\,E(y\,y^{T})\right)}{n-(p+1)}$$

Now $E(y\,y^{T}) = Var(y) + E(y)E(y)^{T} = \sigma^2 V + X\beta\beta^{T}X^{T}$, and if you notice that

$$tr(V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}) = tr((X^{T}V^{-1}X)(X^{T}V^{-1}X)^{-1}) = tr(I_{p+1}) = p+1$$

And that

$$tr(V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}V^{-1}X\beta\beta^{T}X^{T}) = tr(V^{-1}X\beta\beta^{T}X^{T})$$

You get

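$$\begin{aligned} &tr\left((V^{-1}-V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}V^{-1})\,(\sigma^2 V + X\beta\beta^{T}X^{T})\right) \\ &\quad= \sigma^2\,tr(I_{n}) - \sigma^2\,tr(V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}) + tr(V^{-1}X\beta\beta^{T}X^{T}) - tr(V^{-1}X(X^{T}V^{-1}X)^{-1}X^{T}V^{-1}X\beta\beta^{T}X^{T}) \\ &\quad= \sigma^2\,tr(I_{n}) - \sigma^2\,tr(I_{p+1}) \\ &\quad= \sigma^2(n - (p+1)) \end{aligned}$$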

Hence, $E(\hat{\sigma}^2) = \sigma^2$ and the estimator is unbiased.
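As a sanity check on the algebra, the expectation can be evaluated exactly (no simulation) for a concrete example. This is a sketch under made-up assumptions: the sizes, the value of `sigma2`, and the random `X`, `beta`, and `V` are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 3                     # hypothetical sizes
sigma2 = 2.5                     # hypothetical true value of sigma^2

X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # n x (p+1) design matrix
beta = rng.normal(size=p + 1)
B = rng.normal(size=(n, n))
V = B @ B.T + n * np.eye(n)      # an arbitrary positive definite V

# A = V^{-1} - V^{-1} X (X^T V^{-1} X)^{-1} X^T V^{-1}
Vinv = np.linalg.inv(V)
A = Vinv - Vinv @ X @ np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv)

# E(y y^T) = sigma^2 V + X beta beta^T X^T
Eyy = sigma2 * V + np.outer(X @ beta, X @ beta)

# E(sigma_hat^2) = tr(A E(y y^T)) / (n - (p + 1))
expected_sigma2_hat = np.trace(A @ Eyy) / (n - (p + 1))
print(np.isclose(expected_sigma2_hat, sigma2))
```

The $\beta$ term drops out because $AX = 0$, which is the same cancellation used in the derivation.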
