[Math] How to prove that $\hat\sigma^2$ has $\chi^2_{n-p}$ distribution (linear regression)

statistical-inference, statistics

Consider the linear regression model:

$$Y_i=r(x_i)+\varepsilon_i\equiv\sum_{j = 1}^p x_{ij} \beta _j + \varepsilon _i,\quad i=1,\ldots,n.$$

where $x_1,\ldots,x_n\in \mathbb{R}^p$ are fixed, $E(\varepsilon_i)=0$,
$\operatorname{Var}(\varepsilon_i)=\sigma^2$. Denote $Y=(Y_1,\ldots,Y_n)^T$, $\beta=(\beta_1,\ldots,\beta_p)^T$, $X=(x_{ij})_{n\times p}$. As is known, $\hat \beta = \arg\min\limits_{\beta \in \mathbb{R}^p} (Y - X\beta)^T (Y - X\beta)=(X^TX)^{-1}X^TY$ if the matrix $X^TX$ is invertible, and so an estimator of $r(x)$ at $x=(x_1,\ldots,x_p)\in\mathbb{R}^p$ is given by $\hat r_n(x)=x^T\hat\beta$.

Let
$$\hat{\sigma}^2 = \frac{1}{\sigma^2}\sum_{i = 1}^n (Y_i - {\hat r}_n (x_i))^2.$$
I am stuck on the following claim: $\hat\sigma^2$ has a $\chi^2_{n-p}$ distribution. How can this be proved?

Best Answer

In the linear model with normally distributed errors, $$Y \sim \mathcal{N}_n(X\beta, \sigma^2I_n),$$ where $I_n$ is the $n \times n$ identity matrix. Moreover, $$\sum_{i=1}^n\left(Y_i - \hat{r}_n(x_i)\right)^2 = \|Y-X\hat{\beta}\|_2^2 = Y^TQ_LY,$$ where $Q_L$ is the orthogonal projection onto $L^\bot$, the orthogonal complement of $L := \{ X \beta : \beta \in \mathbb{R}^p\}$. The last equality holds because $Y - X\hat\beta = Q_LY$ and, $Q_L$ being a symmetric projection matrix, $Q_L^TQ_L = Q_L$.

(Explicitly, $Q_L = I_n - P_L = I_n - X(X^TX)^{-1}X^T$, where $P_L$ is the orthogonal projection onto $L$.)

It is also important to determine the rank of $Q_L$. If $X$ has full rank $p$, then $P_L$ has rank $p$. The rank of a projection matrix equals its trace, so $\mathrm{rank}(Q_L) = \operatorname{tr}(I_n) - \operatorname{tr}(P_L) = n - p$ (the details are standard linear algebra).
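As a numerical sanity check of these projection facts, here is a minimal, dependency-free Python sketch. The $5 \times 2$ design matrix and the helper names `matmul`, `transpose` and `residual_projection` are made up for illustration and are not part of the original answer; the explicit $2 \times 2$ inverse means the sketch only covers $p = 2$.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def residual_projection(X):
    """Return Q_L = I_n - X (X^T X)^{-1} X^T for an n x 2 matrix X of full rank."""
    n = len(X)
    Xt = transpose(X)
    (a, b), (c, d) = matmul(Xt, X)      # X^T X is 2 x 2
    det = a * d - b * c                 # non-zero because X has full rank
    inv = [[d / det, -b / det], [-c / det, a / det]]
    P = matmul(matmul(X, inv), Xt)      # P_L, the projection onto L
    return [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]


# Hypothetical design: an intercept column plus one covariate (n = 5, p = 2).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
n, p = len(X), len(X[0])
Q = residual_projection(X)

QQ = matmul(Q, Q)
max_idempotence_error = max(abs(QQ[i][j] - Q[i][j]) for i in range(n) for j in range(n))
max_symmetry_error = max(abs(Q[i][j] - Q[j][i]) for i in range(n) for j in range(n))
trace_Q = sum(Q[i][i] for i in range(n))

# Both errors should be at rounding level, and the trace should be close to n - p.
print(max_idempotence_error, max_symmetry_error, trace_Q, n - p)
```

For a symmetric idempotent matrix the rank equals the trace, so a trace of $n - p$ here agrees with the rank claim for $Q_L$.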

Next, consider $Z := \frac{1}{\sigma} Y$, which is $\mathcal{N}_n(X\beta/\sigma, I_n)$-distributed.

Now I use the following fact: for a $\mathcal{N}_n(\mu,I_n)$-distributed random vector $W$ and an $n \times n$ orthogonal projection matrix $P$ of rank $r$, the quadratic form $$W^TPW$$ is $\chi^2_r$-distributed with noncentrality parameter $$\delta^2 = \mu^TP\mu.$$
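A short sketch of why this fact holds, assuming $P$ is a symmetric (orthogonal) projection of rank $r$ and writing $W$ for the $\mathcal{N}_n(\mu, I_n)$ vector; the matrix $U$ is notation introduced here for illustration. Pick an $n \times r$ matrix $U$ whose columns form an orthonormal basis of the range of $P$, so that $P = UU^T$ and $U^TU = I_r$. Then
$$W^TPW = \|U^TW\|_2^2, \qquad U^TW \sim \mathcal{N}_r(U^T\mu, I_r),$$
so $W^TPW$ is a sum of $r$ independent squared normal variables with unit variance. That is exactly a $\chi^2_r$ variable with noncentrality parameter $\|U^T\mu\|_2^2 = \mu^TP\mu$.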

With that you can easily verify that \begin{align*} \hat{\sigma}^2 &= \frac{1}{\sigma^2} Y^TQ_LY = \left(\frac{1}{\sigma}Y\right)^T Q_L \left(\frac{1}{\sigma}Y\right) = Z^TQ_LZ, \end{align*} and since $\mathrm{rank}(Q_L) = n-p$ it follows that $$\hat{\sigma}^2 \sim \chi^2_{n-p}(\delta^2)$$ with noncentrality parameter $\delta^2 = \mu^TQ_L\mu$, where $\mu$ is the mean of $Z$. Finally, we have to show that $\delta^2 = 0$.

In our case $\mu = X\beta / \sigma$. Since $\beta / \sigma$ is a vector in $\mathbb{R}^p$, we have $\mu \in L$. But for $\mu \in L$ the product $Q_L \mu$ is $0$, because $\mu \bot L^\bot$, and hence $\delta^2 = 0$.
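The same conclusion can be checked directly from the explicit formula $Q_L = I_n - X(X^TX)^{-1}X^T$, assuming $X^TX$ is invertible as in the question:
$$Q_L X = X - X(X^TX)^{-1}X^TX = X - X = 0, \qquad\text{so}\qquad Q_L\mu = \frac{1}{\sigma}Q_LX\beta = 0 \quad\text{and}\quad \delta^2 = \mu^TQ_L\mu = 0.$$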

All in all, $$\hat{\sigma}^2 \sim \chi^2_{n-p}.$$
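The result can also be checked empirically. Below is a rough Monte Carlo sketch using only the Python standard library, under assumptions that are not in the original post: a simple linear regression with an intercept and one covariate (so $p = 2$), $n = 10$ fixed design points, normal errors, and arbitrary values for $\beta$ and $\sigma$. The helper name `scaled_rss` is hypothetical. A $\chi^2_{n-p}$ variable has mean $n - p$ and variance $2(n - p)$, so the sample moments should land near $8$ and $16$.

```python
import random
import statistics


def scaled_rss(x, y, sigma):
    """Return (1 / sigma^2) * residual sum of squares of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return rss / sigma ** 2


rng = random.Random(0)                  # fixed seed for reproducibility
x = [float(i) for i in range(10)]       # fixed design points, n = 10
beta0, beta1, sigma = 1.0, 2.0, 3.0     # arbitrary illustrative parameters
n, p = len(x), 2                        # intercept + slope

draws = []
for _ in range(20000):
    # Simulate Y_i = beta0 + beta1 * x_i + eps_i with eps_i ~ N(0, sigma^2).
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    draws.append(scaled_rss(x, y, sigma))

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

# Compare with the theoretical moments n - p and 2 (n - p).
print(sample_mean, n - p)
print(sample_var, 2 * (n - p))
```

Matching the first two moments does not prove the distributional claim, but a clear mismatch would indicate an error in the degrees of freedom.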

I hope this makes it clear. To understand the proof in every detail you have to go quite deep into probability theory and linear algebra (especially projections).