Regression – General Linear Hypothesis Test Statistic: Equivalence of Two Expressions Explained

anova, linear model, regression

Assume a general linear model $y = X \beta + \epsilon$ with observations in an $n$-vector $y$ and an $(n \times p)$ design matrix $X$ of rank $p$ for the $p$ parameters in a $p$-vector $\beta$. A general linear hypothesis (GLH) about $q$ of these parameters ($q < p$) can be written as $\psi = C \beta$, where $C$ is a $(q \times p)$ matrix. An example of a GLH is the one-way ANOVA hypothesis, where $C \beta = 0$ under the null.

The GLH test uses a restricted model with design matrix $X_{r}$ in which the $q$ tested parameters are set to 0 and the corresponding $q$ columns of $X$ are removed. The unrestricted model with design matrix $X_{u}$ imposes no restrictions and thus contains $q$ more free parameters: its parameters are a superset of those from the restricted model, and the columns of $X_{u}$ are a superset of those of $X_{r}$.
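For concreteness, here is a minimal toy example of this setup (my own construction: three groups coded as an intercept plus two dummy columns, so the GLH sets the last $q = 2$ parameters to zero and $X_{r}$ drops the corresponding columns):

```python
# Toy setup (assumption: 3 groups, reference coding, 4 observations per group).
import numpy as np

m = 4                                   # observations per group
g = np.repeat([0, 1, 2], m)             # group labels
n, p, q = g.size, 3, 2                  # n observations, p parameters, q tested

X_u = np.column_stack([np.ones(n),              # intercept
                       (g == 1).astype(float),  # dummy for group 1
                       (g == 2).astype(float)]) # dummy for group 2
X_r = X_u[:, :1]                        # restricted design: tested columns removed

C = np.array([[0.0, 1.0, 0.0],          # C @ beta picks out the q tested
              [0.0, 0.0, 1.0]])         # parameters; H0: C @ beta = 0
```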

$P_{u} = X_{u}(X_{u}'X_{u})^{-1} X_{u}'$ is the orthogonal projection onto the subspace $V_{u}$ spanned by the columns of $X_{u}$, and analogously $P_{r}$ onto $V_{r}$; then $V_{r} \subset V_{u}$. The parameter estimates of a model are $\hat{\beta} = X^{+} y = (X'X)^{-1} X' y$, the predictions are $\hat{y} = P y$, the residuals are $e = (I-P)y$, the sum of squared residuals SSE is $||e||^{2} = e'e = y'(I-P)y$, and the estimate for $\psi$ is $\hat{\psi} = C \hat{\beta}$. The difference $SSE_{r} - SSE_{u}$ is $y'(P_{u}-P_{r})y$. Now the univariate $F$ test statistic for a GLH that is familiar (and understandable) to me is:
$$
F = \frac{(SSE_{r} - SSE_{u}) / q}{\hat{\sigma}^{2}} = \frac{y' (P_{u} - P_{r}) y / q}{y' (I - P_{u}) y / (n - p)}
$$

There's an equivalent form that I don't yet understand:
$$
F = \frac{(C \hat{\beta})' (C(X'X)^{-1}C')^{-1} (C \hat{\beta}) / q}{\hat{\sigma}^{2}}
$$
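Before attacking the algebra, here is a quick self-contained numerical check (my own toy three-group example again; all names are mine) that the two expressions really do agree:

```python
# Numerically compare the projection form and the quadratic-form expression of F.
import numpy as np

def proj(X):
    """Orthogonal projection onto the column space of X."""
    return X @ np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(0)
m = 4
g = np.repeat([0, 1, 2], m)
n, p, q = g.size, 3, 2
X_u = np.column_stack([np.ones(n), (g == 1).astype(float), (g == 2).astype(float)])
X_r = X_u[:, :1]
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

y = rng.normal(size=n) + 0.5 * (g == 2)          # arbitrary toy response

P_u, P_r = proj(X_u), proj(X_r)
SSE_u = y @ (np.eye(n) - P_u) @ y
SSE_r = y @ (np.eye(n) - P_r) @ y
F_projection = ((SSE_r - SSE_u) / q) / (SSE_u / (n - p))

beta_hat = np.linalg.solve(X_u.T @ X_u, X_u.T @ y)
psi_hat = C @ beta_hat
middle = np.linalg.inv(C @ np.linalg.inv(X_u.T @ X_u) @ C.T)
F_quadratic = (psi_hat @ middle @ psi_hat / q) / (SSE_u / (n - p))

print(F_projection, F_quadratic)                 # identical up to rounding
assert np.isclose(F_projection, F_quadratic)
```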

As a start
$$
\begin{array}{rcl}
(C \hat{\beta})' (C(X'X)^{-1}C')^{-1} (C \hat{\beta}) &=& (C (X'X)^{-1} X' y)' (C(X'X)^{-1}C')^{-1} (C (X'X)^{-1} X' y) \\
~ &=& y' X (X'X)^{-1} C' (C(X'X)^{-1}C')^{-1} C (X'X)^{-1} X' y
\end{array}
$$

  • How do I see that $P_{u} - P_{r} = X (X'X)^{-1} C' (C(X'X)^{-1}C')^{-1} C (X'X)^{-1} X'$? (A numerical check of this identity appears after this list.)
  • What is the explanation for / motivation behind the numerator of the second test statistic? I can see that $C(X'X)^{-1}C'$ is $V(C \hat{\beta}) / \sigma^{2} = (\sigma^{2} C(X'X)^{-1}C') / \sigma^{2}$, but I can't put these pieces together.
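For what it's worth, the identity in the first bullet does check out numerically in the toy example (same construction as above, names are my own):

```python
# Check numerically that P_u - P_r equals the "sandwich" matrix from the
# quadratic form, using the toy 3-group design from above.
import numpy as np

def proj(X):
    """Orthogonal projection onto the column space of X."""
    return X @ np.linalg.solve(X.T @ X, X.T)

m = 4
g = np.repeat([0, 1, 2], m)
n = g.size
X_u = np.column_stack([np.ones(n), (g == 1).astype(float), (g == 2).astype(float)])
X_r = X_u[:, :1]
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

XtX_inv = np.linalg.inv(X_u.T @ X_u)
sandwich = X_u @ XtX_inv @ C.T @ np.linalg.inv(C @ XtX_inv @ C.T) @ C @ XtX_inv @ X_u.T

print(np.allclose(proj(X_u) - proj(X_r), sandwich))   # True
```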

Best Answer

For your second question, you have $\mathbf{y}\sim N(\mathbf{X}\boldsymbol{\beta},\sigma^2 \mathbf{I})$ and suppose you're testing $\mathbf{C}\boldsymbol{\beta}=\mathbf{0}$. So, under this null hypothesis we have that (the following is all shown through matrix algebra and properties of the normal distribution; I'm happy to walk through any of these details)

$ \mathbf{C}\hat{\boldsymbol{\beta}}\sim N(\mathbf{0}, \sigma^2 \mathbf{C(X'X)^{-1}C'}). $

And so,

$ \textrm{Cov}(\mathbf{C}\hat{\boldsymbol{\beta}})=\sigma^2 \mathbf{C(X'X)^{-1}C'}, $

which leads to noting that

$ F_1 = \frac{(\mathbf{C}\hat{\boldsymbol{\beta}})'[\mathbf{C(X'X)^{-1}C'}]^{-1}\mathbf{C}\hat{\boldsymbol{\beta}}}{\sigma^2}\sim \chi^2 \left(q\right). $

You get the above result because the numerator of $F_1$ is a quadratic form and by invoking a certain theorem. This theorem states that if $\mathbf{x}\sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then $\mathbf{x'Ax}\sim \chi^2 (r,\lambda)$, a (possibly noncentral) chi-square with $r=\textrm{rank}(\mathbf{A})$ degrees of freedom and noncentrality parameter $\lambda=\frac{1}{2}\boldsymbol{\mu}'\mathbf{A}\boldsymbol{\mu}$, iff $\mathbf{A}\boldsymbol{\Sigma}$ is idempotent. [The proof of this theorem is a bit long and tedious, but it's doable. Hint: use the moment generating function of $\mathbf{x'Ax}$.]

So, since $\mathbf{C}\hat{\boldsymbol{\beta}}$ is normally distributed and the numerator of $F_1$ is a quadratic form involving $\mathbf{C}\hat{\boldsymbol{\beta}}$, we can use the above theorem. The idempotent part is immediate here: with $\mathbf{A} = [\mathbf{C(X'X)^{-1}C'}]^{-1}/\sigma^2$ and $\boldsymbol{\Sigma} = \sigma^2\,\mathbf{C(X'X)^{-1}C'}$ we get $\mathbf{A}\boldsymbol{\Sigma} = \mathbf{I}_q$, which is idempotent with rank $q$, and under the null $\boldsymbol{\mu} = \mathbf{0}$, so the noncentrality parameter vanishes.
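If it helps, the $\chi^2(q)$ claim is also easy to check by simulation. Here is a small self-contained sketch (a toy design of my own: three groups with an intercept and two dummies, true coefficients chosen so that $\mathbf{C}\boldsymbol{\beta}=\mathbf{0}$ holds) comparing the empirical mean and variance of $F_1$ with the $\chi^2(q)$ values $q$ and $2q$:

```python
# Simulate F_1 under H0 and compare its moments with those of chi-square(q).
import numpy as np

rng = np.random.default_rng(1)
m, sigma = 4, 1.0
g = np.repeat([0, 1, 2], m)
n, q = g.size, 2
X = np.column_stack([np.ones(n), (g == 1).astype(float), (g == 2).astype(float)])
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
beta_null = np.array([1.0, 0.0, 0.0])          # satisfies C @ beta = 0

XtX_inv = np.linalg.inv(X.T @ X)
middle = np.linalg.inv(C @ XtX_inv @ C.T)

stats = []
for _ in range(20000):
    y = X @ beta_null + sigma * rng.normal(size=n)
    psi_hat = C @ XtX_inv @ X.T @ y            # = C @ beta_hat
    stats.append(psi_hat @ middle @ psi_hat / sigma**2)

stats = np.array(stats)
print(stats.mean(), stats.var())               # close to q = 2 and 2q = 4
```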

Then,

$ F_2 = \frac{\mathbf{y}'[\mathbf{I} - \mathbf{X(X'X)^{-1}X'}]\mathbf{y}}{\sigma^2}\sim \chi^2(n-p), $

since the residual projection $\mathbf{I} - \mathbf{X(X'X)^{-1}X'}$ has rank $n-p$ when $\mathbf{X}$ has full column rank $p$ (as in your setup).

Through some tedious details, you can show that $F_1$ and $F_2$ are independent (intuitively, $\mathbf{C}\hat{\boldsymbol{\beta}}$ is a function of the fitted values $P_{u}\mathbf{y}$, while $F_2$ is a function of the residuals $(\mathbf{I}-P_{u})\mathbf{y}$, and these two pieces are independent under normality). From there you should be able to justify your second $F$ statistic.
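To make that last step explicit (this is just the standard construction of an $F$ variable as a ratio of independent chi-squares, each divided by its degrees of freedom, so the unknown $\sigma^2$ cancels):

$ F = \frac{F_1/q}{F_2/(n-p)} = \frac{(\mathbf{C}\hat{\boldsymbol{\beta}})'[\mathbf{C(X'X)^{-1}C'}]^{-1}\mathbf{C}\hat{\boldsymbol{\beta}}/q}{\mathbf{y}'[\mathbf{I} - \mathbf{X(X'X)^{-1}X'}]\mathbf{y}/(n-p)} \sim F(q,\, n-p) $

under the null, which is exactly your second expression.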
