Linear Algebra – How to Prove SSE and SSR Are Independent

linear-algebra, regression, regression-analysis, statistical-inference, statistics

Consider $Y=X\beta+\varepsilon$, where $X$ is $n\times p$, $\beta$ is $p\times 1$, and $\varepsilon$ is $n\times 1$ with covariance matrix $\operatorname{var}(\varepsilon)=\sigma^2 I$.

Give expressions for the regression and error sums of squares, find their expected values, and show that they are independent.

My work:
One has $SSE=Y^{T}Y-\hat{\beta}^{T}X^{T}Y=Y^{T}(I-X(X^{T}X)^{-1}X^{T})Y$ and $SSR=Y^{T}(X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J)Y$. For $SSE$ it is easy to get the distribution, but I have difficulty getting the distribution of $SSR$: I was trying to prove that $X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J$ is idempotent, but it does not seem easy to me.

I also have difficulty proving $(I-X(X^{T}X)^{-1}X^{T})(X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J)=0$. Can someone help me here?

Best Answer

Anyway, here is a series of results that will help you understand what is going on (I am assuming the Euclidean inner product, i.e. $\langle u,v\rangle=u'v$).

Definition: Let $V\subset \mathbb{R}^n$ be a subspace. $P_{n\times n}$ is the (orthogonal) projection onto $V$ if

1) $\forall x\in V$ we have $Px=x$, and

2) $\forall x\in V^{\perp}$ we have $Px=0$.

Theorem 1: $P_{n\times n}$ is the projection onto $V\subset \mathbb{R}^n$ $\iff$ $P$ is idempotent, symmetric, and $\mathscr{C}(P)=V$.

Theorem 2: Let $o_1,\dots,o_r$ be any orthonormal basis of $V$ (with $\dim V=r\leq n$). Then the projection matrix onto $V$ is $P=OO'$, where $O=[o_1,\dots,o_r]_{n\times r}$.

Theorem 3: Suppose you are given an $n\times p$ matrix $X$ with rank $p$. Then $X(X'X)^{-1}X'$ (call it $P_X$) is the projection onto $\mathscr{C}(P_X)=\mathscr{C}(X)$. Also, by the previous theorem, if you find an orthonormal basis of $p$ vectors for $\mathscr{C}(X)$ (since the columns of $X$ are independent you can always do Gram–Schmidt), then $P_X$ will also equal $OO'$ (since the projection onto a given space is unique).
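(Not part of the original answer, just a minimal numerical sketch of Theorems 2–3: it uses `numpy`, with `np.linalg.qr` standing in for Gram–Schmidt, and checks that the two constructions of $P_X$ agree and that $P_X$ is symmetric and idempotent.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3
X = rng.standard_normal((n, p))            # full column rank with probability 1

# Projection onto C(X) via the normal-equations formula (Theorem 3)
P_X = X @ np.linalg.inv(X.T @ X) @ X.T

# Orthonormal basis of C(X): QR factorisation plays the role of Gram-Schmidt
Q, _ = np.linalg.qr(X)                     # columns of Q span C(X)
P_from_Q = Q @ Q.T                         # Theorem 2: P = OO'

print(np.allclose(P_X, P_from_Q))          # same matrix -> True
print(np.allclose(P_X, P_X @ P_X))         # idempotent  -> True
print(np.allclose(P_X, P_X.T))             # symmetric   -> True
```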

Theorem 4: Let $Y\sim N_n(\mu,I)$ and let $P$ be any projection matrix. Then $Y'PY \sim \chi^2_{\operatorname{rank}(P)}(\frac{1}{2}\mu'P\mu)$, i.e. with $d.f.=\operatorname{rank}(P)$ and non-centrality parameter $\frac{1}{2}\mu'P\mu$.

Proof: It is not hard at all. Since $P$ is symmetric it admits a spectral decomposition $P=\Gamma D \Gamma'$, so $Y'PY=Y'\Gamma D \Gamma'Y=Z'DZ=\sum_i d_i Z_i^2$, where $Z=\Gamma'Y\sim N(\Gamma'\mu,I)$. Because $P$ is also idempotent, every eigenvalue $d_i$ is $0$ or $1$, with exactly $\operatorname{rank}(P)$ ones, so the sum is a sum of $\operatorname{rank}(P)$ independent squared normals, which gives the stated non-central $\chi^2$.
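(Again not part of the original answer: a small Monte-Carlo sketch of the central case of Theorem 4, checking that the simulated quadratic form has roughly the mean $\operatorname{rank}(P)$ and variance $2\operatorname{rank}(P)$ of a $\chi^2_{\operatorname{rank}(P)}$ variable.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 10, 3
X = rng.standard_normal((n, p))
P = X @ np.linalg.inv(X.T @ X) @ X.T       # projection matrix of rank p

# Central case (mu = 0): Y'PY should be chi-square with df = rank(P) = p
Y = rng.standard_normal((n, 100_000))      # each column is one replicate of Y ~ N_n(0, I)
q = np.einsum('ik,ij,jk->k', Y, P, Y)      # quadratic form Y'PY per replicate

print(np.linalg.matrix_rank(P))            # p
print(q.mean(), q.var())                   # approximately p and 2p, the chi^2_p moments
```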

Fact: Under (joint) normality, zero correlation $\iff$ independence.

Assume $\epsilon \sim N_n(0,\sigma^2I)$. Then $Y \sim N_n(X\beta,\sigma^2I)$. In this set-up $\mu=X\beta\in\mathscr{C}(X)$. Hence, applying Theorem 4 to $Y/\sigma \sim N_n(X\beta/\sigma,I)$ and using $\operatorname{rank}(I-P_X)=n-p$, we get $SSE/\sigma^2=Y'(I-P_X)Y/\sigma^2 \sim \chi^2_{n-p}(0)$; in particular $E(SSE)=(n-p)\sigma^2$.

[Important: $I-P_X$ is the orthogonal projection onto $\mathscr{C}(X)^{\perp}$, i.e. if you take a vector $v$ from $\mathscr{C}(X)$ (which is of the form $Xu$), then $(I-P_X)v=0$. So the non-centrality parameter $\frac{1}{2}\mu'(I-P_X)\mu=\frac{1}{2}(X\beta)'(I-P_X)X\beta=0$.]
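(A short numerical sketch of these two facts, not from the original answer: it checks that $(I-P_X)X\beta=0$ and that the simulated mean of $SSE$ is about $(n-p)\sigma^2$; the values of $n$, $p$, $\sigma$ are arbitrary illustrations.)

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma = 10, 3, 2.0
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
P_X = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - P_X                        # projection onto the orthogonal complement of C(X)

print(np.allclose(M @ (X @ beta), 0))      # non-centrality term (I - P_X) X beta vanishes

# SSE / sigma^2 ~ chi^2_{n-p}  implies  E[SSE] = (n - p) * sigma^2
Y = (X @ beta)[:, None] + sigma * rng.standard_normal((n, 200_000))
SSE = np.einsum('ik,ij,jk->k', Y, M, Y)
print(SSE.mean(), (n - p) * sigma**2)      # approximately equal
```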

For SSR the matrix involved, $P_X-\frac{J}{n}$ (where $J=\mathbb{1}\mathbb{1}'$ is the all-ones matrix), is not necessarily a projection matrix! But if you have an intercept term in your regression then it is, i.e. the model looks like $y_i=\beta_1+\sum_j x_{ij}\beta_j$, or in other words $X$ has $\mathbb{1}$ as one of its columns, i.e. $\mathbb{1}\in \mathscr{C}(X)$. If so, $$P_X\mathbb{1}=\mathbb{1}\implies P_X\frac{J}{n}=\frac{J}{n}P_X=\frac{J}{n}\implies \left(P_X-\frac{J}{n}\right)\left(P_X-\frac{J}{n}\right)=P_X-\frac{J}{n}-\frac{J}{n}+\frac{J}{n}=P_X-\frac{J}{n},$$ i.e. idempotent! Thus we can apply Theorem 4 (a numerical check is sketched below).
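(A minimal sketch of the intercept case, not part of the original answer: the first column of the illustrative design matrix is $\mathbb{1}$, and the code checks that $P_X\mathbb{1}=\mathbb{1}$ and that $P_X-J/n$ is idempotent with rank $p-1$.)

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 10, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])  # first column = intercept
P_X = X @ np.linalg.inv(X.T @ X) @ X.T
A = P_X - np.ones((n, n)) / n              # the SSR matrix  P_X - J/n

print(np.allclose(P_X @ np.ones(n), np.ones(n)))   # 1 is in C(X), so P_X 1 = 1
print(np.allclose(A, A @ A))                       # idempotent -> True
print(np.linalg.matrix_rank(A))                    # p - 1, the df of SSR
```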

For independence of SSE and SSR, use the fact that $(I-P_X)(P_X-\frac{J}{n})=P_X-\frac{J}{n}-P_X+\frac{J}{n}=0$ (again using $P_X\frac{J}{n}=\frac{J}{n}$). Consequently $\operatorname{cov}\big((I-P_X)Y,(P_X-\frac{J}{n})Y\big)=\sigma^2(I-P_X)(P_X-\frac{J}{n})=0$, so these two jointly normal vectors are independent by the Fact above, and hence $SSE=\|(I-P_X)Y\|^2$ and $SSR=\|(P_X-\frac{J}{n})Y\|^2$ are independent.
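(One last sketch, not from the original answer: it checks numerically that the product of the two matrices is zero and that simulated SSE and SSR from the same response vector are empirically uncorrelated; all parameter values are arbitrary illustrations.)

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, sigma = 10, 3, 2.0
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta = rng.standard_normal(p)
P_X = X @ np.linalg.inv(X.T @ X) @ X.T
J_n = np.ones((n, n)) / n

print(np.allclose((np.eye(n) - P_X) @ (P_X - J_n), 0))   # the product is the zero matrix

# Monte-Carlo sanity check: SSE and SSR from the same Y are (empirically) uncorrelated
Y = (X @ beta)[:, None] + sigma * rng.standard_normal((n, 100_000))
SSE = np.einsum('ik,ij,jk->k', Y, np.eye(n) - P_X, Y)
SSR = np.einsum('ik,ij,jk->k', Y, P_X - J_n, Y)
print(np.corrcoef(SSE, SSR)[0, 1])                        # close to 0
```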