Difference of two orthogonal projections is orthogonal projection

linear-algebra projection-matrices pseudoinverse

Premise: I have an $n \times q$ matrix $X$ and a $q \times a$ matrix $C$ with $n > q > a$.

I'm interested in the structure of the matrix
$$
M = X X^+ - X_0 X_0^+
$$

where the superscript $^+$ indicates the Moore–Penrose pseudoinverse and
$$
X_0 = X (I_q - C C^+).
$$

I assume that $X$ is of full column rank and therefore $X^+ = (X' X)^{-1} X'$ (where $'$ indicates the transpose).
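As a quick sanity check (a minimal numpy sketch with arbitrary dimensions, not part of the argument), this identity can be compared against `numpy.linalg.pinv` on a random full-column-rank matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 10, 4
X = rng.standard_normal((n, q))  # full column rank with probability 1

# Closed-form pseudoinverse for full column rank: (X'X)^{-1} X'
pinv_formula = np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(pinv_formula, np.linalg.pinv(X)))  # True
```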


Background: $X$ is the design matrix of a linear model, $C$ is a contrast, $X_0$ is a reduced design matrix, and $M$ occurs in the definition of standard test statistics.

$M$ is the difference of two orthogonal projection matrices, where the second projects onto a subspace of the subspace the first projects onto. This makes the difference itself an orthogonal projection matrix (symmetric and idempotent), which means it has a representation
$$
M = X_\Delta X_\Delta^+.
$$
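For concreteness, here is a small numpy sketch (random $X$ and $C$ of my choosing) that checks symmetry and idempotency of $M$ numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, a = 10, 4, 2
X = rng.standard_normal((n, q))  # full column rank a.s.
C = rng.standard_normal((q, a))  # rank a a.s.
pinv = np.linalg.pinv

X0 = X @ (np.eye(q) - C @ pinv(C))
M = X @ pinv(X) - X0 @ pinv(X0)

print(np.allclose(M, M.T))    # symmetric
print(np.allclose(M @ M, M))  # idempotent
```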

Question: How do I obtain $X_\Delta$?


user1551 has correctly pointed out in an answer that $X_\Delta = M$ itself fulfills the equation. However, I'm looking for a "version" of $X$, meaning an $n \times q$ matrix of rank $a$.

My approach: I am guessing that
$$
X_\Delta = X - X_0 X_0^+ X,
$$

and this seems to be confirmed by numerical tests. But I am unable to come up with a proof, i.e. to show that
$$
(X - X_0 X_0^+ X) (X - X_0 X_0^+ X)^+ = X X^+ - X_0 X_0^+.
$$
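A sketch of the kind of numerical test I mean (numpy, random inputs; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, q, a = 12, 5, 2
X = rng.standard_normal((n, q))
C = rng.standard_normal((q, a))
pinv = np.linalg.pinv

X0 = X @ (np.eye(q) - C @ pinv(C))
M = X @ pinv(X) - X0 @ pinv(X0)
X_delta = X - X0 @ pinv(X0) @ X  # the conjectured X_Delta

print(np.allclose(X_delta @ pinv(X_delta), M))  # True
print(np.linalg.matrix_rank(X_delta))           # a (= 2 here)
```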

The problem is how to deal with the pseudoinverse of a difference. One can write
$$
X_\Delta = (I_n - X_0 X_0^+) X,
$$

and according to Wikipedia, when one factor of a product is an orthogonal projection, that projection can be redundantly multiplied onto the opposite side of the product's pseudoinverse, meaning here
$$
X_\Delta^+ = [(I_n - X_0 X_0^+) X]^+ = [(I_n - X_0 X_0^+) X]^+ (I_n - X_0 X_0^+) = X_\Delta^+ (I_n - X_0 X_0^+),
$$

but that doesn't seem to help.
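For what it's worth, the absorption property itself checks out numerically (a sketch with $Q = I_n - X_0 X_0^+$ as the orthogonal projection):

```python
import numpy as np

rng = np.random.default_rng(3)
n, q, a = 12, 5, 2
X = rng.standard_normal((n, q))
C = rng.standard_normal((q, a))
pinv = np.linalg.pinv

X0 = X @ (np.eye(q) - C @ pinv(C))
Q = np.eye(n) - X0 @ pinv(X0)     # orthogonal projector onto col(X0)-perp
lhs = pinv(Q @ X)
print(np.allclose(lhs, lhs @ Q))  # (QX)^+ = (QX)^+ Q
```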

I can prove that $M$ is symmetric and idempotent, using the relations
$$
X X^+ X_0 = X_0
\quad \text{and} \quad
X_0 X_0^+ X X^+ = X_0 X_0^+,
$$

which derive from the definition of $X_0$ and the properties of the pseudoinverse. I can also show that
$$
X X_0^+ = X_0 X_0^+
$$

using the property of the pseudoinverse of a product involving an orthogonal projection (see above). But none of that helps either.
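These three identities can also be confirmed numerically (again a sketch with random inputs):

```python
import numpy as np

rng = np.random.default_rng(4)
n, q, a = 12, 5, 2
X = rng.standard_normal((n, q))
C = rng.standard_normal((q, a))
pinv = np.linalg.pinv

X0 = X @ (np.eye(q) - C @ pinv(C))
PX, PX0 = X @ pinv(X), X0 @ pinv(X0)  # the two orthogonal projections

print(np.allclose(PX @ X0, X0))        # X X^+ X_0 = X_0
print(np.allclose(PX0 @ PX, PX0))      # X_0 X_0^+ X X^+ = X_0 X_0^+
print(np.allclose(X @ pinv(X0), PX0))  # X X_0^+ = X_0 X_0^+
```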

Best Answer

With your choice of $X_\Delta$, $M$ is indeed equal to $X_\Delta X_\Delta^+$.

Proof. Let $P=I-CC^+$. Note that the column space of $M=XX^+-(XP)(XP)^+$ is $\operatorname{col}(X)\cap\operatorname{col}(XP)^\perp$: since $\operatorname{col}(XP)\subseteq\operatorname{col}(X)$, subtracting the orthogonal projection onto $\operatorname{col}(XP)$ from the one onto $\operatorname{col}(X)$ leaves the orthogonal projection onto the complement of $\operatorname{col}(XP)$ within $\operatorname{col}(X)$. Meanwhile, the column space of $X_\Delta X_\Delta^+$ is precisely the column space of $X_\Delta=\left[I-(XP)(XP)^+\right]X$.

Since $X_\Delta=X\left[I-P(XP)^+X\right]$, $\operatorname{col}(X_\Delta)\subseteq\operatorname{col}(X)$. Also, since
$$
\begin{aligned}
(XP)^T X_\Delta
&=(X_\Delta^T XP)^T\\
&=\left[X^T\left(I-(XP)(XP)^+\right)XP\right]^T\\
&=\left[X^T\left(XP-(XP)(XP)^+(XP)\right)\right]^T\\
&=\left[X^T\left(XP-XP\right)\right]^T=0,
\end{aligned}
$$
we also have $\operatorname{col}(X_\Delta)\subseteq\operatorname{col}(XP)^\perp$. Thus $\operatorname{col}(X_\Delta)\subseteq\operatorname{col}(M)$.

We now show that the reverse inclusion is also true. Pick any $v\in\operatorname{col}(M)=\operatorname{col}(X)\cap\operatorname{col}(XP)^\perp$. Since $v\in\operatorname{col}(X)$, it can be written as $Xb$ for some vector $b$. Thus
$$
X_\Delta b=\left[I-(XP)(XP)^+\right]Xb=v-(XP)(XP)^+v.
$$
However, we also have $v\in\operatorname{col}(XP)^\perp$. Therefore $(XP)(XP)^+v=0$ and in turn $X_\Delta b=v$, meaning that $v\in\operatorname{col}(X_\Delta)$.

Thus $\operatorname{col}(X_\Delta X_\Delta^+)=\operatorname{col}(X_\Delta)=\operatorname{col}(M)$. Hence $X_\Delta X_\Delta^+$ and $M$ must be equal, because they are orthogonal projections with identical column spaces.
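For illustration only (not part of the proof), a numpy sketch of the two key facts used above, namely $(XP)^T X_\Delta = 0$ and the equality of the two orthogonal projections:

```python
import numpy as np

rng = np.random.default_rng(5)
n, q, a = 12, 5, 2
X = rng.standard_normal((n, q))
C = rng.standard_normal((q, a))
pinv = np.linalg.pinv

P = np.eye(q) - C @ pinv(C)
XP = X @ P
X_delta = (np.eye(n) - XP @ pinv(XP)) @ X

print(np.allclose(XP.T @ X_delta, 0))  # col(X_delta) is orthogonal to col(XP)
M = X @ pinv(X) - XP @ pinv(XP)
print(np.allclose(X_delta @ pinv(X_delta), M))  # identical projections
```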