Oblique projection onto an intersection along a sum of vector spaces.

linear-algebra linear-transformations projection-matrices

(Sorry for the long post, this problem is a head scratcher.)

Definition. Assume $\mathbf{R}^d = V \oplus W$; in other words, we assume $\mathbf{R}^d$ is the direct sum of two of its subspaces. By direct sum I mean the set of sums $v + w$ with $v \in V$ and $w \in W,$ under the further assumption that $V \cap W = \{0\}.$ By definition of direct sum, every $x \in \mathbf{R}^d$ is uniquely decomposable as $x = y + z$ with $y \in V$ and $z \in W.$ The function $x \mapsto y$ defined in this way is linear; we call it the "projector onto $V$ along $W$" and denote it by $P_{V \mid W}$ should the need arise.

Geometric interpretation. Suppose in $\mathbf{R}^2$ we have two linearly independent vectors $v$ and $w.$ Then $P_{v \mid w}(a v + b w) = a v,$ which is a vector parallel to $v.$ One way to think about this is that the sum $av + bw$ is $av$ emerging from $bw$ instead of from the origin; the projection moves this vector along the subspace $\langle w \rangle$ until it reaches $av$ emerging from the origin. In general, the projection $P_{V \mid W}x$ always lies in $V$ (but it need not be orthogonal to $W$!).
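To make the picture concrete, here is a minimal Python sketch that builds the matrix of $P_{v \mid w}$ in $\mathbf{R}^2$ by solving $x = av + bw$ with Cramer's rule. The names `oblique_projector_2d` and `apply`, and the sample vectors, are illustrative choices and not part of the argument.

```python
from fractions import Fraction

def oblique_projector_2d(v, w):
    """Return the 2x2 matrix of the projector onto span(v) along span(w).

    Assumes v and w are linearly independent vectors of R^2.
    """
    v = [Fraction(t) for t in v]
    w = [Fraction(t) for t in w]
    det = v[0] * w[1] - v[1] * w[0]
    if det == 0:
        raise ValueError("v and w must be linearly independent")
    # For x = a*v + b*w, Cramer's rule gives a = (x0*w1 - x1*w0) / det.
    # The projector sends x to a*v, so row i is v[i] * (w1, -w0) / det.
    return [[v[i] * w[1] / det, -v[i] * w[0] / det] for i in range(2)]

def apply(m, x):
    """Matrix-vector product for nested lists."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

v, w = (1, 0), (1, 1)
P = oblique_projector_2d(v, w)
x = [3 * v[0] + 5 * w[0], 3 * v[1] + 5 * w[1]]  # x = 3v + 5w
print(apply(P, x))  # the component 3v
```

With these sample vectors the matrix is not symmetric, which reflects the fact that the projection is oblique rather than orthogonal.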

I add the following as I suspect this is the definition most people are used to:

Basic characterisation. For a linear function $\mathbf{R}^d \to \mathbf{R}^d$ to be a projector (onto some subspace along some other) it is necessary and sufficient that it be idempotent. When this is the case, the linear function is the projector onto its image along its kernel; a fortiori, the image of a projector is the space it is projecting onto and its kernel is the space it is projecting along. In symbols: the image of $P_{V \mid W}$ is $V$ and its kernel is $W.$
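The reason idempotence suffices is the splitting $x = Tx + (x - Tx)$: the first term lies in the image and the second in the kernel. A small Python sketch of this, with an idempotent matrix `T` chosen purely for illustration:

```python
def matmul(a, b):
    """Matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, x):
    """Matrix-vector product for nested lists."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

# Illustrative idempotent matrix: the projector onto span((1, 0)) along span((1, 1)).
T = [[1, -1],
     [0, 0]]
assert matmul(T, T) == T  # idempotent

for x in ([8, 5], [-2, 7], [0, 1]):
    y = apply(T, x)                      # part of x in the image
    z = [x[i] - y[i] for i in range(2)]  # remainder x - Tx
    assert apply(T, y) == y              # T is the identity on its image
    assert apply(T, z) == [0, 0]         # T vanishes on x - Tx
```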

Tautological characterisation. Suppose $\mathbf{R}^d = V \oplus W$ and $T:\mathbf{R}^d \to \mathbf{R}^d$ is a linear function satisfying $T(x) = x$ for all $x \in V$ and $T(x) = 0$ for all $x \in W.$ Then, $T = P_{V \mid W}.$

Some discussion. The following "theorem" appears as stated (i.e. as an if and only if) in a book on statistics ("The Foundations of Multivariate Analysis. A Unified Approach by Means of Projections Onto Linear Subspaces" by Takeuchi et al). However, the authors only prove the sufficiency of the theorem without giving any comment on the necessity. Later, when dealing with orthogonal projections (i.e. when $W = V^\perp$), they restate the theorem, comment that the sufficiency was proven in general and finally prove the necessity for the orthogonal case. This suggests to me that the necessity of the following theorem may be false. However, every example I have constructed does satisfy the necessity.

My question is how to prove the necessity of the following theorem in the generality stated. Below I give proofs under additional hypotheses.

Theorem. Suppose $\mathbf{R}^d = V_1 \oplus W_1 = V_2 \oplus W_2 = (V_1 \cap V_2) \oplus (W_1 + W_2).$ Let $P_i$ denote the projector onto $V_i$ along $W_i.$ Let $P = P_{V_1 \cap V_2 \mid W_1 + W_2}$ denote the projector onto $V_1 \cap V_2$ along $W_1 + W_2.$ A necessary and sufficient condition for $P = P_1P_2$ is that $P_1P_2 = P_2P_1.$

Proof of sufficiency. Since $P_1P_2 = P_2P_1,$ we see that
$$
(P_1P_2)^2 = P_1P_2P_1P_2 = P_1^2P_2^2 = P_1P_2,
$$
so $P_1P_2$ is the projector onto its image along its kernel. Clearly, the image of $P_1P_2$ is a subset of $V_1 \cap V_2$ (since $V_i$ is the image of $P_i$ and $P_1P_2 = P_2P_1,$ we have $\mathsf{im}(P_1P_2) \subset V_i$ for $i=1,2$). If now $x \in V_1 \cap V_2,$ then $P_1(x) = P_2(x) = x$ (since $P_i$ is the identity on $V_i$), which shows $P_1P_2(x) = P_1(x) = x,$ so $x$ belongs to the image of $P_1P_2.$ Therefore, the image of $P_1P_2$ is $V_1 \cap V_2.$ Now we will show that the kernel of $P_1P_2$ is $W_1 + W_2$; this will finish the sufficiency. In fact, we already know $W_i$ is the kernel of $P_i,$ and by the assumption $P_1P_2 = P_2P_1,$ we quickly reach $W_1 + W_2 \subset \mathsf{ker}(P_1P_2).$ Conversely, suppose $P_1P_2(x) = 0$ and decompose $x = y_2 + z_2$ with $y_2 \in V_2$ and $z_2 \in W_2.$ Then $P_1P_2(x) = P_1(y_2) = 0,$ which means $y_2 \in W_1,$ so $x \in W_1 + W_2.$

Idea of proof of necessity. We are assuming that $P = P_1P_2$ is the projector onto $V_1 \cap V_2$ along $W_1 + W_2.$ We decompose
$$
\begin{array}{rccl}
x &= &y + z &\in (V_1 \cap V_2) \oplus (W_1 + W_2) \\
&= &y_1 + z_1 &\in V_1 \oplus W_1 \\
&= &y_2 + z_2 &\in V_2 \oplus W_2,
\end{array}
$$

the $y$s contained in the $V$s and the $z$s in the $W$s in the obvious manner. The assumptions give
$$
\begin{array}{lll}
P_1x = P_1y_1 = y_1, &\qquad P_1z_1 = 0, &\qquad P_1y = y, \\
P_2x = P_2y_2 = y_2, &\qquad P_2z_2 = 0, &\qquad P_2y = y,
\end{array}
$$

and
$$
\begin{array}{l}
P_1P_2 x = P_1P_2 y = y, \\
P_1P_2 x = P_1P_2 y_2 = P_1 y_2 = y, \\
P_1P_2 x = P_1P_2 (y_1 + z_1) = P_1P_2 y_1 = y.
\end{array}
$$

By the tautological characterisation above, we just need to show $P_2P_1 y = y,$ which is easy given the equalities above, and $P_2P_1 z = 0,$ which I do not know how to do. Of course, one should use that we are assuming $P_1P_2$ is a projector, which makes it idempotent, namely $P_1P_2 = P_1P_2P_1P_2.$ In fact, it suffices to show that $P_2P_1 w_2 = 0$ for all $w_2 \in W_2.$

This is where I do not know how to continue: how does one prove that $P_2 P_1 z = 0$? Of course, the real interest is in proving the necessity of the theorem, perhaps by an approach entirely different from this one.

Proof of necessity if $W_i = V_i^\perp.$ We first prove every orthogonal projector is symmetric. Let $T$ be any orthogonal projector, in other words, let $T$ be a projector onto a subspace $L$ along the orthocomplement $L^\perp.$ Let $S = I - T,$ which is the projector onto $L^\perp$ along $L.$ Orthogonality implies
$$
0 = T^\intercal S = T^\intercal - T^\intercal T.
$$

Thus $T^\intercal = T^\intercal T,$ which is symmetric, so by taking transposes we see $T = T^\intercal.$ Applying this to the theorem above, we see each $P_i$ is symmetric, and since $(V_1 \cap V_2)^\perp = V_1^\perp + V_2^\perp,$ the projector $P = P_{V_1 \cap V_2 \mid W_1 + W_2}$ is also an orthogonal projector. Therefore, $P$ is symmetric, but then $P_1P_2 = P = P^\intercal = P_2^\intercal P_1^\intercal = P_2P_1,$ as desired. QED

Proof of necessity if $W_1 \subset V_2$ and $W_2 \subset V_1.$ Note that under these conditions, we actually have $W_1 \cap W_2 = \{0\}$ since $W_1 \cap W_2 \subset W_1 \cap V_1 = \{0\}.$ By the assumption $W_1 \subset V_2,$ we have that $P_2w_1 = w_1$ for all $w_1 \in W_1,$ similarly $P_1w_2 = w_2$ for all $w_2 \in W_2.$ Write now
$$
x = y + w_1 + w_2, \qquad y \in V_1 \cap V_2, w_i \in W_i.
$$

Then, $P_1P_2x = P_1(y+w_1) = y$ and $P_2P_1x= P_2(y+w_2) = y.$ So $P_1P_2 =P_2P_1$ and we are done.

Best Answer

I found a solution. The theorem is false as stated.

Counterexample. Let $c_1, c_2, c_3$ denote the canonical vectors of $\mathbf{R}^3$ and denote $$ \begin{array}{rclrcl} V_1 &= &\langle c_1, c_2 + c_3 \rangle, &W_1 &= &\langle c_2 \rangle, \\ V_2 &= &\langle c_1, c_2 \rangle, &W_2 &= &\langle c_3 \rangle. \end{array} $$ Then, $$ V_1 \cap V_2 = \langle c_1 \rangle, \qquad W_1 + W_2 = \langle c_2, c_3 \rangle. $$ Notice $$ \left[\begin{matrix} x \\ y \\z \end{matrix}\right] = x c_1 + z (c_2 + c_3) + (y-z) c_2 = x c_1 + y c_2 + z c_3, $$ therefore $$ P_1\left[\begin{matrix} x \\ y \\z \end{matrix}\right] = x c_1 + z (c_2 + c_3) = \left[\begin{matrix} x \\ z \\z \end{matrix}\right], \qquad P_2\left[\begin{matrix} x \\ y \\z \end{matrix}\right] = x c_1 + y c_2 = \left[\begin{matrix} x \\ y \\0 \end{matrix}\right] $$ and $$ P\left[\begin{matrix} x \\ y \\z \end{matrix}\right] = x c_1 = \left[\begin{matrix} x \\ 0 \\0 \end{matrix}\right]. $$ Then, $$ P_1P_2\left[\begin{matrix} x \\ y \\z \end{matrix}\right] = \left[\begin{matrix} x \\ 0 \\0 \end{matrix}\right] = P\left[\begin{matrix} x \\ y \\z \end{matrix}\right], $$ but $$ P_2P_1\left[\begin{matrix} x \\ y \\z \end{matrix}\right] = \left[\begin{matrix} x \\ z \\0 \end{matrix}\right]. $$
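The counterexample can be checked mechanically. The following Python sketch writes $P_1,$ $P_2$ and $P$ as matrices in the canonical basis, read off from the formulas above; the helper `matmul` is only an illustrative name.

```python
def matmul(a, b):
    """Matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

P1 = [[1, 0, 0],
      [0, 0, 1],
      [0, 0, 1]]  # (x, y, z) -> (x, z, z)
P2 = [[1, 0, 0],
      [0, 1, 0],
      [0, 0, 0]]  # (x, y, z) -> (x, y, 0)
P = [[1, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]   # (x, y, z) -> (x, 0, 0)

# Each of the three maps is idempotent, hence a projector.
for M in (P1, P2, P):
    assert matmul(M, M) == M

P1P2 = matmul(P1, P2)
P2P1 = matmul(P2, P1)
print(P1P2 == P)     # True
print(P2P1 == P1P2)  # False

# In this example P2 P1 is not even idempotent, so it is not a projector at all.
assert matmul(P2P1, P2P1) != P2P1
```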

What is going on? Geometrically, every projector $P_{V \mid W}$ is diagonalisable, since it acts as the identity on $V$ and as zero on $W.$ A well-known necessary and sufficient condition for two diagonalisable linear functions to commute is that they be simultaneously diagonalisable. This means that for $P_1$ and $P_2$ to commute, there must be a basis of $\mathbf{R}^d$ made of vectors each lying in $V_1$ or $W_1$ and also in $V_2$ or $W_2,$ so that both projectors act on each basis vector as the identity or as zero. The vector spaces $V_i$ and $W_i$ can then be constructed so that the "right cancellations" occur for $P_1P_2$ to give $P$ but not for $P_2P_1.$ In fact, these considerations gave me a hint to prove a true characterisation.

Theorem. Suppose $\mathbf{R}^d = V_1 \oplus W_1 = V_2 \oplus W_2 =(V_1 \cap V_2) \oplus (W_1 + W_2).$ Let $P_i$ denote the projector onto $V_i$ along $W_i.$ Let $P=P_{V_1 \cap V_2 \mid W_1 + W_2}$ denote the projector onto $V_1 \cap V_2$ along $W_1 + W_2.$ A necessary and sufficient condition for $P = P_1 P_2$ is that $V_2 = (V_2 \cap V_1) \oplus (V_2 \cap W_1).$

Proof. Suppose first $P = P_1 P_2.$ For $x \in V_2$ we have $x = v_1 + w_1$ with $v_1 \in V_1$ and $w_1 \in W_1.$ Since $v_1 = P_1 x = P_1 P_2 x = P x \in V_1 \cap V_2,$ we have $w_1 = x - v_1 \in V_2,$ so $w_1 \in V_2 \cap W_1.$ The sum is direct because $V_1 \cap W_1 = \{0\},$ therefore $V_2 = (V_2 \cap V_1) \oplus (V_2 \cap W_1).$ Conversely, assume that $V_2$ has the stated decomposition, so $x\in V_2$ can be uniquely written as $x = v + w$ with $v \in V_2 \cap V_1$ and $w \in V_2 \cap W_1.$ This is therefore the unique decomposition of $x \in V_1 \oplus W_1,$ and we see $P_1P_2x = P_1x = v = Px$ (true for all $x \in V_2$). For $x \in W_2,$ we clearly have $P_1P_2 x = P_1 0 = 0$ and $Px = 0.$ Since $\mathbf{R}^d = V_2 \oplus W_2,$ it follows that $P_1P_2 = P.$ QED

With the previous theorem, it is now obvious that $P = P_1 P_2 = P_2 P_1$ if and only if $V_1 = (V_1 \cap V_2) \oplus (V_1 \cap W_2)$ and $V_2 = (V_2 \cap V_1) \oplus (V_2 \cap W_1).$ In the counterexample above, $V_2 = (V_2 \cap V_1) \oplus (V_2 \cap W_1)$ but $V_1 \neq (V_1 \cap V_2) \oplus (V_1 \cap W_2).$
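The splitting condition can also be tested by counting dimensions. Because $V_i \cap W_i = \{0\},$ the two intersections always meet trivially, so the condition holds exactly when their dimensions add up to the dimension of the space being split. Below is a rough Python sketch under that observation; the helpers `rank`, `dim_intersection` and `splits` are hypothetical names, and subspaces are given by spanning lists.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(t) for t in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def dim_intersection(A, B):
    """dim of the intersection = dim A + dim B - dim(A + B)."""
    return rank(A) + rank(B) - rank(A + B)

def splits(V, V_other, W_other):
    """Check V = (V cap V_other) (+) (V cap W_other).

    Assumes V_other and W_other intersect only in 0.
    """
    return dim_intersection(V, V_other) + dim_intersection(V, W_other) == rank(V)

# The subspaces of the counterexample, as spanning lists.
c1, c2, c3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
V1, W1 = [c1, [0, 1, 1]], [c2]
V2, W2 = [c1, c2], [c3]

print(splits(V2, V1, W1))  # True:  P = P1 P2
print(splits(V1, V2, W2))  # False: P != P2 P1
```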
