I want to prove that for symmetric, idempotent matrices $H_1$ and $H_0$ (these are "hat matrices" of a linear regression model), $(H_1-H_0)=(H_1-H_0)^2$, in order to show a property of a distribution. So far, I've only found that the square equals $H_1-2H_0H_1+H_0$, where I have used the symmetry to change the order of matrix multiplication. But, would this not mean that I need to have $H_0-2H_0H_1=-H_0\iff H_0=H_0H_1\iff H_1=I$, which means that the matrix is, in fact, not idempotent, since $H_1$ is not necessarily equal to $I$? Thank you for your time.
Proving that $H_1-H_0$ is idempotent
idempotents · linear-algebra · matrices · statistics
Related Solutions
I believe you’re asking for the intuition behind those three properties of the hat matrix, so I’ll try to rely on intuition alone, using as little math and as few higher-level linear algebra concepts as possible.
Preliminaries
Start with the fact that the projection matrix $P$ allows you to obtain the orthogonal projection of an arbitrary vector onto the column space of X. Let’s use $v_p$ for the orthogonal projection of $v$: $$ P v = v_p $$ You can use $P$ to decompose any vector $v$ into two components that are orthogonal to each other. Think of $v_n$ as what is "left over" after the rest of $v$ is projected onto the column space of X, so it is orthogonal to the column space of X (and any vector in the column space of X). $$ v = v_p + v_n $$ $$ v_p \perp v_n $$
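If it helps to see this decomposition numerically, here is a small NumPy sketch. The design matrix `X`, the seed, and the vector `v` are all made up for illustration:

```python
import numpy as np

# Made-up 6x2 design matrix X and its projection matrix
# P = X (X^T X)^{-1} X^T onto the column space of X.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))
P = X @ np.linalg.solve(X.T @ X, X.T)

v = rng.standard_normal(6)
v_p = P @ v      # component of v in the column space of X
v_n = v - v_p    # the "left over" component

assert np.allclose(v, v_p + v_n)    # v = v_p + v_n
assert abs(v_p @ v_n) < 1e-10       # v_p is orthogonal to v_n
assert np.allclose(X.T @ v_n, 0)    # v_n is orthogonal to every column of X
```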
1. Why does P * P = P?
Intuitively, projecting a vector onto a subspace twice in a row has the same effect as projecting it onto that subspace once. The second projection has no effect because the vector is already in the subspace from the first projection.
Less intuitive
If that isn’t intuitive, it may be easier to consider the equivalent question: why does $P * P v = P v$ for any arbitrary vector $v$?
Start by simplifying the left hand side: $$ P * (P v) = P v_p $$ since $P v = v_p$.
Next consider $ P v_p $, which (by definition of P) projects $v_p$ onto the column space of X. This has no effect since $v_p$ is already entirely in the column space of X. Therefore
$$
P v_p = v_p
$$
Since $v_p = P v$, we conclude:
$$
P v_p = P v
$$
Chaining all these equations together gives:
$$
P * P v = P v
$$
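The idempotency argument above can be checked numerically; the design matrix and vector below are made up for illustration:

```python
import numpy as np

# Projection matrix for a made-up design matrix X.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 2))
P = X @ np.linalg.solve(X.T @ X, X.T)

assert np.allclose(P @ P, P)    # projecting twice = projecting once

# Equivalently, P(Pv) = Pv for an arbitrary vector v.
v = rng.standard_normal(5)
assert np.allclose(P @ (P @ v), P @ v)
```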
2. Why is P symmetric?
Intuitively, consider two arbitrary vectors $v$ and $w$. Take the dot product of one vector with the projection of the other vector. $$ (P v) \cdot w $$ $$ v \cdot (P w) $$
In both dot products, one term ($P v$ or $P w$) lies entirely in the ‘projected space’ (column space of X), so both dot products ignore everything that is not in the column space of X. This means both dot products are equal. Some simple dot product identities then imply that $P = P^T$, so $P$ is symmetric.
Less intuitive
If that isn't intuitive, we first prove that both dot products are equal. Decompose $v$ and $w$ as shown in the preliminaries above. $$ v = v_p + v_n $$ $$ w = w_p + w_n $$ The projection of a vector lies in a subspace. The dot product of anything in this subspace with anything orthogonal to this subspace is zero. We use this fact on the dot product of one vector with the projection of the other vector: $$ (P v) \cdot w \hspace{1cm} v \cdot (P w) $$ $$ v_p \cdot w \hspace{1cm} v \cdot w_p $$ $$ v_p \cdot (w_p + w_n) \hspace{1cm} (v_p + v_n) \cdot w_p $$ $$ v_p \cdot w_p + v_p \cdot w_n \hspace{1cm} v_p \cdot w_p + v_n \cdot w_p $$ $$ v_p \cdot w_p \hspace{1cm} v_p \cdot w_p $$ Therefore $$ (Pv) \cdot w = v \cdot (Pw) $$ Next, we can show that a consequence of this equality is that the projection matrix P must be symmetric. Here we begin by expressing the dot product in terms of transposes and matrix multiplication (using the identity $x \cdot y = x^T y$ ): $$ (P v) \cdot w = v \cdot (P w) $$ $$ (P v)^T w = v^T (P w) $$ $$ v^T P^T w = v^T P w $$ Since v and w can be any vectors, the above equality implies: $$ P^T = P $$
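A quick numeric check of both the dot-product equality and the symmetry it implies (design matrix and vectors made up for illustration):

```python
import numpy as np

# Projection matrix for a made-up design matrix X.
rng = np.random.default_rng(2)
X = rng.standard_normal((5, 2))
P = X @ np.linalg.solve(X.T @ X, X.T)

v = rng.standard_normal(5)
w = rng.standard_normal(5)

assert np.isclose((P @ v) @ w, v @ (P @ w))   # (Pv).w = v.(Pw)
assert np.allclose(P, P.T)                    # hence P is symmetric
```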
3. Why is P positive semidefinite?
By definition a matrix $P$ is positive semidefinite if and only if for every non-zero column vector $v$: $$ v^T P v \ge 0 $$ or equivalently: $$ v \cdot (P v) \ge 0 $$ Intuitively, a dot product projects one vector onto another vector and then scales by the length of the second vector. We want to show that this dot product is non-negative. In the equation immediately above, $v \cdot (P v)$ means "project $v$ onto $P v$ and scale by the length of $P v$". The first part, projecting $v$ onto $P v$, is equivalent to projecting $v$ onto $v_p$, since $P v = v_p$.
Projecting $v$ onto $v_p$ projects $v$ onto something that lies entirely in the column space of X, so this projection is just $v_p$. Scaling the length of $v_p$ by the length of $v_p$ then squares that length, and a squared length must be non-negative.
Less intuitive
If that isn't intuitive, the dot product can be simplified by decomposing $v$ into orthogonal components $$ v \cdot (P v) $$ $$ (v_p + v_n) \cdot (P v) $$ $$ (v_p + v_n) \cdot v_p $$ $$ v_p \cdot v_p + v_n \cdot v_p $$ Since $v_p$ and $v_n$ are orthogonal, the second term is zero and we have only $$ v_p \cdot v_p $$ The quantity immediately above is the length of the vector $v_p$ squared (i.e., $\|v_p\|_2^2$). This must be a non-negative value. $$ v_p \cdot v_p = \|v_p\|_2^2 \ge 0 $$
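The same chain of equalities can be verified numerically (design matrix and vector made up for illustration):

```python
import numpy as np

# Projection matrix for a made-up design matrix X.
rng = np.random.default_rng(3)
X = rng.standard_normal((5, 2))
P = X @ np.linalg.solve(X.T @ X, X.T)

v = rng.standard_normal(5)
v_p = P @ v

quad = v @ (P @ v)
assert np.isclose(quad, v_p @ v_p)   # v.(Pv) equals the squared length of v_p
assert quad >= 0                     # hence it is non-negative
```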
I get the answer now. That is because $H=H^2$ and it is symmetric, so we have $h_{ii}=\sum_{j=1}^nh_{ij}h_{ji}=\sum_{j=1}^nh_{ij}^2$.
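That diagonal identity is easy to confirm numerically for a hat matrix built from a made-up design matrix:

```python
import numpy as np

# Hat matrix H = X (X^T X)^{-1} X^T for a made-up design matrix X.
rng = np.random.default_rng(4)
X = rng.standard_normal((6, 2))
H = X @ np.linalg.solve(X.T @ X, X.T)

# h_ii = sum_j h_ij^2: each diagonal entry equals its row's sum of squares.
assert np.allclose(np.diag(H), (H ** 2).sum(axis=1))
```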
Best Answer
You need an additional hypothesis: The column space of $H_0$ is a subset of the column space of $H_1.$
If $H_0$ and $H_1$ are $n\times n$ symmetric idempotent matrices and the column space of $H_0$ is a subset of the column space of $H_1,$ then $H_0 H_1 = H_1 H_0 = H_0.$
If $x$ is in the column space of a symmetric idempotent real matrix $H,$ then $Hx=x,$ and if $x$ is orthogonal to the column space, then $Hx=0.$
If $x$ is any of the columns of $H_0$ and the aforementioned additional hypothesis holds, then $H_1 x = x.$ The columns of $H_1H_0$ are therefore just the columns of $H_0,$ so $H_1H_0= H_0.$ And since these matrices are symmetric, we also have $H_0 H_1=H_0.$
If $H_0$ had a right inverse matrix $A,$ then we could write: $$ \require{cancel} \xcancel{ \begin{align} H_1 H_0 & = H_0. \\[6pt] (H_1 H_0) A & = H_0 A = I. \\[6pt] H_1 (H_0A) & = I. \\[6pt] H_1 I & = I. \\[6pt] H_1 & = I. \end{align}} $$ But no matrix with the same number of columns as rows has a one-sided inverse unless it has a two-sided inverse, and these don't: an idempotent matrix other than $I$ is singular, since $H_0(I - H_0) = 0$ shows that $H_0$ annihilates the nonzero columns of $I - H_0.$
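Here is a numeric sketch of the whole answer, using made-up nested design matrices: the columns of `X0` are a subset of the columns of `X1`, so the column space of $H_0$ sits inside that of $H_1$.

```python
import numpy as np

# Made-up nested design matrices: col(X0) is a subspace of col(X1).
rng = np.random.default_rng(5)
X1 = rng.standard_normal((8, 3))
X0 = X1[:, :1]

H1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
H0 = X0 @ np.linalg.solve(X0.T @ X0, X0.T)

assert np.allclose(H1 @ H0, H0)   # H1 H0 = H0
assert np.allclose(H0 @ H1, H0)   # H0 H1 = H0, by symmetry

D = H1 - H0
assert np.allclose(D @ D, D)      # so H1 - H0 is idempotent

# And H0 is singular (rank 1, far below 8), so no right inverse A exists
# and the crossed-out derivation never gets off the ground.
assert np.linalg.matrix_rank(H0) == 1
```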