Decompose projection matrix into a matrix and its pseudoinverse

Tags: matrices, matrix-decomposition, projection-matrices

While researching regression, I ran into the problem of reconstructing a data matrix $X \in \mathbb{R}^{n\times p}$ from the $n \times n$ hat matrix $H = X(X'X)^{-1}X' = XX^{+}$ (where $A^{+}$ denotes the Moore-Penrose pseudoinverse of $A$).

Note that I only have access to the hat matrix, not the original data $X$.

So I have only $H$, but I do know the dimensions ($n$ and $p$) of $X$, and I know that $X$ has full column rank (i.e., $n > p$ and $\text{rk}(X) = p$). I also know that $H$ is symmetric, idempotent, and positive semidefinite (it is a projection matrix).

Is there a way to recover a factorization $H = XX^{+}$ in this context? Any approximate solution for which I could bound the approximation error would also be very much appreciated.

Some steps I already went through:
$$H = XX^+$$
$$HX = X$$
$$(I_n - H)X = 0$$

and from there I realised that the columns of $X$ must lie in the nullspace of $(I_n - H)$, which is exactly the range of $H$.
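The identities above are easy to verify numerically. A minimal NumPy sketch (the dimensions and random data are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3
X = rng.standard_normal((n, p))     # full column rank with probability 1

# Hat matrix H = X (X'X)^{-1} X' = X X^+
H = X @ np.linalg.pinv(X)

# H X = X, hence (I_n - H) X = 0
assert np.allclose(H @ X, X)
assert np.allclose((np.eye(n) - H) @ X, 0)
```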

Best Answer

Since $H$ is an orthogonal projection (symmetric and idempotent), its eigenvalues are 1 (with multiplicity $p$) and 0 (with multiplicity $q = n-p$), and the corresponding orthogonal eigenvectors span the range and nullspace of $H$, respectively. If $$ H=VDV^T, \quad D=\mathrm{diag}(I_p, 0_q), \quad V = [V_p,V_q], $$ is the eigendecomposition of $H$, then $$ H=V_pV_p^T=V_pV_p^+. $$ You can set $X:=V_p$.

Note that the choice of $X$ is not unique and any other choice is of the form $V_pM_p$, where $M_p$ is a $p\times p$ nonsingular matrix. On the other hand, $V_p$ has orthonormal columns and is hence numerically appealing.
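As a concrete sketch of this recipe (assuming NumPy; the dimensions and random data are illustrative), `numpy.linalg.eigh` returns eigenvalues in ascending order, so $V_p$ consists of the last $p$ eigenvector columns:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 3
X = rng.standard_normal((n, p))
H = X @ np.linalg.pinv(X)           # orthogonal projector onto range(X)

# eigh sorts eigenvalues ascending: the p largest (all ~ 1) come last
w, V = np.linalg.eigh(H)
V_p = V[:, -p:]                     # eigenvectors for eigenvalues ~ 1

# V_p V_p^T reproduces H, so X := V_p is a valid reconstruction
assert np.allclose(V_p @ V_p.T, H)
# columns of V_p lie in range(H): H V_p = V_p
assert np.allclose(H @ V_p, V_p)
```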

EDIT

Since we are interested in computing the range of $H$, any method useful to find a linearly independent set of vectors could be used. My favorite one is the QR factorization with column pivoting. It results in an orthogonal matrix $Q$, upper triangular $R$, and permutation matrix $\Pi$ such that $$ H\Pi=QR=[Q_p,Q_q]\begin{bmatrix}R_{11}&R_{12}\\0&R_{22}\end{bmatrix}, $$ where the $p\times p$ block $R_{11}$ is nonsingular and the $q\times q$ block $R_{22}$ is equal to zero. The $n\times p$ matrix $Q_p$ then holds an orthonormal basis of the range of $H$.

In practice, $R_{22}$ won't be exactly zero and a reasonable "tolerance" has to be chosen to determine when the trailing block of $R$ is "zero enough".
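With SciPy's pivoted QR (`scipy.linalg.qr` with `pivoting=True`), this looks roughly like the following; the tolerance used to decide when $R_{22}$ is "zero enough" is just one reasonable choice, not a prescription:

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(2)
n, p = 8, 3
X = rng.standard_normal((n, p))
H = X @ np.linalg.pinv(X)

# QR with column pivoting: H @ Pi = Q @ R
Q, R, piv = qr(H, pivoting=True)

# numerical rank: count diagonal entries of R above a tolerance
tol = n * np.finfo(H.dtype).eps * abs(R[0, 0])
rank = int(np.sum(np.abs(np.diag(R)) > tol))
assert rank == p

Q_p = Q[:, :rank]                   # orthonormal basis of range(H)
assert np.allclose(Q_p @ Q_p.T, H)  # same projector, so X := Q_p works
```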

The QR factorization should be faster than the eigendecomposition, although in terms of operations they are both, in general, $O(n^3)$ algorithms. Some savings can be had by noting that we only need the matrix $Q_p$. One could write a custom implementation of the QR factorization with column pivoting (though I'm not aware of a ready-made one) that stops as soon as the magnitude of the corresponding diagonal element of $R$ at a given step falls below a certain tolerance (usually a "reasonable" multiple of the machine precision). This reduces the cost to roughly $O(n^2p)$.
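A rough sketch of such an early-stopping pivoted Householder QR might look as follows. This is my own illustrative implementation, not library code; the helper name `range_basis` and the default tolerance are assumptions:

```python
import numpy as np

def range_basis(H, tol=None):
    """Orthonormal basis of range(H) via Householder QR with column
    pivoting, stopping once the largest remaining column norm falls
    below tol.  Cost is roughly O(n^2 * rank(H))."""
    A = np.array(H, dtype=float)            # work on a copy
    n = A.shape[0]
    if tol is None:
        tol = n * np.finfo(float).eps * np.linalg.norm(A)  # Frobenius-based
    reflectors = []
    k = 0
    while k < n:
        # column pivoting: pick the remaining column with largest norm
        norms = np.linalg.norm(A[k:, k:], axis=0)
        j = int(np.argmax(norms))
        if norms[j] <= tol:                 # trailing block is "zero enough"
            break
        A[:, [k, k + j]] = A[:, [k + j, k]]
        # Householder reflector zeroing A[k+1:, k]
        x = A[k:, k]
        alpha = -np.linalg.norm(x)
        if x[0] < 0:
            alpha = -alpha
        v = x.copy()
        v[0] -= alpha
        v /= np.linalg.norm(v)
        A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])
        reflectors.append(v)
        k += 1
    # accumulate Q_p by applying the reflectors to the first k identity columns
    Qp = np.eye(n, k)
    for i in range(k - 1, -1, -1):
        v = reflectors[i]
        Qp[i:, :] -= 2.0 * np.outer(v, v @ Qp[i:, :])
    return Qp
```

The loop exits after $\mathrm{rank}(H)$ steps, so only that many reflectors are ever formed, which is where the $O(n^2p)$ saving comes from.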

You can find the description of the algorithm here.
