Solved – What exactly should be called “projection matrix” in the context of PCA

linear algebra, pca, terminology

At the end of the PCA algorithm one gets a $D\times d$ matrix $U$ such that $z=U^Tx$ (here $x$ is $D$-dimensional and $z$ is $d$-dimensional, with $d\leq D$). In multiple sources on the Web I found that $U$ or $U^T$ is named "projection matrix", but according to Wikipedia it cannot be one, since a projection matrix must be square and $U$ usually is not (usually $d<D$). The true projection matrix is $UU^T$, which is also named "projection matrix" across the Web.

Maybe it is correct to say that $U$ or $U^T$ is the projection matrix and $UU^T$ is the orthogonal projection matrix?

Is there no clear definition, or am I missing something? Are there more appropriate names for these matrices, so that they are not confused with each other?

Best Answer

$U U^T$ is the projection operator: it is a square $D\times D$ matrix, it is symmetric, and it is idempotent, because the columns of $U$ are orthonormal in PCA, so $U^T U = I_d$ and $(U U^T)(U U^T) = U U^T$. I believe calling $z = U^T x$ a projection is an abuse of terminology. It is actually a coordinate transformation, where each entry of $z$ is the scalar projection of $x$ onto one of the principal vectors. $U U^T x$ is the orthogonal projection of $x$ onto the subspace spanned by the principal vectors, expressed in the original $D$-dimensional coordinates, and $U^T x$ is that same projected point expressed in the new $d$-dimensional coordinates (the principal components).
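To make the distinction concrete, here is a small NumPy sketch. The random toy data, the sizes `D = 5` and `d = 2`, and all variable names are assumptions chosen for illustration only. It builds $U$ from the eigenvectors of the sample covariance matrix and checks numerically that $UU^T$ behaves like a projection matrix, while $U^Tx$ only supplies coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 5, 2  # illustrative sizes, not taken from the question

# Toy data: 200 correlated samples in D dimensions, centered as PCA assumes.
X = rng.normal(size=(200, D)) @ rng.normal(size=(D, D))
X -= X.mean(axis=0)

# Principal vectors: eigenvectors of the sample covariance matrix.
cov = X.T @ X / (X.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
U = eigvecs[:, ::-1][:, :d]              # D x d, top-d principal vectors

x = X[0]
z = U.T @ x        # d coordinates of x in the principal basis
P = U @ U.T        # D x D candidate projection matrix
x_proj = P @ x     # projected point, still in the original D coordinates

print(np.allclose(U.T @ U, np.eye(d)))   # columns of U are orthonormal
print(np.allclose(P @ P, P))             # P is idempotent
print(np.allclose(P, P.T))               # P is symmetric
print(np.allclose(U @ z, x_proj))        # U maps z back to the projected point
print(np.allclose(U.T @ x_proj, z))      # projecting does not change z
```

The last two checks show that $z$ and $UU^Tx$ describe the same point in two coordinate systems: `U @ z` rebuilds the projected point in the original space, and re-expressing that point in the principal basis returns `z` unchanged.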