Why are principal component scores uncorrelated?

correlation, linear algebra, pca

Suppose $\mathbf A$ is a matrix of mean-centred data. The matrix $\mathbf S=\text{cov}(\mathbf A)$ is $m\times m$ and has $m$ distinct eigenvalues, with eigenvectors $\mathbf s_1, \mathbf s_2, \dots, \mathbf s_m$, which are orthogonal.

The $i$-th principal component (some people call them "scores") is the vector
$\mathbf z_i = \mathbf A\mathbf s_i$. In other words, it's a linear combination of the columns of $\mathbf A$, where the coefficients are the components of the $i$-th eigenvector of $\mathbf S$.

I don't understand why $\mathbf z_i$ and $\mathbf z_j$ turn out to be uncorrelated for all $i\neq j$. Does it follow from the fact that $\mathbf s_i$ and $\mathbf s_j$ are orthogonal? Surely not, because I can easily find a matrix $\mathbf B$ and a pair of orthogonal vectors $\mathbf x, \mathbf y$ such that $\mathbf B\mathbf x$ and $\mathbf B\mathbf y$ are correlated.
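To make that last claim concrete, here is a minimal sketch of such a counterexample. The matrix `B` and the vectors `x`, `y` are made-up illustrative values, not taken from any real data set: `x` and `y` are the standard basis vectors, so `B @ x` and `B @ y` are simply the two columns of `B`, which are chosen to be strongly related.

```python
import numpy as np

# Hypothetical data: the second column is roughly twice the first.
B = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, 5.9],
              [4.0, 8.2]])
B = B - B.mean(axis=0)  # mean-centre the columns

# Two orthogonal vectors (standard basis of R^2).
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

u = B @ x  # first column of B
v = B @ y  # second column of B

dot_xy = float(x @ y)                      # orthogonality of x and y
corr_uv = float(np.corrcoef(u, v)[0, 1])   # correlation of Bx and By

print("x . y        =", dot_xy)
print("corr(Bx, By) =", corr_uv)
```

So orthogonality of the coefficient vectors alone does not make the resulting linear combinations uncorrelated.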

Best Answer

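Orthogonality alone is not enough; what matters is that $\mathbf s_j$ is an eigenvector of $\mathbf S$, with eigenvalue $\lambda_j$. Writing $n$ for the number of rows of $\mathbf A$, so that $\mathbf A^\top \mathbf A = (n-1)\mathbf S$ for mean-centred $\mathbf A$:

$$\mathbf z_i^\top \mathbf z_j = (\mathbf A\mathbf s_i)^\top (\mathbf A\mathbf s_j) = \mathbf s_i^\top \mathbf A^\top \mathbf A \mathbf s_j = (n-1) \mathbf s_i^\top \mathbf S \mathbf s_j = (n-1) \mathbf s_i^\top \lambda_j \mathbf s_j = (n-1) \lambda_j \mathbf s_i^\top \mathbf s_j = 0.$$

The step that replaces $\mathbf S \mathbf s_j$ with $\lambda_j \mathbf s_j$ uses the eigenvector property, and the final equality uses $\mathbf s_i^\top \mathbf s_j = 0$. Because the columns of $\mathbf A$ are mean-centred, each $\mathbf z_i$ has zero mean, so a zero dot product means zero covariance: $\mathbf z_i$ and $\mathbf z_j$ are uncorrelated.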
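The identity above can also be checked numerically. The following is a minimal sketch using NumPy on randomly generated data; the sizes `n`, `m` and the random seed are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 4

# Hypothetical data with correlated columns, then mean-centred.
A = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))
A = A - A.mean(axis=0)

# Sample covariance matrix (np.cov divides by n - 1 by default).
S = np.cov(A, rowvar=False)

# eigh is for symmetric matrices; the columns of eigvecs are
# orthonormal eigenvectors s_1, ..., s_m of S.
eigvals, eigvecs = np.linalg.eigh(S)

# Scores: column i of Z is z_i = A s_i.
Z = A @ eigvecs

# Covariance matrix of the scores, Z^T Z / (n - 1).
score_cov = Z.T @ Z / (n - 1)

# Largest off-diagonal entry in absolute value (should be ~0).
off_diag = score_cov - np.diag(np.diag(score_cov))
max_off = float(np.abs(off_diag).max())

print("largest off-diagonal covariance:", max_off)
print("score variances:", np.diag(score_cov))
print("eigenvalues:    ", eigvals)
```

Up to floating-point error, the off-diagonal entries vanish and the diagonal entries (the variances of the scores) match the eigenvalues of $\mathbf S$, which is the case $i = j$ of the same calculation.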
