[Math] the principal components matrix in PCA with SVD

linear algebra, matrices, principal component analysis

Doing PCA on a matrix using SVD yields a factorization into three matrices:

$$
M = U \Sigma V^T
$$

where $M$ is our initial data with zero mean.

If we want to plot the first two principal components, we project the data onto the principal component space:

$$
Z = M * V
$$

and then use the first two columns of $Z$ for our plot. Maybe I have already answered my own question, but I am struggling to understand whether $Z$ is what would be called the principal component matrix, and if not, how do we find it?
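The projection step described above can be sketched in NumPy as follows. This is only a minimal illustration: the toy data `X`, the random seed, and all variable names are assumptions made for the example.

```python
import numpy as np

# Toy data (an assumption for illustration): 6 samples, 3 attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

# Center each column so that M has zero mean.
M = X - X.mean(axis=0)

# Thin SVD: M = U @ diag(s) @ Vt, singular values sorted in descending order.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
V = Vt.T

# Project the data onto all principal directions (Z = M V),
# then keep the first two columns for a 2-D plot.
Z = M @ V
Z2 = Z[:, :2]
```

Here `Z2[:, 0]` and `Z2[:, 1]` would be the x and y coordinates of each sample in the plot.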

Also, I am not sure what the operation $M*V$ does to the data. As I understand it, $V$ is an expression of the general trends of each of the attributes in the data set. By calculating the dot product between our data $M$ and the trends $V$ of the data, we end up with a matrix (PC matrix?) that captures the original data in a structured manner which allows for dimensionality reduction.

Are my assumptions correct, or have I misread the theory?

Best Answer

OK, I'll give it a shot. Let me start from the top and try to recap PCA, and then show the connection to SVD.

Recall: for PCA, we begin with a centered data matrix $M$ of dimension $n \times d$. For this data, we compute the sample covariance $S = \frac{1}{n-1}M^TM$, where $n$ is the number of data points.

For this covariance matrix we find the eigenvectors and eigenvalues. We then select the $l$ eigenvectors corresponding to the $l$ largest eigenvalues. Let's call the matrix whose columns are these eigenvectors $W$.

$W$ will have dimensions $d \times l$. Then we can write $Z = MW$, and each row of $Z$ is a lower-dimensional embedding of $m_i$, the corresponding row of $M$.
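As a rough sketch of this eigendecomposition route, assuming NumPy and made-up data (the sizes `n = 50`, `d = 4`, the choice `l = 2`, and the variable names are all illustrative assumptions):

```python
import numpy as np

# Toy data (an assumption for illustration): n = 50 samples, d = 4 attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
M = X - X.mean(axis=0)          # centered data matrix
n, d = M.shape

# Sample covariance matrix, d x d.
S = (M.T @ M) / (n - 1)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending
# order, so reorder to put the largest eigenvalues first.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Keep the l eigenvectors with the largest eigenvalues as the columns of W.
l = 2
W = eigvecs[:, :l]              # d x l
Z = M @ W                       # n x l, one low-dimensional row per data point
```

Using `eigh` rather than `eig` is a design choice here: the covariance matrix is symmetric, so `eigh` returns real eigenvalues and orthonormal eigenvectors.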

OK, now suppose we can write $M = U \Sigma V^T$, the SVD from the question. Then we notice that:

\begin{align}
M^TM &= V\Sigma^TU^TU\Sigma V^T\\
&= V(\Sigma^T\Sigma)V^T \quad \text{(since the columns of $U$ are orthonormal)} \\
&= VDV^T
\end{align}

where $D = \Sigma^T\Sigma$ is a diagonal matrix containing the squares of the singular values.

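Thus we have that $(M^TM)V = VD$, since $V$ is also orthogonal ($V^TV = I$). Aha! So the columns of $V$ are the eigenvectors of $M^TM$, and $D$ contains the eigenvalues. The sample covariance is just $M^TM$ scaled by $\frac{1}{n-1}$, so it has the same eigenvectors, with eigenvalues $\frac{1}{n-1}D$. With the singular values sorted in decreasing order, the first $l$ columns of $V$ are precisely what we called $W$ above. So the $Z = MV$ from the question is the matrix of projected data (often called the principal component scores), and its first two columns are what you plot.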
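A quick numerical check of this correspondence, again as a sketch with assumed toy data and variable names. Eigenvectors are only determined up to sign, so the comparison uses absolute values:

```python
import numpy as np

# Toy data (an assumption for illustration): 40 samples, 3 attributes.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
M = X - X.mean(axis=0)
n = M.shape[0]

# SVD route: M = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
V = Vt.T

# Eigen route on M^T M, reordered so the largest eigenvalue comes first.
eigvals, eigvecs = np.linalg.eigh(M.T @ M)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Eigenvalues of M^T M should equal the squared singular values.
same_eigvals = np.allclose(eigvals, s**2)

# Each column of V should match the corresponding eigenvector up to sign,
# so |V^T E| should be close to the identity matrix.
same_vectors = np.allclose(np.abs(V.T @ eigvecs), np.eye(3), atol=1e-6)

# Eigenvalues of the sample covariance are the squared singular values / (n - 1).
cov_eigvals = s**2 / (n - 1)

# The projected data can be computed either as M V or as U scaled by s.
same_scores = np.allclose(M @ V, U * s)

print(same_eigvals, same_vectors, same_scores)
```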

Hope that clears things up a bit. Sorry for the delay :)