Solved – How is PCA applied to new data

Tags: eigenvalues, intuition, pca

I understand the basic intuition behind PCA: reducing the dimensionality of data by finding the eigenvectors along which there is most variance in the data, and projecting the data along these eigenvectors (the principal components).

What I don't understand is the following:

  • How are the eigenvectors found? The standard eigenvector equation is $Av=\lambda v$, where $\lambda$ and $v$ are the eigenvalues and eigenvectors respectively. So what is the $A$ matrix – the data itself, the covariance matrix of the data… or something else? (If the data matrix isn't square, this equation doesn't hold.)

  • Once PCA has been performed / trained on a data set, can it be applied to reduce the dimensionality of new, unseen data? For this to work, I suppose PCA would need to output a mapping, which could then be applied to the new data, say in the form of a matrix multiplication.

    1. What are the outputs of PCA?
    2. How are the outputs applied to new data, if at all?

Best Answer

I will answer each question:

  • $A$ is indeed the covariance matrix, $\frac{1}{n-1}X^TX$ assuming $X$ is standardized. (The $\frac{1}{n-1}$ factor only rescales the eigenvalues; $X^TX$ has the same eigenvectors.) Since this matrix is square ($p \times p$), the eigenvector equation applies, and the eigenvectors are found by its eigendecomposition – see the first sketch after this list.
  • The output of PCA is three things: the vector of column means $\mu$ of $X$, the vector of column standard deviations $\sigma$ of $X$, and the rotation matrix $R = [v_1 \dots v_p]$ whose columns are the eigenvectors. To project a new sample $x_0$ onto principal component space, you standardize and then rotate: $((x_0 - \mu) / \sigma)^T R$, where the division by $\sigma$ is elementwise. This yields a row vector holding the coordinates of $x_0$ in PC space; keeping only the first $k$ columns of $R$ reduces the dimensionality to $k$. Both steps are shown in the sketches below.
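Here is a minimal NumPy sketch of both steps under the assumptions above; the data are randomly generated and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 samples, 3 features

# "Train" PCA: learn mu, sigma and the rotation matrix R from X.
mu = X.mean(axis=0)                      # column means
sigma = X.std(axis=0, ddof=1)            # column standard deviations
Z = (X - mu) / sigma                     # standardized data

A = Z.T @ Z / (len(Z) - 1)               # covariance matrix: square (3 x 3)
eigvals, eigvecs = np.linalg.eigh(A)     # eigh, since A is symmetric

order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
R = eigvecs[:, order]                    # rotation matrix [v_1 ... v_p]

# Apply the learned mapping to a new, unseen sample: standardize, then rotate.
x0 = rng.normal(size=3)
scores = ((x0 - mu) / sigma) @ R         # x0 in PC coordinates

# Dimensionality reduction: keep only the first k principal components.
k = 2
scores_k = ((x0 - mu) / sigma) @ R[:, :k]
```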
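For completeness, the same standardize-then-rotate pipeline can be done with scikit-learn (my choice of library, not something the answer above assumes): `fit` learns $\mu$, $\sigma$ and $R$ from the training data, and `transform` applies them to new data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))      # training data (illustrative)
X_new = rng.normal(size=(5, 3))          # new, unseen samples

pipe = make_pipeline(StandardScaler(), PCA(n_components=2))
pipe.fit(X_train)                        # learns mu, sigma and the eigenvectors
scores = pipe.transform(X_new)           # shape (5, 2): X_new in PC coordinates
```

Note that the components may differ in sign from the manual version, since eigenvectors are only defined up to sign.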