Solved – How to get the principal components of one matrix along the principal directions of another matrix

Tags: pca, r

I have a data matrix, A, on which I have performed principal component analysis (PCA) using the prcomp function in R. This gives me the rotation (eigenvector) field and the x (rotated data) field.

Now I have another data matrix, B. I want to find the principal components of B along the principal directions of the rotation of A obtained above. How would I go about doing this?

So far, I've just calculated the two prcomp's separately and used a vector projection to project the principal components of B along the eigenvectors of A. Is this correct?

Best Answer

You get the coefficients (the rotation matrix) from PCA. The observation matrix is multiplied by these coefficients to obtain the components. So, multiply the new observation matrix by the same rotation instead. Don't forget to center it, using the column means of the original data rather than its own.

Here's the code.

Run PCA and see how the score matrix is obtained from the original data and the rotation. Note that I'm NOT centering, and you probably should.

> x=matrix(c(1,2,3,2,4,5.5),3,2)
> x
     [,1] [,2]
[1,]    1  2.0
[2,]    2  4.0
[3,]    3  5.5
> r=prcomp(x,retx=1,center=FALSE)
> r$rotation
            PC1        PC2
[1,] -0.4666132  0.8844615
[2,] -0.8844615 -0.4666132
> r$x
           PC1         PC2
[1,] -2.235536 -0.04876479
[2,] -4.471072 -0.09752958
[3,] -6.264378  0.08701220
> x %*% r$rotation
           PC1         PC2
[1,] -2.235536 -0.04876479
[2,] -4.471072 -0.09752958
[3,] -6.264378  0.08701220

Now, apply the same rotation to the different data (again, see that I am NOT centering).

> y=matrix(c(1,2,3,2,4,6.5),3,2)
> y
     [,1] [,2]
[1,]    1  2.0
[2,]    2  4.0
[3,]    3  6.5
> y %*% r$rotation
           PC1         PC2
[1,] -2.235536 -0.04876479
[2,] -4.471072 -0.09752958
[3,] -7.148839 -0.37960095

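Note the similarity of the new scores: the first two rows of y are identical to those of x, so their scores are unchanged, and only the third row differs.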

By the way, this is used a lot in forecasting with PCA. We obtain the rotation on historical data, then apply it to new data.
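The transcript above skips centering. Here is a minimal sketch of the centered version, assuming the same x and y matrices and prcomp's default centering. The names y_centered, y_scores and x_scores are made up for this illustration. The key point is that y is centered with the column means of x (stored in r$center), not with its own means.

```r
# Data reused from the example above
x <- matrix(c(1, 2, 3, 2, 4, 5.5), 3, 2)
y <- matrix(c(1, 2, 3, 2, 4, 6.5), 3, 2)

# Fit PCA on x, this time with centering (prcomp's default)
r <- prcomp(x, center = TRUE)

# Center y with the column means of x, not with y's own column means
y_centered <- sweep(y, 2, r$center)

# Project the centered new data onto the principal directions of x
y_scores <- y_centered %*% r$rotation

# Sanity check: the same recipe applied to x reproduces r$x
x_scores <- sweep(x, 2, r$center) %*% r$rotation
all.equal(x_scores, r$x, check.attributes = FALSE)
```

If the PCA was also fitted with scale. = TRUE, the new data would need to be divided by r$scale as well before multiplying by the rotation.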

Related Solutions

PCA Space Projection – How to Project a New Vector in R

Well, @Srikant already gave you the right answer: the rotation (or loadings) matrix contains the eigenvectors arranged column-wise, so you just have to multiply (using %*%) your vector or matrix of new data by e.g. prcomp(X)$rotation. Be careful, however, with any centering or scaling parameters that were applied when computing the PCA eigenvectors: the same ones must be applied to the new data.

In R, you may also find useful the predict() function, see ?predict.prcomp. BTW, you can check how projection of new data is implemented by simply entering:

getS3method("predict", "prcomp")
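As a rough illustration of what predict() does for a prcomp object, the sketch below compares it with the manual computation. It uses the built-in USArrests data, and the 40/10 row split is an arbitrary choice made only for this example. The names train, new, pca, scores_predict and scores_manual are also just illustrative.

```r
# Built-in dataset; the split into "training" and "new" rows is arbitrary
train <- USArrests[1:40, ]
new   <- USArrests[41:50, ]

# PCA with both centering and scaling on the training rows
pca <- prcomp(train, center = TRUE, scale. = TRUE)

# predict() applies the stored centering and scaling, then the rotation
scores_predict <- predict(pca, newdata = new)

# Manual equivalent: standardize with the training parameters, then rotate
scores_manual <- scale(as.matrix(new), center = pca$center, scale = pca$scale) %*%
  pca$rotation

# Compare the two results
all.equal(scores_predict, scores_manual, check.attributes = FALSE)
```

Using predict() avoids having to remember which of centering and scaling was switched on when the PCA was fitted.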

Principal Component Analysis – Different Results in SPSS and Stata After Rotation

You are correct. Stata is weird about this. Stata gives different results from SAS, R and SPSS, and it is difficult (in my opinion) to understand why without delving quite deep into the world of factor analysis and PCA.

Here's how you know that something weird is happening. The sum of the squared loadings for a component is equal to the eigenvalue for that component.

Pre- and post-rotation, the individual eigenvalues change, but the total of the eigenvalues doesn't. Add up the squared loadings from your output (this is why I asked you to remove the blanks in my comment). With Stata's default, the squared loadings for each component will sum to 1.00 (within rounding error). With SPSS (and R, and SAS, and every other factor analysis program I've looked at) they will sum to the eigenvalue for that factor. (Post-rotation eigenvalues change, but the sum of eigenvalues stays the same.) The sum of squared loadings in SPSS is equal to the sum of the eigenvalues (i.e. 3.8723 + 1.40682), both pre- and post-rotation.

In Stata, the sum of the squared loadings for each factor is equal to 1.00, and so Stata has rescaled the loadings.
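The two normalizations can be compared in R. This is only a sketch on the built-in USArrests data (an arbitrary choice, not the asker's dataset), using base R's eigen() and varimax(). It does not reproduce Stata's or SPSS's output. It only shows unit-normalized eigenvectors next to eigenvalue-scaled loadings, and that an orthogonal rotation preserves the total.

```r
# Correlation-matrix PCA on a built-in dataset (arbitrary choice for illustration)
e <- eigen(cor(USArrests))

# Unit-normalized eigenvectors: squared entries of each column sum to 1
unit_ss <- colSums(e$vectors^2)

# Eigenvalue-scaled loadings: each eigenvector times sqrt(its eigenvalue)
loadings <- e$vectors %*% diag(sqrt(e$values))
eigen_ss <- colSums(loadings^2)   # one sum of squares per component

# Varimax rotation of the first two components
rotated <- unclass(varimax(loadings[, 1:2])$loadings)
rotated_ss <- colSums(rotated^2)

# Per-component sums may change after rotation, but their total should not
c(before = sum(eigen_ss[1:2]), after = sum(rotated_ss))
```

Comparing unit_ss with eigen_ss shows the difference described above: the first is 1 for every component, while the second matches e$values.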

The only mention of this (that I have found) in the Stata documentation is in the estat loadings section of the help, where it says:

cnorm(unit | eigen | inveigen), an option used with estat loadings, selects the normalization of the eigenvectors, the columns of the principal-component loading matrix. The following normalizations are available

However, this appears to apply only to the unrotated component matrix, not the rotated component matrix. I can't get the unnormalized rotated matrix after PCA.

The people at Stata seem to know what they are doing, and usually have a good reason for doing things the way that they do. This one is beyond me though.

(For future reference, it would have made my life easier if you'd used a dataset that I could access, and if you'd included all output, without blanks).

Edit: My usual go-to site for information about how to get the same results for different programs is the UCLA IDRE. They don't cover PCA in Stata: http://www.ats.ucla.edu/stat/AnnotatedOutput/ I have to wonder if that's because they couldn't get the same result. :)
