Standardized (to unit variance) principal components after an orthogonal rotation, such as varimax, are simply rotated standardized principal components (by "principal component" I mean PC scores). In linear regression, rescaling individual predictors has no effect on the fit, and replacing the predictors by an invertible linear combination of them (e.g. via a rotation) has no effect either. This means that using any of the following as regressors:
- "raw" principal components (projections on the cov. matrix eigenvectors),
- standardized principal components,
- rotated [standardized] principal components,
- arbitrarily scaled rotated [standardized] principal components,
would lead to exactly the same regression model with identical $R^2$, predictive power, etc. (Individual regression coefficients will of course depend on the normalization and rotation choice.)
The total variance captured by the raw and by the rotated PCs is the same.
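This invariance is easy to check numerically. A minimal sketch in base R (the toy data and variable names are mine, not from the question):

```r
set.seed(1)
X <- matrix(rnorm(100 * 5), 100, 5)   # toy predictor matrix
y <- rnorm(100)                       # toy response

p   <- prcomp(X, center = TRUE, scale. = TRUE)
k   <- 3
raw <- p$x[, 1:k]                     # raw PC scores
std <- scale(raw)                     # standardized PC scores
rot <- std %*% varimax(p$rotation[, 1:k])$rotmat  # rotated standardized scores

summary(lm(y ~ raw))$r.squared
summary(lm(y ~ std))$r.squared
summary(lm(y ~ rot))$r.squared
# all three values are identical: the three predictor sets
# span the same column space
```

Any invertible linear transformation of the predictors yields the same fitted values, which is why the choice of normalization and rotation is immaterial here.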
This answers your main question. However, be careful with your workflow, as it is very easy to get confused and mess up the calculations. The simplest way to obtain standardized rotated PC scores is the psych::principal function:
psych::principal(data, rotate="varimax", nfactors=k, scores=TRUE)
Your workflow #2 can be trickier than you think, because loadings after varimax rotation are not orthogonal, so to obtain the scores you cannot simply project the data onto the rotated loadings. See my answer here for details:
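To see the problem concretely, here is a small sketch (my own toy example, not from the question): projecting the data onto the varimax-rotated loadings gives something different from the properly rotated standardized scores, because the rotated loadings are not orthonormal.

```r
set.seed(1)
X <- scale(matrix(rnorm(100 * 4), 100, 4))  # toy standardized data
p <- prcomp(X)
k <- 2
L  <- p$rotation[, 1:k] %*% diag(p$sdev[1:k])  # loadings = eigenvectors * sdev
vm <- varimax(L)

scores_correct <- scale(p$x[, 1:k]) %*% vm$rotmat  # rotate the standardized scores
scores_naive   <- X %*% (L %*% vm$rotmat)          # naive projection onto rotated loadings

round(crossprod(L %*% vm$rotmat), 3)     # not diagonal: rotated loadings not orthogonal
max(abs(scores_correct - scores_naive))  # clearly nonzero: the two differ
```

The naive projection rescales each component by its eigenvalue before mixing, which is exactly the mistake the warning above is about.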
Your workflow #3 is probably also wrong, at least if you refer to the psych::fa function. It does not do PCA; its fm="pa" extraction method refers to the "principal factor" (principal axis) method, which is based on PCA but is not identical to it (it is an iterative procedure). As I wrote above, you need psych::principal to perform PCA.
See my answer in the following thread for a detailed account on PCA and varimax:
PCA gives you a matrix of coefficients (the rotation matrix). Multiplying your observation matrix by these coefficients yields the component scores. So to score new observations, multiply the new observation matrix by the same rotation matrix. Don't forget to center it first.
Here's the code.
Run PCA and see how the score matrix is obtained from the original data and the rotation. Note that I am NOT centering here, though you probably should.
> x=matrix(c(1,2,3,2,4,5.5),3,2)
> x
[,1] [,2]
[1,] 1 2.0
[2,] 2 4.0
[3,] 3 5.5
> r=prcomp(x,retx=1,center=FALSE)
> r$rotation
PC1 PC2
[1,] -0.4666132 0.8844615
[2,] -0.8844615 -0.4666132
> r$x
PC1 PC2
[1,] -2.235536 -0.04876479
[2,] -4.471072 -0.09752958
[3,] -6.264378 0.08701220
> x %*% r$rotation
PC1 PC2
[1,] -2.235536 -0.04876479
[2,] -4.471072 -0.09752958
[3,] -6.264378 0.08701220
Now apply the same rotation to different data (again, note that I am NOT centering).
> y=matrix(c(1,2,3,2,4,6.5),3,2)
> y
[,1] [,2]
[1,] 1 2.0
[2,] 2 4.0
[3,] 3 6.5
> y %*% r$rotation
PC1 PC2
[1,] -2.235536 -0.04876479
[2,] -4.471072 -0.09752958
[3,] -7.148839 -0.37960095
Note that the first two rows, which are unchanged, get exactly the same scores as before; only the modified third row gets new scores.
By the way, this is used a lot in forecasting with PCA. We obtain the rotation on historical data, then apply it to new data.
Best Answer
This is going to be a non-technical answer.
You are right: PCA is essentially a rotation of the coordinate axes, chosen such that each successive axis captures as much variance as possible.
In some disciplines (such as e.g. psychology), people like to apply PCA in order to interpret the resulting axes. That is, they want to be able to say that principal axis #1 (which is a certain linear combination of the original variables) has some particular meaning. To guess this meaning, they look at the weights in the linear combination. However, these weights are often messy and no clear meaning can be discerned.
In these cases, people sometimes choose to tinker a bit with the vanilla PCA solution. They take a certain number of principal axes (deemed "significant" by some criterion) and additionally rotate them, trying to achieve some "simple structure", that is, linear combinations that are easier to interpret. There are specific algorithms that look for the simplest possible structure; one of them is called varimax. After varimax rotation, successive components no longer capture as much variance as possible! This feature of PCA gets broken by the additional varimax (or any other) rotation.
So before applying varimax rotation, you have "unrotated" principal components. And afterwards, you get "rotated" principal components. In other words, this terminology refers to the post-processing of the PCA results and not to the PCA rotation itself.
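A quick numerical illustration of the variance point (my own toy data; for simplicity I rotate the score matrix by the varimax rotation of the eigenvector matrix, rather than of the loadings, which suffices to show the effect): rotation redistributes the variance across components while keeping the total fixed.

```r
set.seed(1)
X <- scale(matrix(rnorm(300 * 6), 300, 6) %*% matrix(runif(36), 6, 6))  # correlated toy data
p <- prcomp(X)
k <- 3
rotated <- p$x[, 1:k] %*% varimax(p$rotation[, 1:k])$rotmat

round(apply(p$x[, 1:k], 2, var), 3)  # decreasing: PC1 captures the most
round(apply(rotated, 2, var), 3)     # redistributed; the first need not dominate
sum(apply(p$x[, 1:k], 2, var))       # same total variance ...
sum(apply(rotated, 2, var))          # ... before and after rotation
```

Because the scores are centered and the rotation is orthogonal, the total variance is exactly preserved; only the "each axis captures the maximum possible" property is lost.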
All of this is somewhat complicated by the fact that what gets rotated are loadings and not principal axes as such. However, for the mathematical details I refer you (and any interested reader) to my long answer here: Is PCA followed by a rotation (such as varimax) still PCA?