Solved – Using varimax-rotated PCA components as predictors in linear regression

factor-rotation, pca, regression

After doing PCA, the first component describes the largest part of the variability. This is important, e.g., in the study of body measurements, where it is commonly known (Jolliffe, 2002) that the PC1 axis captures size variation. My question is whether PCA scores after varimax rotation retain the same properties, or whether they are different, as mentioned in this topic?

Since I need PCA scores for further statistical analyses, I am wondering whether varimax is needed, and whether it in fact disrupts the representation of real sample variability, so that individual scores on rotated axes are uninformative or lead to misinterpretation of reality.

Also could someone suggest some other references on this topic?

Workflows in R:

  1. PCA (FactoMineR or prcomp) -> extract individual scores -> enter scores in the lm
  2. PCA (FactoMineR or prcomp) -> varimax on loadings matrix -> calculate the individual scores -> enter scores in the lm
  3. FA (psych, varimax and pca extraction method) -> extract individual scores -> enter scores in the lm

Now, without rotation (1.), the percentages of explained variability on the first three axes are, e.g., 29.32, 5.6, 3.2. Solutions 2. and 3. yield similar percentages on the first three factors, e.g., 12.2, 12.1, 8.2. Of course, solution 1. tends to push all high variable loadings onto the first axis, while 2. and 3. tend to distribute loadings between axes (which is the reason for rotation). I wanted to know whether these three workflows are essentially the same, given that individual scores differ on rotated vs. unrotated axes.

Best Answer

Standardized (to unit variance) principal components after an orthogonal rotation, such as varimax, are simply rotated standardized principal components (by "principal component" I mean PC scores). In linear regression, rescaling individual predictors does not change the fitted model, and neither does replacing the predictors by an invertible set of linear combinations of them (e.g. via a rotation), because the predictors still span the same space. This means that using any of the following in a regression:

  • "raw" principal components (projections on the cov. matrix eigenvectors),
  • standardized principal components,
  • rotated [standardized] principal components,
  • arbitrarily scaled rotated [standardized] principal components,

would lead to exactly the same regression model with identical $R^2$, predictive power, etc. (Individual regression coefficients will of course depend on the normalization and rotation choice.)
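
A minimal sketch of this equivalence in base R, on simulated data. The data, the response `y`, the choice `k = 3` and the helper `r2_of` are all made up for illustration; only `prcomp`, `varimax` and `lm` from the stats package are used.

```r
set.seed(1)
n <- 200
# Hypothetical data: six correlated predictors and a response
X <- matrix(rnorm(n * 6), n, 6) %*% matrix(rnorm(36), 6, 6)
y <- drop(X %*% rnorm(6)) + rnorm(n)

k <- 3
pca <- prcomp(X, center = TRUE, scale. = TRUE)

raw_scores <- pca$x[, 1:k]        # "raw" PC scores
std_scores <- scale(raw_scores)   # standardized to unit variance

# Loadings = eigenvectors scaled by the PC standard deviations
loadings <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
vm <- varimax(loadings)

# Rotated standardized scores: apply the same rotation matrix to the scores
rot_scores <- std_scores %*% vm$rotmat

# Hypothetical helper: R^2 of a regression of y on the columns of Z
r2_of <- function(Z) summary(lm(y ~ Z))$r.squared

r2 <- c(raw     = r2_of(raw_scores),
        std     = r2_of(std_scores),
        rotated = r2_of(rot_scores),
        scaled  = r2_of(rot_scores * 7.3))  # arbitrary rescaling
print(r2)  # the four values should agree up to floating-point error
```

The coefficients returned by `lm` differ between the four fits, but the fitted values and $R^2$ do not.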

The total variance captured by the raw and by the rotated PCs is the same.
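
To illustrate, here is a small base R sketch on simulated data (the data and `k = 3` are assumptions for the example). The variance attributed to each component is the column sum of squared loadings; varimax redistributes it across components, but the total over the `k` retained components stays the same.

```r
set.seed(1)
# Hypothetical data: six correlated variables
X <- matrix(rnorm(200 * 6), 200, 6) %*% matrix(rnorm(36), 6, 6)

k <- 3
pca <- prcomp(X, center = TRUE, scale. = TRUE)
loadings <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
rotated  <- unclass(varimax(loadings)$loadings)

var_raw <- colSums(loadings^2)  # variance per unrotated PC (the eigenvalues)
var_rot <- colSums(rotated^2)   # variance per rotated component

print(rbind(unrotated = var_raw, rotated = var_rot))
print(c(total_unrotated = sum(var_raw), total_rotated = sum(var_rot)))
```

This is the pattern in the question: a dominant first axis before rotation and more even percentages afterwards, with the same sum.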

This answers your main question. However, you should be careful with your workflows, as it is very easy to get confused and mess up the calculations. The simplest way to obtain standardized rotated PC scores is to use psych::principal function:

 psych::principal(data, rotate="varimax", nfactors=k, scores=TRUE)

Your workflow #2 can be trickier than you think, because loadings after varimax rotation are not orthogonal, so to obtain the scores you cannot simply project the data onto the rotated loadings. See my answer here for details.
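
A rough base R sketch of the pitfall, on simulated data (the data and `k = 3` are assumptions for the example). It compares a naive projection onto the rotated loadings with two ways of getting rotated standardized scores that should agree: rotating the standardized PC scores with the varimax rotation matrix, and multiplying the data by the pseudo-inverse of the transposed rotated loadings.

```r
set.seed(1)
# Hypothetical data, centered and scaled up front
X <- scale(matrix(rnorm(200 * 6), 200, 6) %*% matrix(rnorm(36), 6, 6))

k <- 3
pca <- prcomp(X, center = FALSE)
loadings <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
vm <- varimax(loadings)
L_rot <- unclass(vm$loadings)

# Columns of the rotated loadings are in general not orthogonal
print(round(crossprod(L_rot), 3))

# Rotated standardized scores: rotate the standardized PC scores
scores_ok <- scale(pca$x[, 1:k]) %*% vm$rotmat

# Same thing via the pseudo-inverse of t(L_rot)
scores_pinv <- X %*% L_rot %*% solve(crossprod(L_rot))

# Naive projection onto the rotated loadings: not standardized rotated scores
scores_naive <- X %*% L_rot

print(round(cov(scores_ok), 3))     # identity matrix, up to rounding
print(round(cov(scores_naive), 3))  # not the identity matrix
```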

Your workflow #3 is probably also wrong, at least if you are referring to the psych::fa function. It does not do PCA; the fm="pa" extraction method refers to the "principal factor" method, which is based on PCA but is not identical to it (it is an iterative method). As I wrote above, you need psych::principal to perform PCA.

See my answer in the following thread for a detailed account on PCA and varimax:
