Differences Between x Values and Rotation Values in the prcomp Function in R

pca, r

The prcomp function in R returns an object of class "prcomp" containing the following components:

  • sdev: I'm not sure what these are, but I know that squaring them gives the eigenvalues.
  • rotation: The above documentation states that this is "the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors)".
  • x: My understanding is that these are the principal components.

My understanding is that the principal components are the elements of the eigenvectors. If so, this would mean that the x values are the elements of the columns of rotation, right? But wouldn't this mean that x and rotation are the same? Or am I misunderstanding this?

Best Answer

"rotation" contains the principal components (the eigenvectors of the covariance matrix), expressed in the original coordinate system, one per column. It is typically a square matrix (unless you truncate it, e.g. via the tol or rank. argument) whose size equals the number of dimensions your original data had. E.g. if you had a 3D data set, your rotation matrix will be 3-by-3.
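To make this concrete, here is a minimal sketch that compares prcomp against a direct eigen-decomposition of the sample covariance matrix. The toy data and the mixing matrix A are made up for illustration and are not part of the example below; the comparison is only up to sign, since the sign of an eigenvector is arbitrary.

```r
# Hypothetical toy data: 200 observations of 3 correlated variables.
set.seed(42)
A <- matrix(c(2, 0, 0, 1, 1, 0, 0.5, 0.3, 0.2), nrow = 3)  # arbitrary mixing matrix
X <- matrix(rnorm(200 * 3), ncol = 3) %*% A

p <- prcomp(X, center = TRUE)
e <- eigen(cov(X))  # eigen-decomposition of the sample covariance matrix

# One eigenvector per column, so a 3D data set gives a 3-by-3 matrix.
stopifnot(identical(dim(p$rotation), c(3L, 3L)))

# Squared sdev values are the eigenvalues (the variances along each PC).
stopifnot(isTRUE(all.equal(p$sdev^2, e$values)))

# Columns of rotation match the eigenvectors up to sign.
stopifnot(isTRUE(all.equal(abs(unname(p$rotation)), abs(e$vectors))))

# Truncation: keeping only 2 PCs gives a non-square 3-by-2 rotation matrix.
p2 <- prcomp(X, center = TRUE, rank. = 2)
stopifnot(identical(dim(p2$rotation), c(3L, 2L)))
```

This also answers the sdev part of the question: sdev holds the standard deviations of the data along each principal component, which is why squaring them gives the eigenvalues.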

"x" is your data set projected onto the principal components ("rotated"), i.e. the centered data multiplied by the rotation matrix. It has the same dimensions as your original data set (again, assuming you don't truncate low-ranking PCs).
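The relationship between x and rotation can be checked by hand. The following is a small sketch, assuming made-up data with non-zero means (so that the centering step matters); the variable names Xc and scores are illustrative only.

```r
# Hypothetical data: 100 observations of 2 variables with non-zero means.
set.seed(42)
X <- cbind(rnorm(100, 5, 3), rnorm(100, -2, 1))

p <- prcomp(X, center = TRUE)

# Center each column manually, then project onto the eigenvectors.
Xc <- sweep(X, 2, colMeans(X))
scores <- Xc %*% p$rotation

# The manual projection reproduces p$x.
stopifnot(isTRUE(all.equal(unname(scores), unname(p$x))))

# x has one row per observation and one column per PC,
# i.e. the same shape as X when nothing is truncated.
stopifnot(identical(dim(p$x), dim(X)))

# The standard deviation of each column of x is the corresponding sdev.
stopifnot(isTRUE(all.equal(unname(apply(p$x, 2, sd)), p$sdev)))
```

So rotation describes the new axes, while x holds the coordinates of each observation along those axes. They are different objects with different shapes whenever the number of observations differs from the number of variables.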

Here is an example: Assume your data set looks like this:

[Figure: original data set]

(The red and the blue lines are the first and the second principal components, respectively.)

After performing the PCA, you can plot it in terms of its PCs:

[Figure: PCA-rotated data set]

And here is the code, for reproducibility:

library(tidyverse)
theme_set(theme_bw())

set.seed(1)
N = 1000
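# Uncorrelated 2D Gaussian data: sd 3 along the first axis, sd 1 along the second
X   = matrix(c(rnorm(N, 0, 3), rnorm(N, 0, 1)), ncol=2)
# Rotate the data by 30 degrees
phi = 30/180*pi
M = matrix(c(cos(phi), sin(phi), -sin(phi), cos(phi)), nrow=2, byrow=TRUE)
X = X %*% M
colnames(X) = c("V1", "V2")  # explicit names, so as_tibble() does not have to repair them
p = prcomp(X, center=TRUE)

p  # prints the standard deviations and the rotation matrix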

X %>% as_tibble %>% ggplot(aes(V1, V2)) +
  geom_point(size=.5) +
  xlim(-15, 15) + ylim(-15, 15) +
  # Column j of p$rotation is the j-th PC, so its slope is rotation[2, j] / rotation[1, j]
  geom_abline(slope=p$rotation[2, 1] / p$rotation[1, 1], colour="red") +
  geom_abline(slope=p$rotation[2, 2] / p$rotation[1, 2], colour="blue") +
  coord_fixed(ratio = 1)

p$x %>% as_tibble %>% ggplot(aes(PC1, PC2)) +
  geom_point(size=.5) +
  xlim(-15, 15) + ylim(-15, 15) +
  geom_hline(yintercept = 0, colour="red") +
  geom_vline(xintercept = 0, colour="blue") +
  coord_fixed(ratio = 1)
