# R PCA – Difference in Eigenvalues Between ‘princomp’ and ‘prcomp’ Functions

pcar

You can use the decathlon dataset {FactoMineR} to reproduce this. The question is why the computed eigenvalues differ from those of the covariance matrix.

Here are the eigenvalues using princomp:

> library(FactoMineR);data(decathlon)
> pr <- princomp(decathlon[1:10], cor=F)
> pr$sd^2 Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 1.348073e+02 2.293556e+01 9.747263e+00 1.117215e+00 3.477705e-01 1.326819e-01 Comp.7 Comp.8 Comp.9 Comp.10 6.208630e-02 4.938498e-02 2.504308e-02 4.908785e-03  And the same using PCA: > res<-PCA(decathlon[1:10], scale.unit=FALSE, ncp=5, graph = FALSE) > res$eig
eigenvalue percentage of variance cumulative percentage of variance
comp 1  1.348073e+02           79.659589641                          79.65959
comp 2  2.293556e+01           13.552956464                          93.21255
comp 3  9.747263e+00            5.759799777                          98.97235
comp 4  1.117215e+00            0.660178830                          99.63252
comp 5  3.477705e-01            0.205502637                          99.83803
comp 6  1.326819e-01            0.078403653                          99.91643
comp 7  6.208630e-02            0.036687700                          99.95312
comp 8  4.938498e-02            0.029182305                          99.98230
comp 9  2.504308e-02            0.014798320                          99.99710
comp 10 4.908785e-03            0.002900673                         100.00000


Can you explain to me why the directly computed eigenvalues differ from those? (the eigenvectors are the same):

> eigen(cov(decathlon[1:10]))$values [1] 1.381775e+02 2.350895e+01 9.990945e+00 1.145146e+00 3.564647e-01 [6] 1.359989e-01 6.363846e-02 5.061961e-02 2.566916e-02 5.031505e-03  Also, the alternative prcomp method gives the same eigenvalues as the direct computation: > prc <- prcomp(decathlon[1:10]) > prc$sd^2
[1] 1.381775e+02 2.350895e+01 9.990945e+00 1.145146e+00 3.564647e-01
[6] 1.359989e-01 6.363846e-02 5.061961e-02 2.566916e-02 5.031505e-03


Why do PCA/princomp and prcomp give different eigenvalues?

#### Best Answer

As pointed out in the comments, it's because princomp uses $N$ for the divisor, but prcomp and the direct calculation using cov both use $N-1$ instead of $N$.

This is mentioned in both the Details section of help(princomp):

Note that the default calculation uses divisor 'N' for the covariance matrix.

and the Details section of help(prcomp):

Unlike princomp, variances are computed with the usual divisor N - 1.

You can also see this in the source. For example, the snippet of princomp source below shows that $N$ (n.obs) is used as the denominator when calculating cv.

else if (is.null(covmat)) {
dn <- dim(z)
if (dn[1L] < dn[2L])
stop("'princomp' can only be used with more units than variables")
covmat <- cov.wt(z)
n.obs <- covmat$n.obs cv <- covmat$cov * (1 - 1/n.obs)
cen <- covmat$center }  You can avoid this multiplication by specifying the covmat argument instead of the x argument. princomp(covmat = cov(iris[,1:4]))$sd^2


Update regarding PCA scores:

You can set cor = TRUE in your call to princomp in order to perform PCA on the correlation matrix (instead of the covariance matrix). This will cause princomp to $z$-score the data, but it will still use $N$ for the denominator.

As as result, princomp(scale(data))$scores and princomp(data, cor = TRUE)$scores will differ by the factor $\sqrt{(N-1)/N}$.