Solved – PCA and FA example – calculation of communalities

Tags: factor-analysis, pca

I'm trying to understand how Principal Component Analysis and Factor Analysis work by
implementing examples. Although I'm mainly using Python and NumPy here, this isn't Python-specific; I'd like to know how to get the correct result in general.

There is an example in chapter 16 of Statistics in a Nutshell, but
I can't understand how some of the resulting values are obtained. I'm not sure whether (a) there's something wrong with my implementation/calculations or (b) I'm misunderstanding the "jump" from PCA to FA.
(The examples are on pages 300–303. It wouldn't be appropriate to reproduce material from the book, but I hope it's reasonable to quote its short numerical example.)

The example table (psychometric test results) gives the data matrix below:

import numpy as np

# (Table 16-1 in the book)
# Columns are:
# Reading, Music, Arith., Verbal, Sports, Spelling, Geometry
data = np.array(
[[  8,  9,  6,  8,  5,  9, 10],
 [  5,  6,  5,  5,  6,  5,  5],
 [  2,  3,  2,  6,  8,  6,  4],
 [  8,  9, 10,  9,  8, 10,  6],
 [ 10,  7,  1, 10,  5, 10,  2],
 [  9,  8,  4,  9,  1,  7,  2],
 [  3,  9, 10,  2,  6,  4,  9],
 [  8, 10,  3,  8,  5,  7,  2],
 [ 10,  9,  3, 10,  6, 10,  3],
 [  7, 10,  1,  9,  6, 10,  2]])

# Transpose so that rows are variables and columns are observations,
# which is what np.corrcoef expects by default
data = data.T

corrmat = np.corrcoef(data)

The correlation matrix calculated this way with NumPy matches the one in the book (Table 16-2, for those who have it).

Here is the same input data for R:

data <- matrix(c( 8,  9,  6,  8,  5,  9, 10,
                  5,  6,  5,  5,  6,  5,  5,
                  2,  3,  2,  6,  8,  6,  4,
                  8,  9, 10,  9,  8, 10,  6,
                 10,  7,  1, 10,  5, 10,  2,
                  9,  8,  4,  9,  1,  7,  2,
                  3,  9, 10,  2,  6,  4,  9,
                  8, 10,  3,  8,  5,  7,  2,
                 10,  9,  3, 10,  6, 10,  3,
                  7, 10,  1,  9,  6, 10,  2), ncol = 7, byrow = TRUE)
pc <- prcomp(data, scale. = TRUE)

NumPy's output for the correlation matrix is this:

  [[ 1.        ,  0.53484056, -0.25290194,  0.86021546, -0.46870501,
     0.76225482, -0.38632342],
   [ 0.53484056,  1.        ,  0.24875809,  0.26248971, -0.26308503,
     0.38020761,  0.06879192],
   [-0.25290194,  0.24875809,  1.        , -0.50102461,  0.20615027,
    -0.30668865,  0.75803231],
   [ 0.86021546,  0.26248971, -0.50102461,  1.        , -0.23649405,
     0.89522479, -0.56880086],
   [-0.46870501, -0.26308503,  0.20615027, -0.23649405,  1.        ,
     0.05436758,  0.26604241],
   [ 0.76225482,  0.38020761, -0.30668865,  0.89522479,  0.05436758,
     1.        , -0.29078439],
   [-0.38632342,  0.06879192,  0.75803231, -0.56880086,  0.26604241,
    -0.29078439,  1.        ]]

I'm using the following to perform the PCA; the (transposed) data matrix is projected onto the eigenvectors to give the pca_output matrix.

The cumulative % of variance values also match the book's example (Table 16-4).

eigenvalues, eigenvectors = np.linalg.eig(corrmat)

# Order the eigenvalues by decreasing value
# (and then order eigenvectors).
evals_order = np.argsort(-eigenvalues)
eigenvalues = eigenvalues[evals_order]
eigenvectors = eigenvectors[:, evals_order]

# pca_output: columns are the principal
# components, after transposition (used here)
pca_output = np.dot(eigenvectors.T, data).T

# Cumulative percentage of variance explained by the first i components
cumulative_perc_variance = 100 * np.cumsum(eigenvalues) / eigenvalues.sum()
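As a cross-check against the R output below, these eigenvalues should be the squares of the standard deviations reported by prcomp, since the eigenvalues of the correlation matrix are the variances of the principal components:

print(np.sqrt(eigenvalues))
# Should be approximately [1.8676 1.2850 1.0569 0.6518 0.4838 0.2582 0.1344],
# i.e. the "Standard deviations" line of the prcomp output further down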

The first 3 eigenvectors I get are:

  [[-0.48310951, -0.2554222 , -0.08049741],
   [-0.20665755, -0.60274754, -0.16463817],
   [ 0.31165047, -0.56586657,  0.02618752],
   [-0.5112334 , -0.00690029,  0.19728832],
   [ 0.21569808,  0.04569528,  0.83434948],
   [-0.43878073, -0.18261299,  0.4643702 ],
   [ 0.35546879, -0.46450701,  0.12258562]]

I get the same results (up to the signs of some columns) using R's prcomp:

> pc
Standard deviations:
[1] 1.8676341 1.2850439 1.0568965 0.6517665 0.4837756 0.2582021 0.1344177

Rotation:
            PC1          PC2         PC3        PC4        PC5         PC6         PC7
[1,] -0.4831095  0.255422196 -0.08049741  0.2324959 -0.1897195  0.76737470 -0.12638483
[2,] -0.2066575  0.602747536 -0.16463817 -0.6995409  0.2296628 -0.05498277  0.14750200
[3,]  0.3116505  0.565866565  0.02618752  0.1601325 -0.7081318 -0.22419271 -0.06802832
[4,] -0.5112334  0.006900293  0.19728832  0.2433956 -0.1357962 -0.28997811  0.73341720
[5,]  0.2156981 -0.045695275  0.83434948 -0.3293147 -0.1060965  0.33662557  0.14908327
[6,] -0.4387807  0.182612988  0.46437020  0.1413347  0.1812280 -0.38481239 -0.59798379
[7,]  0.3554688  0.464507012  0.12258562  0.4932350  0.5892964  0.11120226  0.19982732

The book then produces this table (Table 16-3: Communalities) and says: "The first step after computing PCA is to examine what proportion of variance is accounted for by the factor structure.
This is done by examining the communalities […]."

$$
\begin{array}{r|r|r}
& \mbox{Initial} & \mbox{Extraction} \\
\hline
\mbox{Reading} & 1.0 & 0.929 \\
\mbox{Music} &1.0 & 0.779 \\
\mbox{Arith.} &1.0 & 0.868 \\
\mbox{Verbal} &1.0 & 0.955 \\
\mbox{Sports} &1.0 & 0.943 \\
\mbox{Spelling} &1.0 & 0.967 \\
\mbox{Geometry} &1.0 & 0.814 \\
\end{array}
$$

It's not clear to me where these extraction communality values come from.
My understanding is that they are the sums of the squares of the loadings in a given row of the eigenvector matrix (over the first p columns, where p is the number of selected components).
According to the eigenvectors I get above, this would be $(-0.483)^2 + (-0.255)^2 + (-0.080)^2 = 0.305$ for "Reading" (first row) with 3 components, not the $0.929$ given in the book's table.
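For reference, this can be checked directly from the eigenvectors array computed above:

# Sum of squares of the "Reading" row over the first 3 eigenvector columns
print((eigenvectors[0, :3] ** 2).sum())  # ≈ 0.305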

The book shows a table (Table 16-7) called "unrotated component matrix", as follows:

$$
\begin{array}{r|r|r}
& Comp 1 & Comp 2 & Comp 3 \\
\hline
\mbox{Reading} & 0.902 & 0.328 & -0.085 \\
\mbox{Music} & 0.386 & … & … \\
\mbox{Arith.} & -0.582 & … & … \\
\mbox{Verbal} & 0.955 & … & … \\
\mbox{Sports} & -0.403 & … & … \\
\mbox{Spelling} & 0.819 & … & … \\
\mbox{Geometry} & -0.664 & … & … \\
\end{array}
$$

This table does explain the communality values ($(0.902)^2 + (0.328)^2 + (-0.085)^2 = 0.929$).

My understanding was that, when doing FA, the unrotated component matrix is the same as the matrix of eigenvectors obtained from the PCA.
From this example, that doesn't seem to be the case.

How should the "unrotated component matrix" be obtained and how does it differ from the PCA eigenvectors?

Best Answer

The component/factor matrix is the matrix of component/factor loadings; the term "loading" refers to that matrix, not to the eigenvector matrix. It is obtained from the eigenvector matrix by rescaling the latter's columns: the column sums of squares, which are all 1 in the eigenvector matrix, are brought up to the corresponding eigenvalues. That is, $a_{ij}=\sqrt{\lambda_j}\,u_{ij}$, where $a_{ij}$ is the loading, $u_{ij}$ is the element of the eigenvector matrix, and $\lambda_j$ is the eigenvalue.
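For illustration, here is a minimal NumPy sketch continuing from the eigenvalues and eigenvectors computed in the question; it rescales the eigenvector columns into loadings and then takes row sums of squared loadings over the first three components to recover the extraction communalities:

# Loadings: each eigenvector column scaled by the square root of its eigenvalue
loadings = eigenvectors * np.sqrt(eigenvalues)

# Extraction communalities for a 3-component solution:
# row sums of squared loadings over the first 3 columns
communalities = (loadings[:, :3] ** 2).sum(axis=1)
print(np.round(communalities, 3))
# Should match the "Extraction" column of Table 16-3, e.g. 0.929 for Reading

Whole columns of loadings may come out with flipped signs relative to the book's Table 16-7 (the usual sign indeterminacy of eigenvectors); the communalities are unaffected, since the loadings are squared.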