Difference between loadings and correlations between observed variables and saved factor scores in factor analysis

factor analysis, r

I thought that the loadings in factor analysis were the correlations between the observed variables and the latent factors. However, when I do factor analysis in R using the psych package, this does not seem to be the case:

    library(psych)
    set.seed(1)
    X <- matrix(rnorm(200), ncol=10)
    fa1 <- fa(X, nfactors=3, rotate="none", scores=TRUE)

    cor(X, fa1$scores)  #correlations between original variables and factor scores
                   MR2         MR1         MR3
     [1,]  0.465509161  0.87299813  0.03241641
     [2,] -0.010609644 -0.32714571  0.64968725
     [3,] -0.219685860  0.47331827 -0.39132195
     [4,] -0.815516983  0.22669390  0.42273446
     [5,] -0.075178935 -0.40431701 -0.69661843
     [6,] -0.204917832  0.07472006  0.05508017
     [7,]  0.240675941  0.13027263  0.23238220
     [8,]  0.756677687 -0.05621205  0.23746738
     [9,]  0.004384459  0.12095273  0.55100943
    [10,]  0.640507568 -0.67810600  0.18597947

    fa1$loadings[1:10, 1:3]
                   MR2         MR1         MR3
     [1,]  0.433925641  0.82218385  0.02717957
     [2,] -0.009889808 -0.30810366  0.54473104
     [3,] -0.204780777  0.44576800 -0.32810435
     [4,] -0.760186392  0.21349881  0.35444221
     [5,] -0.070078250 -0.38078308 -0.58408054
     [6,] -0.191014719  0.07037085  0.04618204
     [7,]  0.224346738  0.12268990  0.19484113
     [8,]  0.705339180 -0.05294013  0.19910480
     [9,]  0.004086985  0.11391248  0.46199451
    [10,]  0.597050885 -0.63863574  0.15593470

    cor(fa1$scores)  # Check that factor scores are uncorrelated
              MR2          MR1           MR3
    MR2  1.000000e+00 4.266996e-16 -1.299606e-16
    MR1  4.266996e-16 1.000000e+00  1.961151e-16
    MR3 -1.299606e-16 1.961151e-16  1.000000e+00

The loadings and correlations are similar, but I expected them to be the same. I tried looking at the source code for fa but had trouble understanding it. Could someone please tell me how the loadings differ from the correlations?

Update: For each factor, the correlations with the observed variables are constant multiples of the loadings:

    cor(X, fa1$scores)/fa1$loadings[1:10, 1:3]
               MR2      MR1      MR3
     [1,] 1.072786 1.061804 1.192675
     [2,] 1.072786 1.061804 1.192675
     [3,] 1.072786 1.061804 1.192675
     [4,] 1.072786 1.061804 1.192675
     [5,] 1.072786 1.061804 1.192675
     [6,] 1.072786 1.061804 1.192675
     [7,] 1.072786 1.061804 1.192675
     [8,] 1.072786 1.061804 1.192675
     [9,] 1.072786 1.061804 1.192675
    [10,] 1.072786 1.061804 1.192675

Best Answer

I don't know R very well, so I can't follow your code. But factor scores (unless the factors are simply principal components) are always approximate: exact scores cannot be computed because the unique-factor value for each case and variable is unobservable. The correlations between computed factor scores and the variables therefore only approximate the true correlations between the factors and the variables, which are the loadings.
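
The constant ratios in the question's update fit this picture. What follows is a sketch under assumptions, not a reading of the `fa` source: suppose the factors are orthogonal and the scores are regression (Thurstone) estimates computed from the sample correlation matrix. The symbols are introduced here for illustration: $Z$ is the standardized data matrix with $n$ rows, $R = Z^\top Z/(n-1)$ the sample correlation matrix, $\Lambda$ the loading matrix, and $\hat F$ the estimated scores.

```latex
$$\hat F = Z R^{-1} \Lambda$$

$$\operatorname{Cov}(Z, \hat F) = \frac{Z^\top \hat F}{n-1} = R\,R^{-1}\Lambda = \Lambda,
\qquad
\operatorname{Var}(\hat F) = \Lambda^\top R^{-1} \Lambda$$

$$\operatorname{cor}(z_i, \hat f_j)
  = \frac{\lambda_{ij}}{\sqrt{\left(\Lambda^\top R^{-1}\Lambda\right)_{jj}}}$$
```

Under these assumptions the covariances between the standardized variables and the scores equal the loadings, and the correlations differ only because the estimated scores do not have unit variance. The divisor depends on the factor $j$ but not on the variable $i$, which would give one constant ratio per column, as in the update. If the assumption holds, `1 / apply(fa1$scores, 2, sd)` should reproduce those ratios; a ratio of 1.072786 would correspond to a score variance of about 0.869.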