Solved – Very different results of principal component analysis in SPSS and Stata after rotation

factor analysis, factor-rotation, pca, spss, stata

For my PhD thesis I have to do a Principal Component Analysis (PCA). I didn't find it too difficult in Stata and was happy interpreting the results (I know there is a difference between factor and principal component analysis). However, I discussed it with a colleague who uses SPSS, so I imported my data (from Excel) into SPSS too, and performed a PCA in there as well.

To my shock, the SPSS results differ enormously from my Stata results after rotation. They are not even close.

How can that be? (See the Stata PCA and SPSS PCA code and results below.)

Even stranger to me: when I ran factor [varnames], pcf (principal-component factor) in Stata, I got (almost) the same results as for the PCA in SPSS (see Stata principal-component factor below).

What are principal-component factors? A mixture of PCA and factor analysis?

I am confused. When people report in journals that they did a PCA, do I then have to ask whether it was done with SPSS or Stata? Can anyone explain this to me?

Stata:

pca bewert_sfu_a bewert_sfu_b bewert_sfu_c bewert_sfu_d bewert_sfu_e bewert_sfu_f bewert_sfu_g bewert_sfu_h bewert_sfu_i bewert_sfu_j bewert_sfu_k bewert_sfu_l, mineigen(1)

Principal components/correlation
Number of obs = 158
Number of comp. = 3
Trace = 12
Rotation: (unrotated = principal) Rho = 0.5382

--------------------------------------------------------------------------
   Component |   Eigenvalue   Difference         Proportion   Cumulative
-------------+------------------------------------------------------------
       Comp1 |       3.8723      2.46548             0.3227       0.3227
       Comp2 |      1.40682      .227718             0.1172       0.4399
       Comp3 |       1.1791      .206742             0.0983       0.5382
       Comp4 |      .972359      .169164             0.0810       0.6192
       Comp5 |      .803195      .050871             0.0669       0.6861
       Comp6 |      .752324     .0953662             0.0627       0.7488
       Comp7 |      .656957     .0137592             0.0547       0.8036
       Comp8 |      .643198      .135894             0.0536       0.8572
       Comp9 |      .507304     .0435925             0.0423       0.8995
      Comp10 |      .463711     .0749052             0.0386       0.9381
      Comp11 |      .388806     .0348752             0.0324       0.9705
      Comp12 |      .353931            .             0.0295       1.0000
--------------------------------------------------------------------------

Principal components (eigenvectors)

----------------------------------------------------------
    Variable |    Comp1     Comp2     Comp3 | Unexplained 
-------------+------------------------------+-------------
bewert_sfu_a |   0.2700    0.3901   -0.1477 |       .4779 
bewert_sfu_b |   0.3298    0.2303   -0.4027 |       .3129 
bewert_sfu_c |  -0.3046    0.3149    0.1773 |       .4642 
bewert_sfu_d |   0.3489    0.1910    0.0700 |       .4715 
bewert_sfu_e |   0.3342    0.2067    0.2720 |       .4202 
bewert_sfu_f |  -0.2001    0.4561   -0.1587 |       .5227 
bewert_sfu_g |   0.3057    0.3128    0.1531 |       .4728 
bewert_sfu_h |  -0.3611    0.2180    0.2913 |        .328 
bewert_sfu_i |   0.2352   -0.2211    0.3662 |       .5588 
bewert_sfu_j |  -0.1556    0.3894    0.4578 |       .4457 
bewert_sfu_k |   0.3239    0.0525    0.0754 |       .5832 
bewert_sfu_l |   0.2091   -0.2445    0.4720 |       .4839 
----------------------------------------------------------

rotate, varimax kaiser

Principal components/correlation
Number of obs = 158
Number of comp. = 3
Trace = 12
Rotation: orthogonal varimax (Kaiser on) Rho = 0.5382

--------------------------------------------------------------------------
   Component |     Variance   Difference         Proportion   Cumulative
-------------+------------------------------------------------------------
       Comp1 |      2.95242      .867357             0.2460       0.2460
       Comp2 |      2.08506       .66433             0.1738       0.4198
       Comp3 |      1.42073            .             0.1184       0.5382
--------------------------------------------------------------------------

Rotated components

----------------------------------------------------------
    Variable |    Comp1     Comp2     Comp3 | Unexplained 
-------------+------------------------------+-------------
bewert_sfu_a |   0.4076   -0.0266   -0.2829 |       .4779 
bewert_sfu_b |   0.3116   -0.3063   -0.3648 |       .3129 
bewert_sfu_c |  -0.0255    0.4536   -0.1302 |       .4642 
bewert_sfu_d |   0.4007   -0.0456    0.0218 |       .4715 
bewert_sfu_e |   0.4392    0.0965    0.1618 |       .4202 
bewert_sfu_f |   0.0698    0.2650   -0.4451 |       .5227 
bewert_sfu_g |   0.4531    0.0973    0.0005 |       .4728 
bewert_sfu_h |  -0.1026    0.5023    0.0011 |        .328 
bewert_sfu_i |   0.1350   -0.0261    0.4684 |       .5588 
bewert_sfu_j |   0.1927    0.5856    0.0731 |       .4457 
bewert_sfu_k |   0.3026   -0.1048    0.1037 |       .5832 
bewert_sfu_l |   0.1224    0.0410    0.5564 |       .4839 
----------------------------------------------------------

Component rotation matrix

--------------------------------------------
             |    Comp1     Comp2     Comp3 
-------------+------------------------------
       Comp1 |   0.7942   -0.5573    0.2422 
       Comp2 |   0.5724    0.5523   -0.6061 
       Comp3 |   0.2040    0.6200    0.7576 
--------------------------------------------

SPSS Code:

FACTOR
/VARIABLES bewert_sfu_a bewert_sfu_b bewert_sfu_c bewert_sfu_d  bewert_sfu_e bewert_sfu_f bewert_sfu_g bewert_sfu_h bewert_sfu_i bewert_sfu_j bewert_sfu_k bewert_sfu_l
/MISSING LISTWISE
/ANALYSIS bewert_sfu_a bewert_sfu_b bewert_sfu_c bewert_sfu_d bewert_sfu_e bewert_sfu_f bewert_sfu_g bewert_sfu_h bewert_sfu_i bewert_sfu_j bewert_sfu_k bewert_sfu_l
/PRINT EXTRACTION ROTATION
/FORMAT BLANK(.40)
/CRITERIA MINEIGEN(1) ITERATE(50)
/EXTRACTION PC
/CRITERIA ITERATE(50)
/ROTATION VARIMAX
/METHOD=CORRELATION.

Descriptive Statistics

                Mean    Std. Deviation  Analysis N
bewert_sfu_a    3.79              .452  158
bewert_sfu_b    3.68              .506  158
bewert_sfu_c    1.61              .827  158
bewert_sfu_d    3.32              .619  158
bewert_sfu_e    3.03              .643  158
bewert_sfu_f    2.61              .812  158
bewert_sfu_g    3.32              .621  158
bewert_sfu_h    1.53              .796  158
bewert_sfu_i    2.10              .838  158
bewert_sfu_j    2.53              .819  158
bewert_sfu_k    3.29              .784  158
bewert_sfu_l    2.78              .842  158

Component Matrix a

                       Component        
                   1       2       3
bewert_sfu_a    .531    .463    
bewert_sfu_b    .649           -.437
bewert_sfu_c   -.599        
bewert_sfu_d    .687        
bewert_sfu_e    .658        
bewert_sfu_f            .541    
bewert_sfu_g    .602        
bewert_sfu_h   -.711        
bewert_sfu_i    .463        
bewert_sfu_j            .462    .497
bewert_sfu_k    .637        
bewert_sfu_l    .412            .513

Extraction Method: Principal Component Analysis.
a 3 components extracted.

Communalities

                Extraction
bewert_sfu_a    .522
bewert_sfu_b    .687
bewert_sfu_c    .536
bewert_sfu_d    .529
bewert_sfu_e    .580
bewert_sfu_f    .477
bewert_sfu_g    .527
bewert_sfu_h    .672
bewert_sfu_i    .441
bewert_sfu_j    .554
bewert_sfu_k    .417
bewert_sfu_l    .516
Extraction Method: Principal Component Analysis.    

Rotated Component Matrix a

                      Component     
                   1      2      3
bewert_sfu_a    .705        
bewert_sfu_b    .673  -.448 
bewert_sfu_c           .627 
bewert_sfu_d    .671        
bewert_sfu_e    .661        
bewert_sfu_f                 -.576
bewert_sfu_g    .699        
bewert_sfu_h           .698 
bewert_sfu_i                  .630
bewert_sfu_j           .742 
bewert_sfu_k    .528        
bewert_sfu_l                  .707

Extraction Method: Principal Component Analysis. 
Rotation Method: Varimax with Kaiser Normalization.         
a Rotation converged in 5 iterations.           

Component Transformation Matrix

Component      1        2      3
1           .765    -.476   .434
2           .644     .567  -.513
3          -.001     .672   .741
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.         

Stata principal-component factor:

Factor analysis/correlation
Number of obs = 158
Method: principal-component factors
Retained factors = 3
Rotation: (unrotated)
Number of params = 33

--------------------------------------------------------------------------
     Factor  |   Eigenvalue   Difference        Proportion   Cumulative
-------------+------------------------------------------------------------
    Factor1  |      3.87230      2.46548            0.3227       0.3227
    Factor2  |      1.40682      0.22772            0.1172       0.4399
    Factor3  |      1.17910      0.20674            0.0983       0.5382
    Factor4  |      0.97236      0.16916            0.0810       0.6192
    Factor5  |      0.80319      0.05087            0.0669       0.6861
    Factor6  |      0.75232      0.09537            0.0627       0.7488
    Factor7  |      0.65696      0.01376            0.0547       0.8036
    Factor8  |      0.64320      0.13589            0.0536       0.8572
    Factor9  |      0.50730      0.04359            0.0423       0.8995
   Factor10  |      0.46371      0.07491            0.0386       0.9381
   Factor11  |      0.38881      0.03488            0.0324       0.9705
   Factor12  |      0.35393            .            0.0295       1.0000
--------------------------------------------------------------------------
LR test: independent vs. saturated:  chi2(66) =  453.95 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

-----------------------------------------------------------
    Variable |  Factor1   Factor2   Factor3 |   Uniqueness 
-------------+------------------------------+--------------
bewert_sfu_a |   0.5314    0.4627   -0.1603 |      0.4779  
bewert_sfu_b |   0.6490    0.2732   -0.4373 |      0.3129  
bewert_sfu_c |  -0.5994    0.3735    0.1926 |      0.4642  
bewert_sfu_d |   0.6866    0.2265    0.0760 |      0.4715  
bewert_sfu_e |   0.6576    0.2451    0.2954 |      0.4202  
bewert_sfu_f |  -0.3938    0.5409   -0.1723 |      0.5227  
bewert_sfu_g |   0.6015    0.3710    0.1663 |      0.4728  
bewert_sfu_h |  -0.7107    0.2586    0.3163 |      0.3280  
bewert_sfu_i |   0.4629   -0.2622    0.3977 |      0.5588  
bewert_sfu_j |  -0.3062    0.4619    0.4971 |      0.4457  
bewert_sfu_k |   0.6373    0.0623    0.0818 |      0.5832  
bewert_sfu_l |   0.4116   -0.2900    0.5125 |      0.4839  
-----------------------------------------------------------

rotate, varimax kaiser blanks(.4)

Factor analysis/correlation
Number of obs = 158
Method: principal-component factors
Retained factors = 3
Rotation: orthogonal varimax (Kaiser on)
Number of params = 33

--------------------------------------------------------------------------
     Factor  |     Variance   Difference        Proportion   Cumulative
-------------+------------------------------------------------------------
    Factor1  |      2.84986      0.98705            0.2375       0.2375
    Factor2  |      1.86281      0.11727            0.1552       0.3927
    Factor3  |      1.74554            .            0.1455       0.5382
--------------------------------------------------------------------------
LR test: independent vs. saturated:  chi2(66) =  453.95 Prob>chi2 = 0.0000

Rotated factor loadings (pattern matrix) and unique variances

-----------------------------------------------------------
    Variable |  Factor1   Factor2   Factor3 |   Uniqueness 
-------------+------------------------------+--------------
bewert_sfu_a |   0.7047   -0.0983   -0.1258 |      0.4779  
bewert_sfu_b |   0.6732   -0.4479   -0.1827 |      0.3129  
bewert_sfu_c |  -0.2184    0.6266   -0.3090 |      0.4642  
bewert_sfu_d |   0.6710   -0.1473    0.2377 |      0.4715  
bewert_sfu_e |   0.6605    0.0245    0.3781 |      0.4202  
bewert_sfu_f |   0.0474    0.3785   -0.5761 |      0.5227  
bewert_sfu_g |   0.6989    0.0358    0.1935 |      0.4728  
bewert_sfu_h |  -0.3776    0.6976   -0.2067 |      0.3280  
bewert_sfu_i |   0.1847   -0.1019    0.6298 |      0.5588  
bewert_sfu_j |   0.0624    0.7419   -0.0018 |      0.4457  
bewert_sfu_k |   0.5276   -0.2131    0.3050 |      0.5832  
bewert_sfu_l |   0.1273   -0.0160    0.7069 |      0.4839  
-----------------------------------------------------------

Factor rotation matrix

-----------------------------------------
             | Factor1  Factor2  Factor3 
-------------+---------------------------
     Factor1 |  0.7650  -0.4761   0.4336 
     Factor2 |  0.6440   0.5672  -0.5134 
     Factor3 | -0.0016   0.6720   0.7406 
-----------------------------------------

Best Answer

You are correct. Stata is weird about this. Stata gives different results from SAS, R and SPSS, and it is difficult (in my opinion) to understand why without delving quite deep into the world of factor analysis and PCA.

Here's how you know that something weird is happening. The sum of the squared loadings for a component is equal to the eigenvalue for that component.

Rotation changes the individual eigenvalues, but not their total. Add up the squared loadings in each column of your output (this is why I asked you, in my comment, to remove the blanks). With Stata's default, the squared loadings in a column sum to 1.00 (within rounding error). With SPSS (and R, and SAS, and every other factor analysis program I've looked at), they sum to the eigenvalue for that factor. Across all three retained components, the squared loadings in SPSS add up to the sum of the retained eigenvalues (i.e. 3.8723 + 1.40682 + 1.1791), both pre- and post-rotation.
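As a quick check of the claim that rotation redistributes variance without changing the total, the sketch below adds up the figures printed in the question's own output. The variable names are mine, and the numbers are copied from the three "Eigenvalue"/"Variance" tables, so agreement is only up to rounding.

```python
# Eigenvalues of the three retained components (unrotated tables in the question).
retained_eigenvalues = [3.8723, 1.40682, 1.1791]

# "Variance" column after varimax rotation of Stata's `factor, pcf` solution.
rotated_pcf_variances = [2.84986, 1.86281, 1.74554]

# "Variance" column after varimax rotation of Stata's `pca` solution.
rotated_pca_variances = [2.95242, 2.08506, 1.42073]

total_unrotated = sum(retained_eigenvalues)
total_rotated_pcf = sum(rotated_pcf_variances)
total_rotated_pca = sum(rotated_pca_variances)

# Share of the total variance (trace = 12 variables) kept by three components.
share_explained = total_unrotated / 12

print(total_unrotated, total_rotated_pcf, total_rotated_pca, share_explained)
```

All three totals come out the same to within rounding, and the share matches the Rho = 0.5382 that Stata reports. Both rotations therefore preserve the total; they differ in how they split it across components.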

In Stata, the sum of the squared loadings for each factor is equal to 1.00, and so Stata has rescaled the loadings.
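The scaling difference can be checked directly on the first column of the question's output. This is a minimal sketch with values copied from the unrotated Stata `pca` table and the unrotated Stata `factor, pcf` table (whose visible entries match the SPSS component matrix); the variable names are mine.

```python
import math

# Comp1 column from Stata's `pca` (unrotated) table.
pca_comp1 = [0.2700, 0.3298, -0.3046, 0.3489, 0.3342, -0.2001,
             0.3057, -0.3611, 0.2352, -0.1556, 0.3239, 0.2091]

# Factor1 column from Stata's `factor, pcf` (unrotated) table.
pcf_factor1 = [0.5314, 0.6490, -0.5994, 0.6866, 0.6576, -0.3938,
               0.6015, -0.7107, 0.4629, -0.3062, 0.6373, 0.4116]

eigenvalue1 = 3.8723  # first eigenvalue, as reported in both tables

# Sum of squared entries in each column.
ss_pca = sum(v * v for v in pca_comp1)
ss_pcf = sum(v * v for v in pcf_factor1)

# Rescale the unit-length column by sqrt(eigenvalue) and compare entry by entry.
rescaled = [v * math.sqrt(eigenvalue1) for v in pca_comp1]
max_gap = max(abs(r - l) for r, l in zip(rescaled, pcf_factor1))

print(ss_pca, ss_pcf, max_gap)
```

The `pca` column has a sum of squares of about 1, the `pcf` column has a sum of squares of about 3.8723, and multiplying the former by the square root of the eigenvalue reproduces the latter to within rounding.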

The only mention of this (that I have found) in the Stata documentation is in the estat loadings section of the help, where it says:

cnorm(unit | eigen | inveigen), an option used with estat loadings, selects the normalization of the eigenvectors, the columns of the principal-component loading matrix. The following normalizations are available

However, this appears to apply only to the unrotated component matrix, not the rotated component matrix. I can't get the unnormalized rotated matrix after PCA.
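To see what the rotation step is doing in each case, the sketch below multiplies the unrotated row for bewert_sfu_a by the rotation matrix printed under each Stata output. The helper `rotate_row` and the variable names are mine, and reading the tables this way is my interpretation of the output rather than something taken from the Stata documentation.

```python
def rotate_row(row, rotation):
    """Multiply a 1x3 row of unrotated loadings by a 3x3 rotation matrix."""
    return [sum(row[k] * rotation[k][j] for k in range(3)) for j in range(3)]

# bewert_sfu_a, unrotated, from Stata's `pca` table and its rotation matrix.
pca_row_a = [0.2700, 0.3901, -0.1477]
pca_rotation = [[0.7942, -0.5573, 0.2422],
                [0.5724, 0.5523, -0.6061],
                [0.2040, 0.6200, 0.7576]]

# bewert_sfu_a, unrotated, from Stata's `factor, pcf` table and its rotation
# matrix (the latter agrees with SPSS's Component Transformation Matrix).
pcf_row_a = [0.5314, 0.4627, -0.1603]
pcf_rotation = [[0.7650, -0.4761, 0.4336],
                [0.6440, 0.5672, -0.5134],
                [-0.0016, 0.6720, 0.7406]]

pca_rotated_a = rotate_row(pca_row_a, pca_rotation)
pcf_rotated_a = rotate_row(pcf_row_a, pcf_rotation)

# Largest entry-wise difference between the two rotation matrices.
largest_entry_gap = max(abs(pca_rotation[i][j] - pcf_rotation[i][j])
                        for i in range(3) for j in range(3))

print(pca_rotated_a, pcf_rotated_a, largest_entry_gap)
```

Each product reproduces the rotated row printed in the corresponding table, so both outputs are internally consistent. The two rotation matrices are clearly different, though, which suggests that varimax was applied to differently scaled inputs and that rescaling the columns of the rotated `pca` table afterwards would not recover the SPSS result.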

The people at Stata seem to know what they are doing, and usually have a good reason for doing things the way that they do. This one is beyond me though.

(For future reference, it would have made my life easier if you'd used a dataset that I could access, and if you'd included all output, without blanks).

Edit: My usual go-to site for information about how to get the same results for different programs is the UCLA IDRE. They don't cover PCA in Stata: http://www.ats.ucla.edu/stat/AnnotatedOutput/ I have to wonder if that's because they couldn't get the same result. :)
