For my PhD thesis I have to do a Principal Component Analysis (PCA). I didn't find it too difficult in Stata and was happy interpreting the results (I know there is a difference between factor and principal component analysis). However, I discussed it with a colleague who uses SPSS, so I imported my data (from Excel) into SPSS too, and performed a PCA in there as well.
To my shock, the results differ enormously from my Stata results (after rotation). They are not even close.
How can that be? (See Stata PCA and SPSS PCA codes and results below).
Even stranger to me: when I ran factor [varnames], pcf (principal-component factor) in Stata, I got (almost) the same results as for PCA in SPSS (see Stata principal-component factor below).
What are principal-component factors? A mixture of PCA and factor analysis?
I am confused. When people report in journals that they have done a PCA, should I then ask whether it was done with SPSS or Stata? Can anyone explain this to me?
Stata:
pca bewert_sfu_a bewert_sfu_b bewert_sfu_c bewert_sfu_d bewert_sfu_e bewert_sfu_f bewert_sfu_g bewert_sfu_h bewert_sfu_i bewert_sfu_j bewert_sfu_k bewert_sfu_l, mineigen(1)
Principal components/correlation
Number of obs = 158
Number of comp. = 3
Trace = 12
Rotation: (unrotated = principal) Rho = 0.5382
--------------------------------------------------------------------------
Component | Eigenvalue Difference Proportion Cumulative
-------------+------------------------------------------------------------
Comp1 | 3.8723 2.46548 0.3227 0.3227
Comp2 | 1.40682 .227718 0.1172 0.4399
Comp3 | 1.1791 .206742 0.0983 0.5382
Comp4 | .972359 .169164 0.0810 0.6192
Comp5 | .803195 .050871 0.0669 0.6861
Comp6 | .752324 .0953662 0.0627 0.7488
Comp7 | .656957 .0137592 0.0547 0.8036
Comp8 | .643198 .135894 0.0536 0.8572
Comp9 | .507304 .0435925 0.0423 0.8995
Comp10 | .463711 .0749052 0.0386 0.9381
Comp11 | .388806 .0348752 0.0324 0.9705
Comp12 | .353931 . 0.0295 1.0000
--------------------------------------------------------------------------
Principal components (eigenvectors)
----------------------------------------------------------
Variable | Comp1 Comp2 Comp3 | Unexplained
-------------+------------------------------+-------------
bewert_sfu_a | 0.2700 0.3901 -0.1477 | .4779
bewert_sfu_b | 0.3298 0.2303 -0.4027 | .3129
bewert_sfu_c | -0.3046 0.3149 0.1773 | .4642
bewert_sfu_d | 0.3489 0.1910 0.0700 | .4715
bewert_sfu_e | 0.3342 0.2067 0.2720 | .4202
bewert_sfu_f | -0.2001 0.4561 -0.1587 | .5227
bewert_sfu_g | 0.3057 0.3128 0.1531 | .4728
bewert_sfu_h | -0.3611 0.2180 0.2913 | .328
bewert_sfu_i | 0.2352 -0.2211 0.3662 | .5588
bewert_sfu_j | -0.1556 0.3894 0.4578 | .4457
bewert_sfu_k | 0.3239 0.0525 0.0754 | .5832
bewert_sfu_l | 0.2091 -0.2445 0.4720 | .4839
----------------------------------------------------------
rotate, varimax kaiser
Principal components/correlation Number of obs = 158
Number of comp. = 3
Trace = 12
Rotation: orthogonal varimax (Kaiser on) Rho = 0.5382
--------------------------------------------------------------------------
Component | Variance Difference Proportion Cumulative
-------------+------------------------------------------------------------
Comp1 | 2.95242 .867357 0.2460 0.2460
Comp2 | 2.08506 .66433 0.1738 0.4198
Comp3 | 1.42073 . 0.1184 0.5382
--------------------------------------------------------------------------
Rotated components
----------------------------------------------------------
Variable | Comp1 Comp2 Comp3 | Unexplained
-------------+------------------------------+-------------
bewert_sfu_a | 0.4076 -0.0266 -0.2829 | .4779
bewert_sfu_b | 0.3116 -0.3063 -0.3648 | .3129
bewert_sfu_c | -0.0255 0.4536 -0.1302 | .4642
bewert_sfu_d | 0.4007 -0.0456 0.0218 | .4715
bewert_sfu_e | 0.4392 0.0965 0.1618 | .4202
bewert_sfu_f | 0.0698 0.2650 -0.4451 | .5227
bewert_sfu_g | 0.4531 0.0973 0.0005 | .4728
bewert_sfu_h | -0.1026 0.5023 0.0011 | .328
bewert_sfu_i | 0.1350 -0.0261 0.4684 | .5588
bewert_sfu_j | 0.1927 0.5856 0.0731 | .4457
bewert_sfu_k | 0.3026 -0.1048 0.1037 | .5832
bewert_sfu_l | 0.1224 0.0410 0.5564 | .4839
----------------------------------------------------------
Component rotation matrix
--------------------------------------------
| Comp1 Comp2 Comp3
-------------+------------------------------
Comp1 | 0.7942 -0.5573 0.2422
Comp2 | 0.5724 0.5523 -0.6061
Comp3 | 0.2040 0.6200 0.7576
--------------------------------------------
SPSS Code:
FACTOR
/VARIABLES bewert_sfu_a bewert_sfu_b bewert_sfu_c bewert_sfu_d bewert_sfu_e bewert_sfu_f bewert_sfu_g bewert_sfu_h bewert_sfu_i bewert_sfu_j bewert_sfu_k bewert_sfu_l
/MISSING LISTWISE
/ANALYSIS bewert_sfu_a bewert_sfu_b bewert_sfu_c bewert_sfu_d bewert_sfu_e bewert_sfu_f bewert_sfu_g bewert_sfu_h bewert_sfu_i bewert_sfu_j bewert_sfu_k bewert_sfu_l
/PRINT EXTRACTION ROTATION
/FORMAT BLANK(.40)
/CRITERIA MINEIGEN(1) ITERATE(50)
/EXTRACTION PC
/CRITERIA ITERATE(50)
/ROTATION VARIMAX
/METHOD=CORRELATION.
Descriptive Statistics
Mean Std. Deviation Analysis N
bewert_sfu_a 3.79 .452 158
bewert_sfu_b 3.68 .506 158
bewert_sfu_c 1.61 .827 158
bewert_sfu_d 3.32 .619 158
bewert_sfu_e 3.03 .643 158
bewert_sfu_f 2.61 .812 158
bewert_sfu_g 3.32 .621 158
bewert_sfu_h 1.53 .796 158
bewert_sfu_i 2.10 .838 158
bewert_sfu_j 2.53 .819 158
bewert_sfu_k 3.29 .784 158
bewert_sfu_l 2.78 .842 158
Component Matrix a
Component
1 2 3
bewert_sfu_a .531 .463
bewert_sfu_b .649 -.437
bewert_sfu_c -.599
bewert_sfu_d .687
bewert_sfu_e .658
bewert_sfu_f .541
bewert_sfu_g .602
bewert_sfu_h -.711
bewert_sfu_i .463
bewert_sfu_j .462 .497
bewert_sfu_k .637
bewert_sfu_l .412 .513
Extraction Method: Principal Component Analysis.
a 3 components extracted.
Communalities
Extraction
bewert_sfu_a .522
bewert_sfu_b .687
bewert_sfu_c .536
bewert_sfu_d .529
bewert_sfu_e .580
bewert_sfu_f .477
bewert_sfu_g .527
bewert_sfu_h .672
bewert_sfu_i .441
bewert_sfu_j .554
bewert_sfu_k .417
bewert_sfu_l .516
Extraction Method: Principal Component Analysis.
Rotated Component Matrix a
Component
1 2 3
bewert_sfu_a .705
bewert_sfu_b .673 -.448
bewert_sfu_c .627
bewert_sfu_d .671
bewert_sfu_e .661
bewert_sfu_f -.576
bewert_sfu_g .699
bewert_sfu_h .698
bewert_sfu_i .630
bewert_sfu_j .742
bewert_sfu_k .528
bewert_sfu_l .707
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a Rotation converged in 5 iterations.
Component Transformation Matrix
Component 1 2 3
1 .765 -.476 .434
2 .644 .567 -.513
3 -.001 .672 .741
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
Stata (principal-component factor):
Factor analysis/correlation
Number of obs = 158
Method: principal-component factors
Retained factors = 3
Rotation: (unrotated)
Number of params = 33
--------------------------------------------------------------------------
Factor | Eigenvalue Difference Proportion Cumulative
-------------+------------------------------------------------------------
Factor1 | 3.87230 2.46548 0.3227 0.3227
Factor2 | 1.40682 0.22772 0.1172 0.4399
Factor3 | 1.17910 0.20674 0.0983 0.5382
Factor4 | 0.97236 0.16916 0.0810 0.6192
Factor5 | 0.80319 0.05087 0.0669 0.6861
Factor6 | 0.75232 0.09537 0.0627 0.7488
Factor7 | 0.65696 0.01376 0.0547 0.8036
Factor8 | 0.64320 0.13589 0.0536 0.8572
Factor9 | 0.50730 0.04359 0.0423 0.8995
Factor10 | 0.46371 0.07491 0.0386 0.9381
Factor11 | 0.38881 0.03488 0.0324 0.9705
Factor12 | 0.35393 . 0.0295 1.0000
--------------------------------------------------------------------------
LR test: independent vs. saturated: chi2(66) = 453.95 Prob>chi2 = 0.0
Factor loadings (pattern matrix) and unique variances
-----------------------------------------------------------
Variable | Factor1 Factor2 Factor3 | Uniqueness
-------------+------------------------------+--------------
bewert_sfu_a | 0.5314 0.4627 -0.1603 | 0.4779
bewert_sfu_b | 0.6490 0.2732 -0.4373 | 0.3129
bewert_sfu_c | -0.5994 0.3735 0.1926 | 0.4642
bewert_sfu_d | 0.6866 0.2265 0.0760 | 0.4715
bewert_sfu_e | 0.6576 0.2451 0.2954 | 0.4202
bewert_sfu_f | -0.3938 0.5409 -0.1723 | 0.5227
bewert_sfu_g | 0.6015 0.3710 0.1663 | 0.4728
bewert_sfu_h | -0.7107 0.2586 0.3163 | 0.3280
bewert_sfu_i | 0.4629 -0.2622 0.3977 | 0.5588
bewert_sfu_j | -0.3062 0.4619 0.4971 | 0.4457
bewert_sfu_k | 0.6373 0.0623 0.0818 | 0.5832
bewert_sfu_l | 0.4116 -0.2900 0.5125 | 0.4839
rotate, varimax kaiser blanks(.4)
Factor analysis/correlation
Number of obs = 158
Method: principal-component factors
Retained factors = 3
Rotation: orthogonal varimax (Kaiser on)
Number of params = 33
--------------------------------------------------------------------------
Factor | Variance Difference Proportion Cumulative
-------------+------------------------------------------------------------
Factor1 | 2.84986 0.98705 0.2375 0.2375
Factor2 | 1.86281 0.11727 0.1552 0.3927
Factor3 | 1.74554 . 0.1455 0.5382
--------------------------------------------------------------------------
LR test: independent vs. saturated: chi2(66) = 453.95 Prob>chi2 = 0.0000
Rotated factor loadings (pattern matrix) and unique variances
-----------------------------------------------------------
Variable | Factor1 Factor2 Factor3 | Uniqueness
-------------+------------------------------+--------------
bewert_sfu_a | 0.7047 -0.0983 -0.1258 | 0.4779
bewert_sfu_b | 0.6732 -0.4479 -0.1827 | 0.3129
bewert_sfu_c | -0.2184 0.6266 -0.3090 | 0.4642
bewert_sfu_d | 0.6710 -0.1473 0.2377 | 0.4715
bewert_sfu_e | 0.6605 0.0245 0.3781 | 0.4202
bewert_sfu_f | 0.0474 0.3785 -0.5761 | 0.5227
bewert_sfu_g | 0.6989 0.0358 0.1935 | 0.4728
bewert_sfu_h | -0.3776 0.6976 -0.2067 | 0.3280
bewert_sfu_i | 0.1847 -0.1019 0.6298 | 0.5588
bewert_sfu_j | 0.0624 0.7419 -0.0018 | 0.4457
bewert_sfu_k | 0.5276 -0.2131 0.3050 | 0.5832
bewert_sfu_l | 0.1273 -0.0160 0.7069 | 0.4839
-----------------------------------------------------------
Factor rotation matrix
-----------------------------------------
| Factor1 Factor2 Factor3
-------------+---------------------------
Factor1 | 0.7650 -0.4761 0.4336
Factor2 | 0.6440 0.5672 -0.5134
Factor3 | -0.0016 0.6720 0.7406
-----------------------------------------
Best Answer
You are correct. Stata is weird about this. Stata gives different results from SAS, R and SPSS, and it is difficult (in my opinion) to understand why without delving quite deep into the world of factor analysis and PCA.
Here's how you know that something weird is happening. Normally, the sum of the squared loadings for a component is equal to the eigenvalue for that component.
Rotation changes the variance attributed to each component, but not the total across the retained components. Add up the squared loadings in each column of your output (this is why I asked you, in my comment, to remove the blanks). With Stata's default, the squared loadings for each component sum to 1.00 (within rounding error). With SPSS (and R, and SAS, and every other factor analysis program I've looked at) they sum to the eigenvalue for that component. After rotation the individual values change, but in SPSS the total over the three retained components is still equal to the sum of their eigenvalues (i.e. 3.8723 + 1.40682 + 1.1791 = 6.45822), both pre- and post-rotation.
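As a quick arithmetic check on the point that rotation leaves the total unchanged, the variances printed in the question's output tables can simply be added up. This is only a sketch in Python (used here for illustration; the variable names are mine and not part of any package), with the values copied as printed:

```python
# Variances copied from the output tables in the question.
eigenvalues_unrotated = [3.8723, 1.40682, 1.1791]   # retained components (eigenvalue > 1)
stata_pca_rotated = [2.95242, 2.08506, 1.42073]     # after pca, then rotate, varimax kaiser
stata_pcf_rotated = [2.84986, 1.86281, 1.74554]     # after factor ..., pcf, then rotate

totals = {
    "unrotated": sum(eigenvalues_unrotated),
    "pca rotated": sum(stata_pca_rotated),
    "pcf rotated": sum(stata_pcf_rotated),
}
for label, total in totals.items():
    print(f"{label:12s} total variance = {total:.4f}")

# Share of the trace (12 standardized variables); compare with Rho = 0.5382 in the output.
print(f"share of total variance = {totals['unrotated'] / 12:.4f}")
```

All three totals agree to within rounding, so the rotated solutions redistribute the same total variance across the three components; they differ in how it is split, not in how much there is.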
In Stata, the sum of the squared loadings for each factor is equal to 1.00, and so Stata has rescaled the loadings.
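The rescaling can be checked directly on the unrotated numbers posted in the question. The sketch below (plain Python; `column_ss`, `rescaled` and `max_gap` are names I made up for this illustration) assumes that the pca table holds unit-length eigenvectors and that the factor, pcf table holds the same columns multiplied by the square root of the corresponding eigenvalue:

```python
import math

eigenvalues = [3.8723, 1.40682, 1.1791]

# "Principal components (eigenvectors)" from the unrotated pca output, rows a..l.
eigenvectors = [
    [0.2700, 0.3901, -0.1477], [0.3298, 0.2303, -0.4027],
    [-0.3046, 0.3149, 0.1773], [0.3489, 0.1910, 0.0700],
    [0.3342, 0.2067, 0.2720], [-0.2001, 0.4561, -0.1587],
    [0.3057, 0.3128, 0.1531], [-0.3611, 0.2180, 0.2913],
    [0.2352, -0.2211, 0.3662], [-0.1556, 0.3894, 0.4578],
    [0.3239, 0.0525, 0.0754], [0.2091, -0.2445, 0.4720],
]

# "Factor loadings (pattern matrix)" from the unrotated factor, pcf output, rows a..l.
pcf_loadings = [
    [0.5314, 0.4627, -0.1603], [0.6490, 0.2732, -0.4373],
    [-0.5994, 0.3735, 0.1926], [0.6866, 0.2265, 0.0760],
    [0.6576, 0.2451, 0.2954], [-0.3938, 0.5409, -0.1723],
    [0.6015, 0.3710, 0.1663], [-0.7107, 0.2586, 0.3163],
    [0.4629, -0.2622, 0.3977], [-0.3062, 0.4619, 0.4971],
    [0.6373, 0.0623, 0.0818], [0.4116, -0.2900, 0.5125],
]

def column_ss(matrix):
    """Sum of squared entries in each column."""
    return [sum(row[j] ** 2 for row in matrix) for j in range(len(matrix[0]))]

# Multiply each eigenvector column by sqrt(eigenvalue) to get loadings.
rescaled = [[v * math.sqrt(ev) for v, ev in zip(row, eigenvalues)]
            for row in eigenvectors]

# Largest absolute difference between the rescaled eigenvectors and the pcf loadings.
max_gap = max(abs(r - p)
              for r_row, p_row in zip(rescaled, pcf_loadings)
              for r, p in zip(r_row, p_row))

print("pca column sums of squares:", [round(s, 3) for s in column_ss(eigenvectors)])
print("pcf column sums of squares:", [round(s, 3) for s in column_ss(pcf_loadings)])
print("largest gap after rescaling:", round(max_gap, 4))
```

On the posted figures, each pca column has a sum of squares of about 1, and rescaling by the square root of the eigenvalue reproduces the pcf loadings (and hence the SPSS component matrix) to within rounding. So before rotation the two outputs carry the same information on different scales.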
The only mention of this (that I have found) in the Stata documentation is in the estat loadings section of the help, where it says:
However, this appears to apply only to the unrotated component matrix, not the rotated component matrix. I can't get the unnormalized rotated matrix after PCA.
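As far as I can tell, the rotated results cannot be reconciled by rescaling afterwards because the two Stata runs in the question end up with different rotation matrices: varimax was applied to differently scaled input. The sketch below (plain Python; the helper `rotate` and the dictionary layout are my own, for illustration only) checks for two variables that each reported rotated row is the unrotated row times that run's own rotation matrix, and then compares the two rotated solutions:

```python
def rotate(row, rotation):
    """Multiply a 1x3 row of loadings by a 3x3 rotation matrix."""
    return [sum(row[k] * rotation[k][j] for k in range(3)) for j in range(3)]

# Rotation matrices as printed after each rotate command in the question.
rotation = {
    "pca": [[0.7942, -0.5573, 0.2422], [0.5724, 0.5523, -0.6061], [0.2040, 0.6200, 0.7576]],
    "pcf": [[0.7650, -0.4761, 0.4336], [0.6440, 0.5672, -0.5134], [-0.0016, 0.6720, 0.7406]],
}

# Unrotated and reported rotated rows for two of the twelve variables.
unrotated = {
    "pca": {"bewert_sfu_a": [0.2700, 0.3901, -0.1477], "bewert_sfu_l": [0.2091, -0.2445, 0.4720]},
    "pcf": {"bewert_sfu_a": [0.5314, 0.4627, -0.1603], "bewert_sfu_l": [0.4116, -0.2900, 0.5125]},
}
reported = {
    "pca": {"bewert_sfu_a": [0.4076, -0.0266, -0.2829], "bewert_sfu_l": [0.1224, 0.0410, 0.5564]},
    "pcf": {"bewert_sfu_a": [0.7047, -0.0983, -0.1258], "bewert_sfu_l": [0.1273, -0.0160, 0.7069]},
}

computed = {
    run: {var: rotate(row, rotation[run]) for var, row in rows.items()}
    for run, rows in unrotated.items()
}
for run in computed:
    for var, row in computed[run].items():
        print(run, var, [round(x, 4) for x in row], "reported:", reported[run][var])

# If the two rotated solutions differed only by a column scale factor,
# this ratio would be the same for every variable in a given column.
ratio_a = reported["pcf"]["bewert_sfu_a"][0] / reported["pca"]["bewert_sfu_a"][0]
ratio_l = reported["pcf"]["bewert_sfu_l"][0] / reported["pca"]["bewert_sfu_l"][0]
print("column 1 ratio pcf/pca:", round(ratio_a, 3), "vs", round(ratio_l, 3))
```

The column 1 ratio is clearly not constant across the two variables, which is consistent with the rotated pca output not being a rescaled version of the SPSS result.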
The people at Stata seem to know what they are doing, and usually have a good reason for doing things the way that they do. This one is beyond me though.
(For future reference, it would have made my life easier if you'd used a dataset that I could access, and if you'd included all output, without blanks).
Edit: My usual go-to site for information about how to get the same results for different programs is the UCLA IDRE. They don't cover PCA in Stata: http://www.ats.ucla.edu/stat/AnnotatedOutput/ I have to wonder if that's because they couldn't get the same result. :)