Solved – Why would the results of PCA differ from a confirmatory factor analysis

factor-analysis, pca

I have conducted a confirmatory factor analysis (CFA) to test the fit of a model with 5 factors and 5 items per factor. I used the modification indices to alter the model until I obtained statistics indicating an acceptable fit of the model to my data.

As others have done in my area, I then conducted a principal components analysis (PCA) with a varimax rotation to check that items loaded on the factors identified by the CFA; however, not all of them did.

  • Is it unusual for the results of a PCA to be inconsistent with a confirmatory factor analysis?
  • Could this be because I have a small sample size ($n=96$)?
  • As the CFA model has 'acceptable fit', should I simply work with the scale structures from those results and not do the PCA?

Best Answer

What is the rationale of applying an exploratory/unsupervised method (PCA or FA with VARIMAX rotation) after having tested a confirmatory model, especially if this is done on the same sample?

In your CFA model, you impose constraints on your pattern matrix, e.g. some items are supposed to load on one factor but not on the others. A large modification index indicates that freeing a parameter or removing an equality constraint could result in better model fit. Item loadings are already available through your model fit.

By contrast, in PCA or exploratory FA there is no such constraint, even after an orthogonal rotation (whose purpose is just to make the factors more interpretable, in that each item will generally tend to load heavily on a single factor rather than on several). It is also worth noting that these models are conceptually and mathematically different: the FA model is a measurement model, in which we assume that some unique error is attached to each item; this is not the case under the PCA framework. It is thus not surprising that you failed to replicate your factor structure. This may indicate item cross-loadings, low item reliability, low stability of your factor structure, or the existence of a higher-order factor structure, any of which would be exacerbated by your small sample size.
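To make the difference concrete, the two models can be written side by side. This is a standard textbook sketch; the symbols ($\Lambda$, $\Phi$, $\Psi$, $W_k$, $D_k$) are notation introduced here for illustration and do not come from the question. For $p$ items and $k$ factors, the common factor model is

$$x = \Lambda f + \varepsilon, \qquad \operatorname{Cov}(x) = \Lambda \Phi \Lambda^\top + \Psi,$$

where $\Lambda$ is the $p \times k$ pattern matrix, $\Phi$ the factor covariance matrix, and $\Psi$ a diagonal matrix of unique (error) variances. In a CFA, selected entries of $\Lambda$ are fixed to zero; for instance, with 5 items per factor, the rows for items 1 to 5 would have a free loading on factor 1 only. PCA instead works with the eigendecomposition of the covariance (or correlation) matrix and keeps the first $k$ terms:

$$\operatorname{Cov}(x) \approx W_k D_k W_k^\top,$$

with $W_k$ the leading eigenvectors and $D_k$ the corresponding eigenvalues. There is no $\Psi$ term, so unique variance is absorbed into the components, and no entry of the loading matrix $W_k D_k^{1/2}$ is constrained to zero, with or without rotation.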

In both cases, but especially for CFA, $N=96$ is a very limited sample size. Although some authors have suggested an individuals:items ratio of 5 to 10, it is rather the number of dimensions that matters. In your case, the parameter estimates will be noisy, and in the case of PCA you may expect fluctuations in your estimated loadings (just try the bootstrap to get an idea of 95% CIs).
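As a rough illustration of the bootstrap suggestion, here is a minimal sketch in Python using only NumPy. Several things are assumptions made for the example and not part of the original analysis: the data are simulated (96 respondents, 5 factors, 5 items each, all true loadings set to 0.7), the loadings are unrotated PCA loadings of the correlation matrix, and each bootstrap solution is aligned to the full-sample solution with an orthogonal Procrustes rotation to deal with sign flips and component reordering. The function names are hypothetical.

```python
import numpy as np

def pca_loadings(X, k):
    """Unrotated loadings of the first k principal components (correlation matrix)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    return eigvecs[:, idx] * np.sqrt(eigvals[idx])

def procrustes_align(L, target):
    """Rotate L orthogonally so that it is as close as possible to target."""
    U, _, Vt = np.linalg.svd(L.T @ target)
    return L @ (U @ Vt)

def bootstrap_loading_ci(X, k, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap intervals for PCA loadings, resampling rows of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    ref = pca_loadings(X, k)
    boot = np.empty((n_boot,) + ref.shape)
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)       # resample respondents with replacement
        boot[b] = procrustes_align(pca_loadings(X[rows], k), ref)
    alpha = (1 - level) / 2
    lo, hi = np.quantile(boot, [alpha, 1 - alpha], axis=0)
    return ref, lo, hi

# Simulated data (an assumption for illustration, not the poster's data)
rng = np.random.default_rng(42)
n, n_factors, items_per_factor, loading = 96, 5, 5, 0.7
pattern = np.kron(np.eye(n_factors), np.full((items_per_factor, 1), loading))  # 25 x 5
factors = rng.standard_normal((n, n_factors))
noise = rng.standard_normal((n, n_factors * items_per_factor)) * np.sqrt(1 - loading**2)
X = factors @ pattern.T + noise

ref, lo, hi = bootstrap_loading_ci(X, k=5, n_boot=500)
print("mean width of the 95% intervals:", np.round((hi - lo).mean(), 2))
```

With real data, `X` would be the respondents-by-items matrix. If a varimax solution is what is being reported, the same resampling loop applies, with the rotation performed before the alignment step.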