Factor Analysis – How to Correctly Interpret Parallel Analysis in Exploratory Factor Analysis?

factor-analysis, parallel-analysis, psychometrics

Some scientific papers report results of parallel analysis for principal axis factor analysis in a way that is inconsistent with my understanding of the methodology. What am I missing? Am I wrong, or are they?

Example:

  • Data: The performance of 200 individual humans has been observed on 10 tasks. For each individual and each task, one has a performance score. The question now is to determine how many factors are the cause for the performance on the 10 tasks.
  • Method: parallel analysis to determine the number of factors to retain in a principal axis factor analysis.
  • Example for reported result: “parallel analysis suggests that only factors with eigenvalue of 2.21 or more should be retained”

That is nonsense, isn’t it?

From the original paper by Horn (1965) and tutorials like Hayton et al. (2004) I understand that parallel analysis is an adaptation of the Kaiser criterion (eigenvalue > 1) based on random data. However, the adaptation is not to replace the cut-off of 1 by another fixed number, but by an individual cut-off value for each factor (dependent on the size of the data set, i.e. 200 × 10 scores here). Looking at the examples in Horn (1965) and Hayton et al. (2004) and at the output of the R functions fa.parallel in the psych package and parallel in the nFactors package, I see that parallel analysis produces a downward-sloping curve in the scree plot, which is compared to the eigenvalues of the real data. The criterion is more like “Retain the first factor if its eigenvalue is > 2.21; additionally retain the second if its eigenvalue is > 1.65; …”.
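What those R functions compute can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions (random standard-normal data, Horn's original mean-eigenvalue criterion rather than a percentile, and the 200 × 10 shape from the example; the function name is my own, not from any package):

```python
import numpy as np

def parallel_analysis_thresholds(n=200, p=10, n_sims=1000, seed=0):
    """Mean eigenvalues of correlation matrices of random normal data,
    sorted from largest to smallest: one threshold per factor."""
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        x = rng.standard_normal((n, p))          # uncorrelated data, same n and p
        corr = np.corrcoef(x, rowvar=False)      # p x p correlation matrix
        eigs[i] = np.sort(np.linalg.eigvalsh(corr))[::-1]
    return eigs.mean(axis=0)

thresholds = parallel_analysis_thresholds()
# Retain factor q only if the q-th observed eigenvalue exceeds thresholds[q]:
# the thresholds form exactly the downward-sloping curve in the scree plot.
```

The result is a decreasing sequence starting above 1 and ending below 1, not a single fixed cut-off, which is the heart of the question above.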

Is there any sensible setting, any school of thought, or any methodology that would render “parallel analysis suggests that only factors with eigenvalue of 2.21 or more should be retained” correct?

References:

Hayton, J.C., Allen, D.G., Scarpello, V. (2004). Factor retention decisions in exploratory factor analysis: a tutorial on parallel analysis. Organizational Research Methods, 7(2):191-205.

Horn, J.L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2):179-185.

Best Answer

There are two equivalent ways to express the parallel analysis criterion. But first I need to take care of a misunderstanding prevalent in the literature.

The Misunderstanding
Under the so-called Kaiser rule (Kaiser did not actually like the rule, if you read his 1960 paper), eigenvalues greater than one are retained in principal component analysis. Under the same rule applied to principal factor analysis/common factor analysis, eigenvalues greater than zero are retained. This confusion has arisen over the years because several authors have been sloppy about using the label "factor analysis" to describe "principal component analysis," when they are not the same thing.

See Gently Clarifying the Application of Horn’s Parallel Analysis to Principal Component Analysis Versus Factor Analysis for the math of it if you need convincing on this point.

Parallel Analysis Retention Criteria
For principal component analysis based on the correlation matrix of $p$ variables, you have several quantities. First, you have the observed eigenvalues from an eigendecomposition of the correlation matrix of your data, $\lambda_{1}, \dots, \lambda_{p}$. Second, you have the mean eigenvalues from eigendecompositions of the correlation matrices of "a large number" of random (uncorrelated) data sets with the same $n$ and $p$ as your own, $\bar{\lambda}^{\text{r}}_{1},\dots,\bar{\lambda}^{\text{r}}_{p}$.

Horn also frames his examples in terms of "sampling bias" and estimates this bias for the $q^{\text{th}}$ eigenvalue (for principal component analysis) as $\varepsilon_{q} = \bar{\lambda}^{\text{r}}_{q} - 1$. This bias can then be used to adjust the observed eigenvalues: $\lambda^{\text{adj}}_{q} = \lambda_{q} - \varepsilon_{q}$.

Given these quantities you can express the retention criterion for the $q^{\text{th}}$ observed eigenvalue of a principal component parallel analysis in two mathematically equivalent ways:

$\lambda^{\text{adj}}_{q} \left\{\begin{array}{cc} > 1 & \text{Retain.} \\\\ \le 1 & \text{Not retain.} \end{array}\right.$

$\lambda_{q} \left\{\begin{array}{cc} > \bar{\lambda}^{\text{r}}_{q} & \text{Retain.} \\\\ \le \bar{\lambda}^{\text{r}}_{q} & \text{Not retain.} \end{array}\right.$
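The equivalence of the two forms is simple algebra: $\lambda^{\text{adj}}_{q} > 1 \iff \lambda_{q} - (\bar{\lambda}^{\text{r}}_{q} - 1) > 1 \iff \lambda_{q} > \bar{\lambda}^{\text{r}}_{q}$. A tiny numerical illustration (the eigenvalues here are made up purely for demonstration):

```python
import numpy as np

# Illustrative (made-up) observed and mean random-data eigenvalues
lam = np.array([2.50, 1.40, 0.90, 0.70])
lam_r = np.array([1.30, 1.10, 0.95, 0.80])

bias = lam_r - 1           # Horn's sampling bias for PCA
lam_adj = lam - bias       # bias-adjusted observed eigenvalues

crit1 = lam_adj > 1        # first form of the retention criterion
crit2 = lam > lam_r        # second form of the retention criterion
# The two forms agree component by component.
```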

What about for principal factor analysis/common factor analysis? Here we have to bear in mind that the bias is the corresponding mean eigenvalue: $\varepsilon_{q} = \bar{\lambda}^{\text{r}}_{q} - 0 = \bar{\lambda}^{\text{r}}_{q}$ (minus zero because the Kaiser rule for eigendecomposition of the correlation matrix with the diagonal replaced by the communalities is to retain eigenvalues greater than zero). Therefore here $\lambda^{\text{adj}}_{q} = \lambda_{q} - \bar{\lambda}^{\text{r}}_{q}$.

Therefore the retention criteria for principal factor analysis/common factor analysis ought to be expressed as:

$\lambda^{\text{adj}}_{q} \left\{\begin{array}{cc} > 0 & \text{Retain.} \\\\ \le 0 & \text{Not retain.} \end{array}\right.$

$\lambda_{q} \left\{\begin{array}{cc} > \bar{\lambda}^{\text{r}}_{q} & \text{Retain.} \\\\ \le \bar{\lambda}^{\text{r}}_{q} & \text{Not retain.} \end{array}\right.$

Notice that the second form of the retention criterion is the same for both principal component analysis and common factor analysis: the definition of $\lambda^{\text{adj}}_{q}$ changes depending on whether one is analyzing components or factors, but the second form is not expressed in terms of $\lambda^{\text{adj}}_{q}$ at all.
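For the common-factor case, the random-data eigenvalues come from the reduced correlation matrix, i.e. the correlation matrix with its diagonal replaced by communality estimates. A hedged NumPy sketch, assuming squared multiple correlations (SMC) as the communality estimate (a common but not universal choice; the function names are my own):

```python
import numpy as np

def reduced_corr_eigs(x):
    """Eigenvalues (descending) of the correlation matrix with its
    diagonal replaced by squared multiple correlations (SMC)."""
    r = np.corrcoef(x, rowvar=False)
    smc = 1 - 1 / np.diag(np.linalg.inv(r))   # squared multiple correlations
    r_reduced = r.copy()
    np.fill_diagonal(r_reduced, smc)
    return np.sort(np.linalg.eigvalsh(r_reduced))[::-1]

def fa_parallel_thresholds(n, p, n_sims=500, seed=0):
    """Mean random-data eigenvalues of the reduced correlation matrix."""
    rng = np.random.default_rng(seed)
    eigs = np.array([reduced_corr_eigs(rng.standard_normal((n, p)))
                     for _ in range(n_sims)])
    return eigs.mean(axis=0)
```

Here the thresholds straddle zero rather than one, which matches the criterion above: retain factor $q$ when $\lambda_{q} > \bar{\lambda}^{\text{r}}_{q}$, equivalently when the bias-adjusted eigenvalue exceeds zero.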

One More Thing
Both principal component analysis and principal factor analysis/common factor analysis can be based on the covariance matrix rather than the correlation matrix. Because this changes the assumptions/definitions about the total and common variance, only the second forms of the retention criterion ought to be used when basing one's analysis on the covariance matrix.
