Solved – Best factor extraction methods in factor analysis

Tags: factor-analysis, pca, spss

SPSS offers several methods of factor extraction:

  1. Principal components (which isn't factor analysis at all)
  2. Unweighted least squares
  3. Generalized least squares
  4. Maximum Likelihood
  5. Principal Axis
  6. Alpha factoring
  7. Image factoring

Ignoring the first method, which isn't factor analysis (but principal component analysis, PCA), which of these methods is "the best"? What are the relative advantages of the different methods? And basically, how would I choose which one to use?

Additional question: should one obtain similar results from all 6 methods?

Best Answer

To make it short: the last two methods (alpha and image factoring) are each rather special and differ from methods 2–5. Methods 2–5 are all called common factor analysis and are indeed seen as alternatives to one another; most of the time they give rather similar results. They are "common" because they represent the classical factor model, the common factors + unique factors model. It is this model that is typically used in questionnaire analysis/validation.

Principal Axis (PAF), aka Principal Factor with iterations, is the oldest and perhaps still quite popular method. It is an iterative application of PCA$^1$ to the matrix in which communalities stand on the diagonal in place of the 1s (or of the variances). Each iteration thus refines the communalities further until they converge. In doing so, a method that seeks to explain variance, not pairwise correlations, eventually comes to explain the correlations as well. The Principal Axis method has the advantage that it can, like PCA, analyze not only correlations but also covariances and other SSCP measures (raw SSCP, cosines). The remaining three methods process only correlations [in SPSS; covariances can be analyzed in some other implementations]. This method depends on the quality of the starting estimates of the communalities (which is its disadvantage). Usually the squared multiple correlation/covariance is used as the starting value, but you may prefer other estimates (including those taken from previous research). Please read this for more. If you want to see an example of Principal Axis factoring computations, commented and compared with PCA computations, look here.
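
Below is a minimal numpy sketch of that iterative loop, assuming a correlation matrix `R` and a prechosen number of factors; the function name and defaults are mine for illustration, not SPSS's:

```python
import numpy as np

def principal_axis(R, n_factors, max_iter=100, tol=1e-6):
    """Illustrative Principal Axis Factoring: iterate PCA on the
    reduced correlation matrix until the communalities converge."""
    R = np.asarray(R, dtype=float)
    # Starting communalities: squared multiple correlations (SMC),
    # i.e. 1 - 1/diag(R^-1), the usual default starting value.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        Rr = R.copy()
        np.fill_diagonal(Rr, h2)              # reduced correlation matrix
        # Eigendecomposition, not SVD: the reduced matrix usually has
        # some negative eigenvalues, which must be dropped (see footnote 1).
        vals, vecs = np.linalg.eigh(Rr)
        top = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, top] * np.sqrt(np.clip(vals[top], 0, None))
        h2_new = (loadings ** 2).sum(axis=1)  # refined communalities
        if np.max(np.abs(h2_new - h2)) < tol:
            break
        h2 = h2_new
    return loadings, h2_new
```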

Ordinary or Unweighted least squares (ULS) is the algorithm that directly aims to minimize the residuals between the input correlation matrix and the correlation matrix reproduced by the factors (while the diagonal elements, being the sums of communality and uniqueness, are meant to restore the 1s). This is the direct task of FA$^2$. The ULS method can work with a singular and even a non-positive-semidefinite correlation matrix, provided the number of factors is less than its rank, although it is questionable whether FA is theoretically appropriate then.
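
As a sketch of that "direct task", here is the same objective written out as a direct minimization over the loadings (this is really the MINRES formulation of footnote 2; the helper name and optimizer choice are illustrative, not what SPSS does internally):

```python
import numpy as np
from scipy.optimize import minimize

def uls_factor(R, n_factors):
    """Illustrative ULS/MINRES: minimize the squared off-diagonal
    residuals between R and the factor-reproduced matrix L @ L.T."""
    p = R.shape[0]
    off = ~np.eye(p, dtype=bool)        # fit off-diagonal correlations only

    def loss(flat):
        L = flat.reshape(p, n_factors)
        resid = R - L @ L.T             # reproduced correlations: L L'
        return np.sum(resid[off] ** 2)

    # Start from the top principal components of R.
    vals, vecs = np.linalg.eigh(R)
    L0 = vecs[:, -n_factors:] * np.sqrt(np.clip(vals[-n_factors:], 0, None))
    res = minimize(loss, L0.ravel(), method="L-BFGS-B")
    L = res.x.reshape(p, n_factors)     # any orthogonal rotation of L fits equally
    return L, 1.0 - (L ** 2).sum(axis=1)   # loadings, uniquenesses
```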

Generalized or Weighted least squares (GLS) is a modification of the previous one. When minimizing the residuals, it weights the correlation coefficients differentially: correlations between variables with high uniqueness (at the current iteration) are given less weight$^3$. Use this method if you want your factors to fit highly unique variables (i.e. those weakly driven by the factors) worse than highly common variables (i.e. those strongly driven by the factors). This wish is not uncommon, especially in the questionnaire construction process (at least I think so), so this property is advantageous$^4$.
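
To make the contrast with ULS concrete, here are the standard discrepancy functions from the FA/SEM literature (not quoted from SPSS documentation), with $\hat{\bf R}$ the factor-reproduced matrix:

$$F_{ULS}=\tfrac12\,\mathrm{tr}\!\big[({\bf R}-\hat{\bf R})^2\big], \qquad F_{GLS}=\tfrac12\,\mathrm{tr}\!\big[({\bf I}-{\bf R}^{-1}\hat{\bf R})^2\big].$$

The weighting by ${\bf R}^{-1}$ in $F_{GLS}$ is what lets the residuals of highly unique variables count for less.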

Maximum Likelihood (ML) assumes the data (the correlations) came from a population having a multivariate normal distribution (the other methods make no such assumption), and hence the residuals of the correlation coefficients must be normally distributed around 0. The loadings are iteratively estimated by the ML approach under this assumption. The treatment of correlations is weighted by uniqueness in the same fashion as in the Generalized least squares method. While the other methods just analyze the sample as it is, the ML method allows some inference about the population; a number of fit indices and confidence intervals are usually computed along with it [unfortunately, mostly not in SPSS, although people have written macros for SPSS that do it]. The overall fit chi-square test asks whether the factor-reproduced correlation matrix can pretend to be the population matrix from which the observed matrix was randomly sampled.
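
For reference, the usual ML discrepancy and the Bartlett-corrected version of that chi-square test, as given in the ML factor analysis literature (Lawley–Maxwell) rather than in SPSS output, are, for $p$ variables, $k$ factors and sample size $n$:

$$F_{ML}=\ln|\hat{\bf R}|-\ln|{\bf R}|+\mathrm{tr}\big({\bf R}\hat{\bf R}^{-1}\big)-p,$$

$$\chi^2=\Big(n-1-\tfrac{2p+5}{6}-\tfrac{2k}{3}\Big)F_{ML}, \qquad df=\tfrac{(p-k)^2-(p+k)}{2}.$$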

All the methods I briefly described assume a linear, continuous latent model. "Linear" implies that rank correlations, for example, should not be analyzed. "Continuous" implies that binary data, for example, should not be analyzed (IRT or FA based on tetrachoric correlations would be more appropriate).


$^1$ Because the correlation (or covariance) matrix $\bf R$, after the initial communalities are placed on its diagonal, will usually have some negative eigenvalues, these must be kept clear of (i.e., dropped); therefore the PCA step should be done by eigendecomposition, not SVD.

$^2$ The ULS method includes an iterative eigendecomposition of the reduced correlation matrix, like PAF, but within a more complex Newton-Raphson optimization procedure that seeks the unique variances ($\bf u^2$, uniquenesses) at which the correlations are reconstructed best. In doing so, ULS turns out to be equivalent to the method called MINRES (only the extracted loadings appear somewhat orthogonally rotated relative to MINRES), which is known to minimize the sum of squared residuals of the correlations directly.

$^3$ The GLS and ML algorithms are basically like ULS, but the eigendecomposition on each iteration is performed on the matrix $\bf uR^{-1}u$ (or on $\bf u^{-1}Ru^{-1}$) to incorporate the uniquenesses as weights. ML differs from GLS in adopting knowledge of the eigenvalue trend expected under a normal distribution.
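
To see why this weighting works, note (a standard identity, not part of the original answer) that under the common factor model ${\bf R} = \bf\Lambda\Lambda' + u^2$:

$$\bf u^{-1}Ru^{-1} = (u^{-1}\Lambda)(u^{-1}\Lambda)' + I,$$

so eigendecomposing the weighted matrix recovers the uniqueness-scaled loadings, with every eigenvalue simply shifted up by 1.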

$^4$ The fact that correlations produced by less common variables are permitted to be fitted worse may (I surmise) leave some room for the presence of partial correlations (which need not be explained), which seems nice. The pure common factor model "expects" no partial correlations, which is not very realistic.
