Solved – Determining number of factors in exploratory factor analysis

degrees of freedom, factor analysis, r

I'm using R's factanal function for factor analysis. I know from reading that there are various ways of picking how many factors to use in the analysis. I don't know which to choose, or how to do any of them.

Here's the output I have so far from factanal. I don't understand what SS loadings are, or why the degrees of freedom do not equal min(#rows, #columns) – #factors.

Just judging from the Cumulative Var, which I think I understand, I would guess that 2 is the right number of factors, but am I right? And if so, how can I convince others that this is the right number of factors?

factanal(x = charges[3:8], factors = 1)

#                Factor1
# SS loadings      4.779
# Proportion Var   0.797

Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 279.13 on 9 degrees of freedom.
The p-value is 6.9e-55

factanal(x = charges[3:8], factors = 2, scores = "regression")

#                Factor1 Factor2
# SS loadings      2.817   2.544
# Proportion Var   0.470   0.424
# Cumulative Var   0.470   0.894

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 77.1 on 4 degrees of freedom.
The p-value is 7.15e-16

factanal(x = charges[3:8], factors = 3)

#                Factor1 Factor2 Factor3
# SS loadings      2.769   2.618   0.063
# Proportion Var   0.461   0.436   0.010
# Cumulative Var   0.461   0.898   0.908

The degrees of freedom for the model is 0 and the fit was 0.1047

Best Answer

There are several approaches to determining the number of factors to extract in exploratory factor analysis (EFA). Practically all of them, however, are either visual or analytical.

Visual approaches are mostly based on plotting the factors' eigenvalues against the factor number (a so-called scree plot; see this page and this page). A scree plot lets you choose the number of factors to extract by locating the point where the curve drops sharply and then levels off (the "elbow"). Note that the term scree plot also applies to principal component analysis (PCA); for a basic example, see this page.
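To make the scree plot concrete, here is a minimal base-R sketch. It assumes simulated data standing in for charges[3:8] (which is not available here): six variables driven by two latent factors. The loadings, sample size and noise level are illustrative assumptions, not values from the question.

```r
# Simulated stand-in for charges[3:8] (an assumption): 6 variables, 2 latent factors.
set.seed(1)
n <- 500
f <- matrix(rnorm(n * 2), n, 2)                      # latent factor scores
lambda <- rbind(c(0.9, 0), c(0.8, 0), c(0.7, 0),
                c(0, 0.9), c(0, 0.8), c(0, 0.7))     # illustrative loadings
x <- f %*% t(lambda) + matrix(rnorm(n * 6, sd = 0.5), n, 6)

# Eigenvalues of the correlation matrix, returned in decreasing order.
ev <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values

# Scree plot: look for the elbow where the curve flattens out.
plot(ev, type = "b", xlab = "Factor number", ylab = "Eigenvalue",
     main = "Scree plot")
abline(h = 1, lty = 2)  # reference line at eigenvalue = 1
```

With your own data, replace x with charges[3:8]; the number of points before the elbow is the visual suggestion for the number of factors.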

Analytical approaches are based on various criteria and heuristics, including the Kaiser criterion (retain factors with an eigenvalue greater than one), the variance explained criterion (the cut-off for this heuristic varies from 0.8-0.9 to as low as 0.5, depending on the researcher's specific goals), parallel analysis, the Very Simple Structure (VSS) criterion, Velicer's MAP test and other techniques (see more details here and here, as well as via the links within).
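Three of these criteria can be sketched in base R as follows. This is a rough illustration under stated assumptions rather than a definitive implementation: the data are simulated in place of charges[3:8], the 0.8 cut-off and the 95th-percentile threshold are example choices, and the parallel analysis is the simple variant that compares eigenvalues of the correlation matrix with those of random normal data of the same size.

```r
# Simulated stand-in for charges[3:8] (an assumption): 6 variables, 2 latent factors.
set.seed(42)
n <- 500
p <- 6
f <- matrix(rnorm(n * 2), n, 2)
lambda <- rbind(c(0.9, 0), c(0.8, 0), c(0.7, 0),
                c(0, 0.9), c(0, 0.8), c(0, 0.7))
x <- f %*% t(lambda) + matrix(rnorm(n * p, sd = 0.5), n, p)

ev <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values

# Kaiser criterion: count eigenvalues greater than one.
n_kaiser <- sum(ev > 1)

# Variance explained: smallest k whose cumulative share reaches the cut-off.
# Note: this is the share of total variance carried by the eigenvalues,
# which is not the same quantity as factanal's "Cumulative Var".
cum_var <- cumsum(ev) / sum(ev)
n_var80 <- which(cum_var >= 0.8)[1]

# Parallel analysis: keep leading eigenvalues that exceed the 95th percentile
# of eigenvalues obtained from uncorrelated random data of the same shape.
n_sims <- 200
random_ev <- replicate(n_sims, {
  z <- matrix(rnorm(n * p), n, p)
  eigen(cor(z), symmetric = TRUE, only.values = TRUE)$values
})                                            # p x n_sims matrix
threshold <- apply(random_ev, 1, quantile, probs = 0.95)
exceeds <- ev > threshold
n_parallel <- if (all(exceeds)) p else which(!exceeds)[1] - 1

print(c(kaiser = n_kaiser, var80 = n_var80, parallel = n_parallel))
```

If the psych package is installed, its fa.parallel() and vss() functions provide parallel analysis and the VSS criterion (vss() also reports Velicer's MAP); the sketch avoids them only to stay within base R.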

I have tried to answer your question briefly and give mainly an overview of the topic, but there are many good answers to similar or related questions on Cross Validated, which I highly recommend reviewing. For the basic arguments on PCA vs. EFA, see this discussion. For much more advanced arguments on this topic, see this and this discussion. For applying the VSS criterion in R, see this discussion. For parallel analysis, see this discussion.
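On the degrees-of-freedom and SS loadings part of the question: for the maximum-likelihood model that factanal fits, the degrees of freedom of the test are the number of distinct elements of the covariance matrix, p(p + 1)/2, minus the number of free model parameters, pk + p - k(k - 1)/2. This simplifies to ((p - k)^2 - (p + k)) / 2 for p variables and k factors, not min(#rows, #columns) - #factors. With p = 6 this gives 9, 4 and 0 for k = 1, 2, 3, which matches the output in the question. "SS loadings" is the sum of squared loadings in each factor's column, and "Proportion Var" is that sum divided by p (for example, 4.779 / 6 ≈ 0.797). A minimal sketch, where factanal_dof is a hypothetical helper name and the data are simulated because charges is not available:

```r
# Degrees of freedom of the ML factor model with p variables and k factors.
# factanal_dof is a hypothetical helper, not part of the stats package.
factanal_dof <- function(p, k) ((p - k)^2 - (p + k)) / 2

p <- 6
dof <- sapply(1:3, function(k) factanal_dof(p, k))
print(dof)  # 9 4 0, as in the factanal output for 1, 2 and 3 factors

# Simulated stand-in for charges[3:8] (an assumption): 6 variables, 2 latent factors.
set.seed(7)
n <- 500
f <- matrix(rnorm(n * 2), n, 2)
lambda <- rbind(c(0.9, 0), c(0.8, 0), c(0.7, 0),
                c(0, 0.9), c(0, 0.8), c(0, 0.7))
x <- f %*% t(lambda) + matrix(rnorm(n * p, sd = 0.5), n, p)

fit <- factanal(x, factors = 2)

# "SS loadings": column sums of the squared loadings matrix.
ss_loadings <- colSums(unclass(fit$loadings)^2)
# "Proportion Var": SS loadings divided by the number of variables.
prop_var <- ss_loadings / p
# "Cumulative Var": running total of the proportions.
cum_var <- cumsum(prop_var)

print(rbind(ss_loadings, prop_var, cum_var))
print(fit$dof)  # degrees of freedom stored on the fitted object
```

With zero degrees of freedom, as in the 3-factor fit, the model has as many free parameters as there are distinct correlations to reproduce, so there is nothing left to test and factanal prints no chi-square statistic.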
