So, I ran an EFA on 60 items. The analysis yielded 19 components with eigenvalues greater than 1. The only factors that make theoretical sense and that include more than 3 items have eigenvalues greater than 3 – can I use these first three components in my analysis, or do I have to rerun the analysis?
Solved – Exploratory factor analysis and eigenvalues
Tags: eigenvalues, factor analysis
Related Solutions
I do not have an original source, but this topic appears to be quite debated. Some argue that you can do parallel analysis using PCA eigenvalues when doing PAF/maximum-likelihood EFA, while others consider this inappropriate.
B. P. O'Connor wrote the following in his macro for parallel analysis for PCA/PAF (people.ok.ubc.ca/brioconn/nfactors/nfactors.html):
Principal components eigenvalues are often used to determine the number of common factors. This is the default in most statistical software packages, and it is the primary practice in the literature. It is also the method used by many factor analysis experts, including Cattell, who often examined principal components eigenvalues in his scree plots to determine the number of common factors. But others believe this common practice is wrong. Principal components eigenvalues are based on all of the variance in correlation matrices, including both the variance that is shared among variables and the variances that are unique to the variables. In contrast, principal axis eigenvalues are based solely on the shared variance among the variables. The two procedures are qualitatively different. Some therefore claim that the eigenvalues from one extraction method should not be used to determine the number of factors for the other extraction method. The issue remains neglected and unsettled.
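In R, you can inspect both kinds of eigenvalues side by side. A minimal sketch with the psych package, assuming a hypothetical data frame `my_items` holding the questionnaire items:

```r
# Overlay PCA-based and common-factor eigenvalues in one parallel analysis.
# `my_items` is a hypothetical data frame of item responses.
library(psych)

# fa = "both" plots component (PCA) and factor (PAF-style) eigenvalues
# against their simulated thresholds, so you can see how the two
# conventions can suggest different numbers of factors.
fa.parallel(my_items, fm = "pa", fa = "both", n.iter = 100)
```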
The two citations do not actually contradict each other, and both look correct to me. The only quibble is with the phrase "sum of squared loadings for a principal component, after rotation":
one would do better to drop the word "principal", since rotated components or factors are, to be rigorous, no longer "principal". Also (importantly!) the second citation is correct only when the "factor analysis" is actually the PCA method (as it is in SPSS by default), so that the factors are just principal components. But the table you present is not from PCA, so I wonder whether the two quotations come from the same text and whether there was a misprint.
In the extraction summary table you display, 23 variables were analyzed. The eigenvalues of their correlation matrix are shown in the left section, "Initial eigenvalues". No factors have been extracted yet. These eigenvalues correspond to the variances of principal components (i.e. PCA was performed), not of factors. The adjective "initial" means "at the initiation point of the analysis" and does not imply that there must be some "final" eigenvalues.
The (default in SPSS) Kaiser rule "eigenvalues > 1" was used to decide how many factors to extract, so four factors will be extracted. Note that the "eigenvalues > 1" rule is based on PCA's eigenvalues (i.e. the eigenvalues of the intact, input correlation matrix).
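As a concrete illustration (not your exact data), the Kaiser rule can be reproduced directly from the eigenvalues of the input correlation matrix; `my_items` is a hypothetical data frame of the 23 variables:

```r
# "Initial eigenvalues" are the eigenvalues of the intact correlation matrix.
R  <- cor(my_items, use = "pairwise.complete.obs")
ev <- eigen(R)$values

sum(ev > 1)  # Kaiser rule: number of factors to extract (4 in the table)
ev[ev > 1]   # the values listed under "Initial eigenvalues"
```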
The factors were then extracted by the principal-axis method, and the matrix of loadings was obtained. The sums of squared loadings down the matrix columns are the factors' variances after extraction. These values appear in the middle section of your table.
These numbers should not, in general, be called eigenvalues, because factor extractions are not necessarily based directly on the eigendecomposition of the input data: they are specific algorithms in their own right. Even the principal-axis method, which does involve eigendecomposition, deals with the eigenvalues of an iteratively re-estimated ("trained") matrix, not of the original correlation matrix.
But if you had been doing PCA instead of FA, then the 4 numbers in the middle section would have been the first 4 eigenvalues, identical to the 4 largest ones on the left: in PCA, no fitting takes place, and the extracted "latent variables" are the PCs themselves, whose eigenvalues are their variances.
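A sketch of this contrast in R with the psych package, again assuming the hypothetical `my_items` (psych's `fa` with `fm = "pa"` plays the role of SPSS's principal-axis extraction):

```r
library(psych)

pca <- principal(my_items, nfactors = 4, rotate = "none")      # PCA extraction
paf <- fa(my_items, nfactors = 4, fm = "pa", rotate = "none")  # principal axis

eigen(cor(my_items))$values[1:4]  # 4 largest eigenvalues of the input matrix
colSums(unclass(pca$loadings)^2)  # PCA: identical to those eigenvalues
colSums(unclass(paf$loadings)^2)  # PAF: smaller extraction SS loadings
```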
In the right section, the sums of squared loadings after rotation of the factors are shown: the variances of these new, rotated factors. Please read more about rotated factors (or components), especially footnote 4: they are no longer "principal", and they do not correspond one-to-one to the extracted factors. After rotation, the "2nd" factor, for example, is not "the 2nd extracted factor, rotated"; it could even have a greater variance than the "1st" one.
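To see the redistribution, compare the column SS loadings before and after an orthogonal rotation; a sketch continuing the hypothetical `my_items` example:

```r
library(psych)

paf     <- fa(my_items, nfactors = 4, fm = "pa", rotate = "none")
paf_rot <- fa(my_items, nfactors = 4, fm = "pa", rotate = "varimax")

colSums(unclass(paf$loadings)^2)      # extraction SS loadings, descending
colSums(unclass(paf_rot$loadings)^2)  # rotation SS loadings: reshuffled,
                                      # not necessarily in descending order
sum(unclass(paf_rot$loadings)^2)      # total unchanged by orthogonal rotation
```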
So,
- No, you can't speak of eigenvalues after rotation, no matter whether it is orthogonal or oblique.
- You can't even speak - or at least should avoid speaking - of eigenvalues after extraction of factors, unless those factors are principal components$^1$. (An instructive example showing confusion similar to yours, with ML factor extraction.) Variances of factors are SS loadings, not eigenvalues, in general.
- Rotated factors don't correspond one-to-one to the extracted ones.
- The % of total variation explained by the factors is 40.477% in your example, not 50.317%. The first number is smaller because FA factors explain all of the assumed common variation, which is less than the portion of total variation captured by the same number of PCs (see the sketch after the footnote). You might say in your report: "The 4-factor solution accounts for the common variance, constituting 40.5% of the total variance, while 4 principal components would account for 50.3% of the total variance".
$^1$ (Before factor rotation) the variances of the factors (pr. components) are the eigenvalues of the correlation/covariance matrix of the data if the FA is the PCA method; the variances of the factors are the eigenvalues of the reduced correlation/covariance matrix (with final communalities on the diagonal) if the FA is the PAF method of extraction; the variances of the factors do not correspond to eigenvalues of the correlation/covariance matrix in other FA methods such as ML, ULS, GLS (see). In all cases, the variances of orthogonal factors are the SS of the extracted/rotated - final - loadings.
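To make the percentage arithmetic from the last bullet concrete, here is a minimal R sketch; `my_items` is again a hypothetical data frame of the 23 analyzed variables:

```r
library(psych)

p   <- ncol(my_items)  # total variance of standardized variables equals p (23)
paf <- fa(my_items, nfactors = 4, fm = "pa", rotate = "none")
ev  <- eigen(cor(my_items, use = "pairwise.complete.obs"))$values

100 * sum(unclass(paf$loadings)^2) / p  # ~40.5%: common variance, 4 factors
100 * sum(ev[1:4]) / p                  # ~50.3%: total variance, 4 PCs
```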
Best Answer
This is not a stupid question; in fact, it's a very good question.
To quote William Revelle (author of the R package psych), who in turn quotes Kaiser: "Solving the number of factors problem is easy, I do it everyday before breakfast. But knowing the right solution is harder". The trouble with factor solutions is not that there are no criteria; it is that they often disagree. For example, parallel analysis, the VSS criterion (developed by Revelle, linked above), the scree test, and the eigenvalues-greater-than-one rule typically suggest different numbers of factors, and it is difficult to know which is "correct".
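For instance, you can run several of these criteria side by side in R; a sketch assuming a hypothetical data frame `my_items` of item responses:

```r
library(psych)
library(nFactors)

fa.parallel(my_items, fm = "pa", fa = "fa")  # parallel analysis
vss(my_items, n = 8, fm = "pa")              # VSS and MAP criteria

ev <- eigen(cor(my_items, use = "pairwise.complete.obs"))$values
nScree(x = ev)  # scree-based indices: acceleration factor, optimal
                # coordinates, Kaiser rule, parallel analysis
```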
Typically (if I have enough data), I will split the dataset into a number of smaller pieces and apply multiple criteria to each subset (the scree test, parallel analysis, etc.). I then apply the models developed on some subsets to the others, and test them using structural equation modelling (i.e. confirmatory factor analysis) to decide which performs better on new data.
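A sketch of that workflow using the lavaan package; the split, the factor names, and the item-to-factor assignments here are all hypothetical:

```r
library(lavaan)

# Hold out half the sample: explore on one half, confirm on the other.
idx   <- sample(nrow(my_items), floor(nrow(my_items) / 2))
train <- my_items[idx, ]   # run EFA and the criteria above on this half
test  <- my_items[-idx, ]  # fit the chosen structure here as a CFA

model <- "
  F1 =~ item1 + item2 + item3 + item4
  F2 =~ item5 + item6 + item7 + item8
"
fit <- cfa(model, data = test, std.lv = TRUE)
fitMeasures(fit, c("cfi", "tli", "rmsea", "srmr"))  # compare candidate models
```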
However, even after all that, the "best" model in a statistical sense is sometimes one that even I don't believe. So it's a difficult problem.
There is some work in chemometrics which looks at this problem from a cross-validation perspective; a good (if mathematically dense) review can be found here.
So, to answer your original question: yes, you can use five (or ten) factors to model your dataset, but you will have to justify your choice.
Additionally, you should ensure that you have tried both orthogonal and oblique rotations (as this may help to illuminate the structure). It is also possible that some of the items are not a good fit for the scale you are using; if you examine Cronbach's alpha for reliability, you may be able to remove some ill-fitting items. Tabachnick and Fidell note that factors with three or fewer items are typically unstable, so you may be able to eliminate some of them.
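A sketch of the reliability check with psych; the items assigned to the candidate scale are hypothetical:

```r
library(psych)

scale1 <- my_items[, c("item1", "item2", "item3", "item4")]
rel    <- alpha(scale1)

rel$total       # raw and standardized Cronbach's alpha for the scale
rel$alpha.drop  # "alpha if item deleted": flags ill-fitting items
```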
Some useful criteria, and R implementations for choosing the correct number of factors, can be found in the psych package, the nFactors package, and the bcv package (which implements the Wold and Gabriel methods for choosing the rank of a matrix, the methods reviewed in the paper I referenced above). I hope this helps.