Factor Scores – How to Calculate Factor Scores from Discrete, Ordinal Responses Using Factor Analysis

factor-analysis, ordinal-data

Is there a principled way to estimate factor scores when you have ordinal, discrete variables?

I have $n$ ordinal, discrete variables. If I assume that a continuous, normally distributed variable underlies each response, then I can calculate an $n\times n$ polychoric correlation matrix. I can then run a factor analysis on this matrix and get factor loadings for each variable.

How can I combine the factor loadings and the variables to estimate the factor scores? The typical ways to estimate scores would appear to require that I treat the ordinal data as interval.

I suppose I might need to dig deeper into the guts of polychoric correlation to figure out a link function.

Best Answer

The 'principled' approach (that is to say, the a priori defensible approach, which may not make much difference empirically) is to use a graded response model, a rather useful member of the IRT family often used for Likert-type items. The R package ltm makes this very straightforward.

You're then assuming there is an ordinal logistic regression relationship between the unobserved trait and each of your indicators. Choosing this model class allows you to take the ordinal nature of the indicators seriously and provides information about what part of the trait each item is most informative about. Like factor analysis, it gives you a standard error for the score, although FA people seem to ignore these for some reason.
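To make the scoring step concrete, here is a minimal sketch in Python of how a score and its standard error fall out of a graded response model once the item parameters are known. It is not ltm's implementation; it assumes a logistic graded response model with $P(Y_j \ge k \mid \theta) = 1/(1 + e^{-a_j(\theta - b_{jk})})$, a standard normal prior on the trait $\theta$, and an expected a posteriori (EAP) score computed on a grid. The function names and the item parameters `a` and `b` are made up for illustration; in practice they would come from a fitted model.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """P(Y = k | theta) for one item under a logistic graded response model.

    theta : (Q,) grid of trait values
    a     : item discrimination (positive)
    b     : (K-1,) increasing category thresholds
    Returns a (Q, K) array whose rows sum to 1.
    """
    theta = np.asarray(theta, dtype=float)
    b = np.asarray(b, dtype=float)
    # Cumulative probabilities P(Y >= k | theta) for k = 1..K-1
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    # Pad with P(Y >= 0) = 1 on the left and P(Y >= K) = 0 on the right
    upper = np.hstack([np.ones((theta.size, 1)), cum])
    lower = np.hstack([cum, np.zeros((theta.size, 1))])
    return upper - lower

def eap_score(responses, a, b, n_grid=121):
    """EAP score and posterior SD for one respondent, with a N(0, 1) prior.

    responses : category index (0..K-1) for each item
    """
    grid = np.linspace(-6.0, 6.0, n_grid)
    log_post = -0.5 * grid**2  # log N(0, 1) density, up to a constant
    for y, a_j, b_j in zip(responses, a, b):
        log_post += np.log(grm_category_probs(grid, a_j, b_j)[:, y])
    w = np.exp(log_post - log_post.max())  # stabilise before normalising
    w /= w.sum()
    mean = float(np.sum(w * grid))
    sd = float(np.sqrt(np.sum(w * (grid - mean) ** 2)))
    return mean, sd

# Hypothetical parameters for three 4-category items (illustration only)
a = [1.5, 2.0, 1.0]
b = [[-1.0, 0.0, 1.0], [-0.5, 0.5, 1.5], [-1.5, -0.5, 0.5]]

low = eap_score([0, 0, 0], a, b)
mid = eap_score([1, 2, 1], a, b)
high = eap_score([3, 3, 3], a, b)
print("score, SE:", low, mid, high)
```

The posterior standard deviation returned alongside each score plays the role of the standard error mentioned above: it is smaller where the items' thresholds cluster, which is the sense in which each item is informative about a particular part of the trait.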

On the other hand, choosing this model class limits your ability to do all the classic factor analysis stuff like rotating things until you like the look of them. I think this is a plus, but reasonable people disagree. If you're doing that sort of thing to find out how many 'scales' you have, you'll want to look at the Mokken procedures that try to identify scales, since the FA approach of 'fit another dimension and rotate to simple structure' won't work here.