Factor Analysis – How to Conduct Factor Analysis of Questionnaires with Likert Items

factor-analysis, likert, psychology, psychometrics, scales

I used to analyse items from a psychometric point of view, but now I am trying to analyse other types of questions, on motivation and other topics. These questions are all on Likert scales. My initial thought was to use factor analysis, because the questions are hypothesised to reflect some underlying dimensions.

  • But is factor analysis appropriate?
  • Is it necessary to validate each question regarding its dimensionality?
  • Is there a problem with performing factor analysis on Likert items?
  • Are there any good papers and methods on how to conduct factor analysis on Likert and other categorical items?

Best Answer

From what I've seen so far, FA is used for attitude items much as it is for other kinds of rating scales. The problem arising from the metric used ("are Likert scales really to be treated as numeric scales?" is a long-standing debate, but provided you check that the response distributions are roughly bell-shaped you may handle them as continuous measurements; otherwise look at non-linear FA models or optimal scaling) may also be handled by polytomous IRT models, like the Graded Response, Rating Scale, or Partial Credit Model. The latter two may be used as a rough check of whether the threshold distances, as used in Likert-type items, are a characteristic of the response format (RSM) or of the particular item (PCM).
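
If it helps to see the continuous-treatment route in practice, here is a minimal Python sketch of an exploratory FA on Likert items treated as numeric. It assumes the third-party `factor_analyzer` package and a hypothetical file `questionnaire.csv` with one column per item coded 1–5; both are my assumptions, not part of the original question or answer:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

# hypothetical data frame: one column per Likert item, responses coded 1-5
items = pd.read_csv("questionnaire.csv")

# sampling adequacy and sphericity checks before factoring
kmo_per_item, kmo_total = calculate_kmo(items)
chi2, p_value = calculate_bartlett_sphericity(items)
print(f"KMO = {kmo_total:.2f}, Bartlett chi2 = {chi2:.1f} (p = {p_value:.3g})")

# exploratory FA treating the items as continuous, with an oblique rotation
fa = FactorAnalyzer(n_factors=2, rotation="oblimin", method="minres")
fa.fit(items)
print(pd.DataFrame(fa.loadings_, index=items.columns))
print(fa.get_factor_variance())  # variance, proportion, cumulative per factor
```

In practice you would also inspect a scree plot or parallel analysis before fixing the number of factors; the `n_factors=2` above is arbitrary.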

Regarding your second point, it is known, for example, that response distributions in attitude or health surveys differ from one country to another (e.g. Chinese respondents tend to show 'extreme' response patterns compared to respondents from Western countries, see e.g. Song, X.-Y. (2007) Analysis of multisample structural equation models with applications to Quality of Life data, in Handbook of Latent Variable and Related Models, Lee, S.-Y. (Ed.), pp 279-302, North-Holland). Some methods to handle such a situation, off the top of my head:

  • use of log-linear models (marginal approach) to highlight strong between-group imbalance at the item level (coefficients are then interpreted as relative risks instead of odds; a toy sketch follows this list);
  • the multi-sample SEM method from Song cited above (I don't know whether further work has been done on that approach, though).
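
To illustrate the first bullet, here is a toy log-linear sketch in Python using statsmodels. It only compares an independence model with a saturated one on a made-up group × category table for a single item, so it is a crude stand-in for the marginal approach mentioned above; all data and column names are invented:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# hypothetical counts for one item: how often each group used each response category
counts = pd.DataFrame({
    "group":    ["east"] * 5 + ["west"] * 5,
    "category": list(range(1, 6)) * 2,
    "n":        [40, 55, 80, 120, 205, 60, 90, 160, 120, 70],
})

# independence model vs. saturated model: the group:category terms capture
# between-group imbalance in category use for this item
independence = smf.glm("n ~ C(group) + C(category)", data=counts,
                       family=sm.families.Poisson()).fit()
saturated = smf.glm("n ~ C(group) * C(category)", data=counts,
                    family=sm.families.Poisson()).fit()

# likelihood-ratio test of the interaction (i.e. of between-group imbalance)
lr = 2 * (saturated.llf - independence.llf)
print(f"LR = {lr:.1f} on {int(independence.df_resid - saturated.df_resid)} df")

# exponentiated interaction coefficients read as ratios of expected counts
print(np.exp(saturated.params.filter(like=":")).round(2))
```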

Now, the point is that most of these approaches focus on the item level (ceiling/floor effects, decreased reliability, poor item fit statistics, etc.), but when one is interested in how people deviate from what would be expected of an ideal set of observers/respondents, I think we should focus on person-fit indices instead.

Such $\chi^2$ statistics are readily available for IRT models, like the INFIT and OUTFIT mean squares, but generally they apply to the whole questionnaire. Moreover, since the estimation of item parameters relies in part on person parameters (e.g., in the marginal likelihood framework, we assume a Gaussian distribution for them), the presence of outlying individuals may lead to biased estimates and poor model fit.
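
For concreteness, this is how the INFIT and OUTFIT person mean squares are commonly computed for a dichotomous Rasch model. The parameters and responses below are simulated for illustration only; for Likert items the expectations and variances would come from a polytomous model (GRM, RSM, PCM) instead:

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up parameters: 200 persons, 10 dichotomous items
theta = rng.normal(0, 1, size=(200, 1))    # person abilities
b = np.linspace(-2, 2, 10)                 # item difficulties
p = 1 / (1 + np.exp(-(theta - b)))         # Rasch success probabilities
x = rng.binomial(1, p)                     # simulated 0/1 responses

w = p * (1 - p)                            # response variances
z2 = (x - p) ** 2 / w                      # squared standardized residuals

# OUTFIT: unweighted mean square over items, sensitive to outlying responses
outfit = z2.mean(axis=1)
# INFIT: information-weighted mean square, sensitive to unexpected responses
# close to the person's own level
infit = ((x - p) ** 2).sum(axis=1) / w.sum(axis=1)

# flag persons whose fit departs markedly from the expected value of 1
flagged = np.where((outfit > 2) | (infit > 1.5))[0]
print(f"{flagged.size} persons flagged out of {theta.shape[0]}")
```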

As proposed by Eid and Zickar (2007), combining a latent class model (to isolate groups of respondents, e.g. those always answering in the extreme categories vs. the others) with an IRT model (to estimate item parameters and person locations on the latent trait within each group) appears to be a nice solution. Other modelling strategies are described in their paper (e.g. the HYBRID model; see also Holden and Book, 2009).
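
The papers above fit proper mixed Rasch / HYBRID models, usually with dedicated software. As a very rough illustration of the underlying idea (split respondents by response style before calibrating items within each class), here is a crude Python proxy that clusters respondents on simple response-style indicators; it uses invented data and a plain Gaussian mixture, and is emphatically not the Eid and Zickar method:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# hypothetical 1-5 Likert responses: 300 'ordinary' respondents plus 60 who
# mostly pick the extreme categories
ordinary = rng.choice([1, 2, 3, 4, 5], p=[.1, .2, .4, .2, .1], size=(300, 12))
extreme = rng.choice([1, 2, 3, 4, 5], p=[.45, .03, .04, .03, .45], size=(60, 12))
responses = np.vstack([ordinary, extreme])

# per-respondent share of extreme-category and midpoint use
features = np.column_stack([
    np.isin(responses, [1, 5]).mean(axis=1),   # extreme-category use
    (responses == 3).mean(axis=1),             # midpoint use
])

# two-component mixture as a stand-in for the latent classes
gm = GaussianMixture(n_components=2, random_state=0).fit(features)
labels = gm.predict(features)
for k in range(2):
    print(f"class {k}: n = {(labels == k).sum()}, "
          f"mean extreme share = {features[labels == k, 0].mean():.2f}")
```

Item parameters would then be estimated separately within each class, which is what the mixture IRT approaches do jointly rather than in two steps.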

Likewise, unfolding models may be used to cope with response style, which is defined as a consistent, content-independent pattern of response-category use (e.g. a tendency to agree with all statements, or to use only the extreme categories). In the social sciences and psychological literature, the latter is known as Extreme Response Style (ERS). References (1–3) below may be useful to get an idea of how it manifests and how it may be measured.

Here is a short list of papers that may help you make progress on this subject:

  1. Hamilton, D.L. (1968). Personality attributes associated with extreme response style. Psychological Bulletin, 69(3): 192–203.
  2. Greenleaf, E.A. (1992). Measuring extreme response style. Public Opinion Quarterly, 56(3): 328–351.
  3. de Jong, M.G., Steenkamp, J.-B.E.M., Fox, J.-P., and Baumgartner, H. (2008). Using Item Response Theory to Measure Extreme Response Style in Marketing Research: A Global Investigation. Journal of Marketing Research, 45(1): 104–115.
  4. Morren, M., Gelissen, J., and Vermunt, J.K. (2009). Dealing with extreme response style in cross-cultural research: A restricted latent class factor analysis approach.
  5. Moors, G. (2003). Diagnosing Response Style Behavior by Means of a Latent-Class Factor Approach. Socio-Demographic Correlates of Gender Role Attitudes and Perceptions of Ethnic Discrimination Reexamined. Quality & Quantity, 37(3): 277–302.
  6. Javaras, K.N. and Ripley, B.D. (2007). An “Unfolding” Latent Variable Model for Likert Attitude Data. JASA, 102(478): 454–463.
  7. Moustaki, Knott, and Mavridis, Methods for detecting outliers in latent variable models (slides).
  8. Eid, M. and Zickar, M.J. (2007). Detecting response styles and faking in personality and organizational assessments by Mixed Rasch Models. In von Davier, M. and Carstensen, C.H. (Eds.), Multivariate and Mixture Distribution Rasch Models, pp. 255–270, Springer.
  9. Holden, R.R. and Book, A.S. (2009). Using hybrid Rasch-latent class modeling to improve the detection of fakers on a personality inventory. Personality and Individual Differences, 47(3): 185–190.