I have data from a survey composed of several measures that used different Likert-type scalings (4-, 5-, and 6-point scales). I would like to run a principal components analysis on the data from these measures. It seems to me that I need to transform the data in some way so that all items carry equivalent weight prior to analysis. However, I am uncertain how to proceed.
Solved – Data transformation for Principal Components Analysis from different Likert scales
data-transformation, likert, pca, psychometrics, scales
Related Solutions
From what I've seen so far, FA is used for attitude items as it is for other kinds of rating scales. The problem arising from the metric used ("are Likert scales really to be treated as numeric scales?" is a long-standing debate; provided you check for a bell-shaped response distribution you may handle them as continuous measurements, otherwise look at non-linear FA models or optimal scaling) may be handled by polytomous IRT models, such as the Graded Response, Rating Scale, or Partial Credit Model. The latter two may be used as a rough check of whether the threshold distances, as used in Likert-type items, are a characteristic of the response format (RSM) or of the particular item (PCM).
Regarding your second point, it is known, for example, that response distributions in attitude or health surveys differ from one country to another (e.g., Chinese respondents tend to show more 'extreme' response patterns than respondents from Western countries; see, e.g., Song, X.-Y. (2007). Analysis of multisample structural equation models with applications to Quality of Life data. In Lee, S.-Y. (Ed.), Handbook of Latent Variable and Related Models, pp. 279-302, North-Holland). Some methods to handle such situations, off the top of my head:
- use of log-linear models (marginal approach) to highlight strong between-group imbalance at the item level (coefficients are then interpreted as relative risks instead of odds);
- the multi-sample SEM method from Song, cited above (I don't know whether that approach was developed further, though).
Now, the point is that most of these approaches focus on the item level (ceiling/floor effects, decreased reliability, bad item fit statistics, etc.), but when one is interested in how people deviate from what would be expected from an ideal set of observers/respondents, I think we must focus on person fit indices instead.
Such $\chi^2$ statistics are readily available for IRT models, like the INFIT or OUTFIT mean squares, but generally they apply to the whole questionnaire. Moreover, since estimation of item parameters relies in part on person parameters (e.g., in the marginal likelihood framework we assume a Gaussian distribution), the presence of outlying individuals may lead to potentially biased estimates and poor model fit.
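To make the person-fit idea concrete, here is a minimal Python sketch of the INFIT and OUTFIT mean squares for a dichotomous Rasch model. The formulas are the standard ones; the item difficulties, person location, and response patterns are made up for illustration:

```python
import numpy as np

def rasch_p(theta, b):
    """Probability of a correct/endorsing response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def person_fit(responses, theta, b):
    """INFIT and OUTFIT mean squares for one person; values near 1 indicate good fit."""
    p = rasch_p(theta, b)
    w = p * (1.0 - p)                    # item information (residual variance)
    resid2 = (responses - p) ** 2
    infit = resid2.sum() / w.sum()       # information-weighted mean square
    outfit = (resid2 / w).mean()         # unweighted mean square
    return infit, outfit

b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # hypothetical item difficulties
theta = 0.2                                # hypothetical person location
guttman = np.array([1, 1, 1, 0, 0])        # consistent: passes the easy items only
erratic = np.array([0, 0, 0, 1, 1])        # misfitting: fails easy, passes hard items

infit_g, outfit_g = person_fit(guttman, theta, b)
infit_e, outfit_e = person_fit(erratic, theta, b)
```

The erratic pattern yields mean squares well above 1, flagging the respondent as misfitting even though both patterns have the same raw score.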
As proposed by Eid and Zickar (2007), combining a latent class model (to isolate groups of respondents, e.g. those always answering in the extreme categories vs. the others) with an IRT model (to estimate item parameters and person locations on the latent trait in both groups) appears to be a nice solution. Other modeling strategies are described in their paper (e.g. the HYBRID model; see also Holden and Book, 2009).
Likewise, unfolding models may be used to cope with response style, which is defined as a consistent, content-independent pattern of response category use (e.g. a tendency to agree with all statements). In the social sciences and psychological literature, this is known as Extreme Response Style (ERS). The references below may be useful for getting an idea of how it manifests and how it may be measured.
Here is a short list of papers that may help to progress on this subject:
- Hamilton, D.L. (1968). Personality attributes associated with extreme response style. Psychological Bulletin, 69(3): 192–203.
- Greenleaf, E.A. (1992). Measuring extreme response style. Public Opinion Quarterly, 56(3): 328-351.
- de Jong, M.G., Steenkamp, J.-B.E.M., Fox, J.-P., and Baumgartner, H. (2008). Using Item Response Theory to Measure Extreme Response Style in Marketing Research: A Global Investigation. Journal of Marketing Research, 45(1): 104-115.
- Morren, M., Gelissen, J., and Vermunt, J.K. (2009). Dealing with extreme response style in cross-cultural research: A restricted latent class factor analysis approach
- Moors, G. (2003). Diagnosing Response Style Behavior by Means of a Latent-Class Factor Approach. Socio-Demographic Correlates of Gender Role Attitudes and Perceptions of Ethnic Discrimination Reexamined. Quality & Quantity, 37(3), 277-302.
- Javaras, K.N. and Ripley, B.D. (2007). An “Unfolding” Latent Variable Model for Likert Attitude Data. JASA, 102(478): 454-463.
- slides from Moustaki, Knott and Mavridis, Methods for detecting outliers in latent variable models
- Eid, M. and Zickar, M.J. (2007). Detecting response styles and faking in personality and organizational assessments by Mixed Rasch Models. In von Davier, M. and Carstensen, C.H. (Eds.), Multivariate and Mixture Distribution Rasch Models, pp. 255–270, Springer.
- Holden, R.R. and Book, A.S. (2009). Using hybrid Rasch-latent class modeling to improve the detection of fakers on a personality inventory. Personality and Individual Differences, 47(3): 185-190.
I'm not sure about the first part of your question. But regarding the second bit: a reliability analysis by itself does not tell you whether you have one underlying construct or several. You can have a high Cronbach's alpha (for reliability) in the presence of two or more factors. Definitely do the factor analysis as well as the reliability analysis. You might also want to check out the latent variable and item response theory literature. Some of these models are set up to handle dichotomous and polytomous outcomes, which might deal with the z-score problem as well.
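To illustrate that point numerically, here is a Python sketch with simulated data (the loadings and sample size are arbitrary choices for the demonstration): ten items split evenly across two independent latent factors still produce a high Cronbach's alpha.

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

rng = np.random.default_rng(42)
n = 2000
f1 = rng.normal(size=n)                 # first latent factor
f2 = rng.normal(size=n)                 # second factor, independent of the first
# Five items loading on each factor, plus item-specific noise:
items = np.column_stack([
    f1[:, None] + 0.7 * rng.normal(size=(n, 5)),
    f2[:, None] + 0.7 * rng.normal(size=(n, 5)),
])

alpha = cronbach_alpha(items)           # comes out around 0.8
```

Despite alpha near 0.8, the two five-item subscale totals are essentially uncorrelated, so the scale is clearly two-dimensional; a factor analysis would reveal this, alpha alone would not.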
Best Answer
As suggested by @whuber, you can "abstract away" the scale effect by working with a standardized version of your data. If you're willing to assume that each of your items is supported by an interval scale (i.e. that the distance between any two response categories has the same meaning for every respondent), then linear correlations are fine. But you can also compute polychoric correlations to better account for the discretization of a latent variable (see the R package polycor). Of note, it's a considerably more computationally intensive job, but it works quite well in R.
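As a minimal illustration of the standardization route (in Python rather than R; the Likert scale points and the discretization rule are made up for the simulation): z-scoring each item makes PCA on the covariance matrix identical to PCA on the correlation matrix, so the differing numbers of response categories no longer drive the component weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
trait = rng.normal(size=n)              # one latent trait driving all items

def to_likert(x, k):
    """Discretize a continuous variable onto a 1..k Likert scale (illustrative rule)."""
    z = (x - x.mean()) / x.std()
    return np.clip(np.round(z * 1.2 + (k + 1) / 2), 1, k)

# Hypothetical items on 4-, 5-, 6-, and 5-point scales:
data = np.column_stack([to_likert(trait + 0.5 * rng.normal(size=n), k)
                        for k in (4, 5, 6, 5)])

# Standardize each item, then do PCA via the correlation matrix:
z = (data - data.mean(axis=0)) / data.std(axis=0)
corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]       # sort components by explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = z @ eigvecs                    # principal component scores
explained = eigvals / eigvals.sum()     # proportion of variance per component
```

With one latent trait behind all four items, the first component dominates and its scores track the trait closely, regardless of each item's number of categories.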
Another possibility is to incorporate optimal scaling into your PCA, as implemented in the homals package. The idea is to find a suitable non-linear transformation of each scale; this is very nicely described by Jan de Leeuw in the accompanying vignette and the JSS article, Gifi Methods for Optimal Scaling in R: The Package homals. Several examples are included.
For a more thorough understanding of this approach with any factorial method, see the work of Yoshio Takane in the 1980s.
Similar points were raised by @Jeromy and @mbq in the related questions Does it ever make sense to treat categorical data as continuous? and How can I use optimal scaling to scale an ordinal categorical variable?