Solved – the best way of weighting cardinal scores and Likert scale scores to create a composite score

composite, count-data, likert

I can't thank the experts enough for their clarifications. One final question following my earlier posts on forming composite variables here, here, and here.

If I have measured some variables as cardinal numbers (e.g. 1, 2, 3, 4, 5 times) and others on a Likert scale, can I then just add the scores together?

For example:

  1. How many times do you visit the local hospital per month? Answer: 5 times.
  2. How would you rate the medical service provided to you? Answer: 3 on a five-point scale going from poor to excellent.

Can I add 5 and 3 and say the composite score is 8?

What if the Likert scale goes from negative to positive (e.g. strongly disagree = -2, disagree = -1, neither agree nor disagree = 0, agree = 1 and strongly agree = 2)?

Would the same rule as above apply if the score is, say, 0 or -2?

My objective is to keep the aggregation process simple and not get 'confused' by complex statistical formulas.

Best Answer

Combining Likert items with different numeric scalings

  • Taking the sum or the mean of a set of items is standard practice in the behavioural and social sciences where each item is measured on the same response scale (e.g., a 1 to 5 Likert scale).
  • If you add or subtract a constant to the scaling of an item, this will not alter the scale from a correlational perspective. For example, if item 1 was scored 1, 2, 3, 4, 5 and item 2 was scored -2, -1, 0, 1, 2, you could combine these two items to form a scale, and this version of the scale would be perfectly correlated with a version where you rescaled item 2 to have the same scaling as item 1 (see the sketch after this list).
  • That said, there are good reasons to use a consistent numeric scaling for all items. In particular, if the composite is the mean of a set of items on a consistent response scale (e.g., 1 to 5), then the mean for a sample provides a sense of where the sample tends to lie on the underlying response scale (e.g., a mean of 4.5 on a 5 point job satisfaction scale suggests that the sample is highly satisfied).
  • The sum and the mean will be perfectly correlated with each other. From an interpretation perspective, I prefer the mean; from the perspective of manually interpreting norm tables and avoiding decimal and rounding issues, the sum is sometimes preferable.
  • All the above advice is predicated on the idea that the items should be combined in the first place. See my answer to your previous question for a discussion of the broader issue of validity, and how to assess whether it is appropriate to combine items.
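To make the correlational point concrete, here is a minimal Python sketch (the data are simulated and the variable names item1 and item2 are hypothetical, purely for illustration). It shows that shifting one item's scaling by a constant leaves the composite perfectly correlated with a rescaled version, and that the sum and the mean of the same items are likewise perfectly correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated responses: item1 scored 1..5, item2 scored -2..+2
item1 = rng.integers(1, 6, size=n)
item2 = rng.integers(-2, 3, size=n)

# Composite using the mixed scalings vs. a composite where item2 is
# shifted by +3 so that both items run from 1 to 5
composite_mixed = item1 + item2
composite_rescaled = item1 + (item2 + 3)

# Adding a constant to one item only shifts the composite by a constant,
# so the two versions are perfectly correlated
print(np.corrcoef(composite_mixed, composite_rescaled)[0, 1])  # 1.0

# The sum and the mean of the same set of items are also perfectly correlated
item_sum = item1 + (item2 + 3)
item_mean = item_sum / 2
print(np.corrcoef(item_sum, item_mean)[0, 1])  # 1.0
```

Note that the mean version (item_mean) also stays on the original 1-to-5 metric, which is the interpretive advantage mentioned above.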

Combining count variables with Likert items

  • Count variables typically have no upper limit. Thus, if you were combining a count variable with a Likert item, there is the risk that the count variable could have much greater variance, and thus importance, than the Likert item if, for example, the counts were sometimes large (e.g., 20 hospital visits in the last month).
  • There are several options for how you could scale a count variable to enable you to combine it with Likert items (two of these are sketched after this list).
    • In general, when mapping counts onto a psychological conception of frequency, I find it is better to take the log of the counts (or log(counts + 1)) or some similar transformation that reduces the positive skew of the distribution.
    • One simple way of scaling the count to be comparable to a 5 point Likert scale would be to devise five categories (e.g., 1 = never, 2 = occasionally, ..., 5 = very often, or some such) and ask subject matter experts to assign cut-offs for each category (e.g., 0 visits is 1, 1 visit is 2, 2 to 4 visits is 3, 5 to 6 visits is 4, and 7+ visits is 5). Given that you want a simple process, this might be appealing.
    • You could apply factor analytic procedures that include both the count variable and the Likert items to determine weights. If you do this, I'd use log(count + 1) or something similar instead of the raw count in your factor analysis.
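As an illustration of the two simpler options, here is a minimal Python sketch. The counts are simulated, the variable names are hypothetical, and the cut-offs are only the illustrative ones from the bullet above; subject matter experts would need to set them for a real study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated monthly hospital-visit counts (positively skewed, no upper limit)
visits = rng.poisson(lam=2.5, size=200)

# Option 1: reduce the positive skew with a log(count + 1) transformation
# (this is also what you would feed into a factor analysis rather than raw counts)
visits_log = np.log(visits + 1)

# Option 2: map the counts onto a 5-point scale using expert-chosen cut-offs
# (0 visits -> 1, 1 visit -> 2, 2-4 visits -> 3, 5-6 visits -> 4, 7+ visits -> 5)
bins = [0, 1, 2, 5, 7]                  # lower edge of each category
visits_5pt = np.digitize(visits, bins)  # yields values 1 to 5

print(visits[:10])
print(visits_5pt[:10])
```

The recoded visits_5pt variable is then on the same 1-to-5 footing as the Likert items and could be summed or averaged with them, subject to the validity caveats below.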

Potential issues with combining items on different scales

  • From my observation, scale scores are typically derived from items that use the same response scale (e.g., agreement, frequency, importance, satisfaction, etc.). This can facilitate a clean interpretation of the scale scores. Mixing counts, agreement, satisfaction, and items using other response formats can raise questions over whether the composite is meaningful or pure. Thus, if you are mixing response scales, there is an additional onus on you to justify why you are combining the variables that you are combining.
  • For example, what are you measuring when you combine a variable that measures frequency of going to hospital and satisfaction with the hospital? The variables sound like two very separate things.