Solved – Likert scales: analyzing scales or items

hypothesis-testing, likert, spss

I am writing an essay based on a survey I made.

First I set my hypothesis: people who are impulsive buyers tend to be loyal to shop X.

Then, using a "marketing scales" book, I set different scales for each element:

  • Impulsive buying tendency: I tend to buy things without thinking, I buy on impulse.
  • Store loyalty: I love spending my time at store X, I am a loyal customer of store X, I feel in touch with store X.

Then I asked respondents how much they agreed with each one, from 1 to 7.

Now I have to test my hypothesis… but which is the best way?

Is there a method to test the 2 scales representing IMPULSIVENESS against the 3 scales representing LOYALTY in one shot, as a group, so that it is fully representative? Or do I have to test every single scale of impulsiveness against every single scale of loyalty and then take a mean?

Best Answer

Generally speaking, you would want to use the scale scores (e.g. the mean rating across the two impulsiveness items and the mean rating across the three loyalty items) in any subsequent analysis (correlation, test of differences, etc.).
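As a minimal sketch of what that looks like in practice (in Python rather than SPSS), you average each respondent's items into one score per construct and then correlate the two scores. The responses below are made up for illustration, and `pearson` is a small helper written for this example, not a library function.

```python
from math import sqrt
from statistics import mean

# Hypothetical 1-7 ratings from six respondents (illustrative, not real survey data).
impulse_items = [  # columns: "buy without thinking", "buy on impulse"
    [6, 7], [2, 1], [5, 5], [3, 4], [7, 6], [1, 2],
]
loyalty_items = [  # columns: "love spending time", "loyal customer", "feel in touch"
    [5, 6, 6], [2, 3, 2], [4, 5, 5], [4, 3, 4], [6, 7, 6], [3, 2, 2],
]

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# One scale score per respondent: the mean of that respondent's items.
impulse_score = [mean(row) for row in impulse_items]
loyalty_score = [mean(row) for row in loyalty_items]

# A single correlation between the two scale scores tests the hypothesis in one go.
r = pearson(impulse_score, loyalty_score)
print(f"r = {r:.2f}")
```

This gives one test for the whole hypothesis instead of six item-by-item tests.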

The reason for that is that having several items increases the reliability (and therefore ensures a lower standard error of measurement, more precise estimation, higher statistical power, etc.). Intuitively, item-specific error variance “averages out” and the scale score provides a better measure of the underlying construct with a higher “resolution” than the original 1-7 format. This is the whole point of multi-item scales (and, in fact, of Rensis Likert's work as he focused on the combination of several items and not particularly on the item's response format commonly associated with his name).
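To put rough numbers on this, the Spearman-Brown formula predicts the reliability of a score built from several items. The sketch below assumes the items are parallel (equally good measures of the same construct) and uses an illustrative single-item reliability of 0.5, which is an assumption rather than an estimate from any data.

```python
def spearman_brown(single_item_reliability, n_items):
    """Predicted reliability of the mean of n_items parallel items."""
    r = single_item_reliability
    return n_items * r / (1 + (n_items - 1) * r)

# Assumed single-item reliability of 0.5 (illustrative value only).
for k in (1, 2, 3):
    print(k, round(spearman_brown(0.5, k), 3))
# 1 0.5
# 2 0.667
# 3 0.75
```

Under these assumptions, going from one item to three raises the reliability from 0.5 to 0.75.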

Of course, this only works well if the items really do measure the same thing and satisfy some other assumptions. Traditional scale building techniques (factor analysis, etc.) are intended to document that and select “good” items to include in a scale. Since what you have is apparently an ad hoc scale with a limited number of items and probably a small number of observations, there is not much else you can do but assume the items really do form a reasonable scale based on your personal judgment of their meaning.
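If you want at least a rough check alongside that judgment, Cronbach's alpha is one common summary of how consistently a set of items hang together. The sketch below uses made-up ratings, and `cronbach_alpha` is a helper written for this example. With only two or three items and a small sample the value is very noisy, so treat it as a sanity check and not as proof that the scale is sound.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of item ratings."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]   # variance of each item
    total_var = variance([sum(row) for row in rows])    # variance of the summed score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 1-7 ratings on the three loyalty items (illustrative, not real data).
loyalty_items = [
    [5, 6, 6], [2, 3, 2], [4, 5, 5], [4, 3, 4], [6, 7, 6], [3, 2, 2],
]

alpha = cronbach_alpha(loyalty_items)
print(f"alpha = {alpha:.2f}")
```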

The worst thing that could happen is not so much a failure to find a meaningful effect (which is of course perfectly possible but presents less risk of being misinterpreted). Rather, it would be to find an effect that looks interesting but is in fact driven by one or two items and doesn't exactly mean what you think it means based on your interpretation of the whole scale.

Alternatively you could still analyze each item separately (by computing correlations if I understood you correctly) but you will probably run into several problems: reduced power/attenuated correlations, potentially conflicting results, etc. If there are some meaningful differences between the items, this could be the only way to make sense of the results (imagine, say, that “spending time in store X” is mainly driven by interior design and “loyal customer” by pricing, what sense does it make to subsume them in a single measure?).
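For comparison, the item-by-item approach can be sketched as follows, again with made-up ratings and a hand-written `pearson` helper. With two impulsiveness items and three loyalty items you end up with six correlations to interpret instead of one.

```python
from math import sqrt
from statistics import mean

# Hypothetical 1-7 ratings from six respondents (illustrative, not real survey data).
impulse_items = [[6, 7], [2, 1], [5, 5], [3, 4], [7, 6], [1, 2]]
loyalty_items = [[5, 6, 6], [2, 3, 2], [4, 5, 5], [4, 3, 4], [6, 7, 6], [3, 2, 2]]

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Transpose so that each entry holds all respondents' ratings for one item.
impulse_cols = list(zip(*impulse_items))
loyalty_cols = list(zip(*loyalty_items))

# One correlation per (impulsiveness item, loyalty item) pair: 2 x 3 = 6 in total.
item_r = {
    (i, j): pearson(x, y)
    for i, x in enumerate(impulse_cols)
    for j, y in enumerate(loyalty_cols)
}
for (i, j), r in item_r.items():
    print(f"impulse item {i + 1} vs loyalty item {j + 1}: r = {r:.2f}")
```

If the six values tell clearly different stories, that is a hint the items may not belong in a single scale.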
