Displaying Ordinal Data – Means, Medians, and Mean Ranks Using SPSS

Tags: mean, presentation, ranks, spss, wilcoxon-mann-whitney-test

I have some ordinal data that is not normally distributed, so I decided to do non-parametric testing using the Mann-Whitney U Test. I am looking at differences between groups for seven scores – these scores are either 0, 1, 2, or 3 for each subject. I am having a difficult time figuring out how to display my data!

If I present the data using the medians (and IQRs), it's not at all clear where the differences are, because for the most part the medians fall on either 0 or 1. So despite the Mann-Whitney U Test showing significant differences, the table just looks uninteresting.

I could also present the data using the means. There are some scientific papers out there which say that you can use means with ordinal data, but that you can't make the same type of assumptions about differences between scores (e.g. the difference between 0 and 1 is not the same as between 1 and 2). Using means would be a bit controversial, although the numbers in the table tell the story well when I use them.

A third option is using the mean ranks that SPSS gives me in the output of the Mann-Whitney test. The mean ranks are what is being compared between groups, so perhaps I should just use those? The only problem I have with this is that the mean ranks don't really mean anything with regard to the actual data (e.g. I can't see from the mean ranks that subjects are closer to a 3 while controls are closer to a 1).

And a last option was performing a chi-square analysis comparing subjects and controls after splitting the scores into two groups (0 and 1 for low and 2 and 3 for high). However, when I did this, the differences weren't as pronounced (probably for a number of reasons).

Best Answer

This is an excellent question. As you found, quantiles do not work well when there are many ties in the data, because as estimators they are too discontinuous. I often find that means work best, if you can assume that the spacing between the categories is at least "halfway meaningful." Exceedance probabilities are always valid. In your case these would be estimated by the proportions of observations with scores $\geq 1$, $\geq 2$, and $= 3$. Mean ranks are useful when comparing groups, but I don't see as much use for them in summarizing a single variable.
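To make the exceedance-probability idea concrete, here is a minimal Python sketch (standard library only). The function name `exceedance_proportions` and the two score lists are made up for illustration; they are not the poster's data, and this is just one way to tabulate the proportions by group.

```python
def exceedance_proportions(scores, thresholds=(1, 2, 3)):
    """Return {k: proportion of scores >= k} for each threshold k."""
    n = len(scores)
    return {k: sum(s >= k for s in scores) / n for k in thresholds}


# Hypothetical 0-3 scores for two groups (illustrative only).
subjects = [0, 1, 1, 2, 2, 3, 3, 3]
controls = [0, 0, 0, 1, 1, 1, 2, 3]

for name, group in (("subjects", subjects), ("controls", controls)):
    props = exceedance_proportions(group)
    # Since 3 is the top score, the ">= 3" entry is the proportion equal to 3.
    print(name, props)
```

With these made-up scores the subjects' proportions are 0.875, 0.625, and 0.375 against 0.625, 0.25, and 0.125 for the controls, so a table of such proportions shows where along the scale the groups differ, without assuming anything about the spacing of the categories.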

Whether the mean is a correct summary of an ordinal variable can seldom be established from the data themselves. It is a subjective judgment.

Instead of using mean ranks I would use an appropriate rank correlation measure or the concordance probability between two variables (e.g., a binary grouping and an ordinal scale). The concordance probability is a simple linear transformation of the Wilcoxon-Mann-Whitney statistic: with group sizes $n_1$ and $n_2$ and mean rank $\bar{R}_1$ in the first group, it is $c = (\bar{R}_1 - (n_1 + 1)/2)/n_2$. Choices for correlation coefficients include Somers' $D_{xy}$ (which is consistent with the concordance probability and penalizes for ties on the ordinal variable) and Goodman-Kruskal $\gamma$, which doesn't penalize for ties on either $x$ or $y$.
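As a rough sketch of how these quantities relate, the Python below (standard library only) computes the mean rank of one group using midranks for ties, converts it to the concordance probability via $c = (\bar{R}_1 - (n_1 + 1)/2)/n_2$, and then to Somers' $D_{xy} = 2c - 1$ for a binary grouping. It also computes Goodman-Kruskal $\gamma$ by counting pairs directly. All function names and the score lists are hypothetical, chosen only for illustration.

```python
def midranks(values):
    """Ranks 1..n, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def concordance_from_mean_rank(group1, group2):
    """Return (mean rank of group1, concordance probability c, Somers' Dxy)."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    mean_rank1 = sum(ranks[:n1]) / n1
    c = (mean_rank1 - (n1 + 1) / 2) / n2
    return mean_rank1, c, 2 * c - 1


def goodman_kruskal_gamma(group1, group2):
    """(concordant - discordant) / (concordant + discordant); ties are dropped."""
    conc = sum(a > b for a in group1 for b in group2)
    disc = sum(a < b for a in group1 for b in group2)
    return (conc - disc) / (conc + disc)


# Hypothetical 0-3 scores for two groups (illustrative only).
subjects = [0, 1, 1, 2, 2, 3, 3, 3]
controls = [0, 0, 0, 1, 1, 1, 2, 3]

mean_rank, c, dxy = concordance_from_mean_rank(subjects, controls)
gamma = goodman_kruskal_gamma(subjects, controls)
print(mean_rank, c, dxy, gamma)
```

For these made-up scores the mean rank of the subjects is 10.25, which by itself says little, while $c = 0.71875$ reads directly as the probability that a randomly chosen subject scores higher than a randomly chosen control, counting ties as one half. Here $D_{xy} = 0.4375$ and $\gamma = 0.56$; $\gamma$ is larger because it drops the tied pairs from the denominator.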