I have a 10 treatments, 5 samples per treatment, and the result of each treatment is ranked on a ordinal scale: Poor, Fair, Moderate, Good, Excellent. I have the data tallied up so I have the total number of observations for each category on the scale; it looks like this:
TreatA TreatB TreatC
Poor 2 2 1
Fair 1 1 1
Moder 1 2 0
Good 0 0 2
Exc 1 0 1
Is there a way to do a Mann-Whitney/Wilcoxon rank sum test on data in this format? I have been using the wilcox.test function, but it requires each observation to be listed individually, rather than totalled; that data would need to look more like
TreatA TreatB
Poor Poor
Poor Poor
Fair Fair
Exc Fair
The problem is that I actually have 25 treatments and the number of samples per treatment can be more than 60. I would prefer to use the data in the format I have it in currently.
Is there a function in R that will let me perform this test on my data as it is formatted?
Best Answer
Yes. Calculating the test statistic is simple. Your biggest problem will be the heavy ties.
There are several ways to get the data into a form that wilcox.test will take. Here's one:
At this point, the data is in a form you can use. You can use
rep
rather thaninverse.rle
(and it's probably a bit shorter that way) to expand counts to repetitions.e.g., you can do this:
or this:
but (as R warns you in the second case) the p-value with ties is not exact. And unfortunately ties are heavy. You could do a permutation test (/randomization test) and get a "simulated" p-value, which will effectively be exact at a sufficiently large number of resamples.
However, since your observations-per-sample are pretty large, you should be able to get away with a normal approximation with the variance adjusted for ties. There's sufficient information in say Conover's Practical Nonparametric Statistics to do this for oneself.