Solved – How to interpret the Mann-Whitney U when using R’s formula interface

nonparametricrwilcoxon-mann-whitney-test

Say we have the following data:

set.seed(123)
data <- data.frame(x = c(rnorm(50, 1, 1), rnorm(50, 5, 2)),
                   y = c(rep('A', 50),    rep('B', 50)))

Which yields the following boxplot (boxplot(data$x ~ data$y)):

boxplot

Now let's say I want to test if the two samples have the same location parameters (median and/or mean). In my real case, the data are clearly not normal, so I've decided to run the Wilcoxon-Mann-Whitney test, like this:

wilcox.test(data$x ~ data$y)

However, I would like the alternative hypothesis to be that B, data$y's "second" factor, comes from a distribution with higher position parameters. I've tried setting the alternative parameter to "greater" and "less", but apparently the alternative hypotheses are not what I'm looking for. For example, alternative = "greater" tells me "alternative hypothesis: true location shift is greater than 0"; alternative = "less" tells me "alternative hypothesis: true location shift is less than 0".

How can I tweak the wilcox.test() function in order to have the alternative hypothesis I want (B comes from a distribution with higher position parameters than A)? Or should I just use another test instead?

Best Answer

Technically, the reference category and the direction of the test depend on the way the factor variable is encoded. With your toy data:

> wilcox.test(x ~ y, data=data, alternative="greater")

    Wilcoxon rank sum test with continuity correction

data:  x by y 
W = 52, p-value = 1
alternative hypothesis: true location shift is greater than 0 

> wilcox.test(x ~ y, data=data, alternative="less")

    Wilcoxon rank sum test with continuity correction

data:  x by y 
W = 52, p-value < 2.2e-16
alternative hypothesis: true location shift is less than 0 

Notice that the W statistic is the same in both cases but the test uses opposite tails of its sampling distribution. Now let's look at the factor variable:

> levels(data$y)
[1] "A" "B"

We can recode it to make "B" the first level:

> data$y <- factor(data$y, levels=c("B", "A"))

Now we have:

> levels(data$y)
[1] "B" "A"

Note that we did not change the data themselves, just the way the categorical variable is encoded “under the hood”:

> head(data)
          x y
1 0.4395244 A
2 0.7698225 A
3 2.5587083 A
4 1.0705084 A
5 1.1292877 A
6 2.7150650 A

> aggregate(data$x, by=list(data$y), mean)
  Group.1        x
1       B 5.292817
2       A 1.034404

But the directions of the test are now inverted:

> wilcox.test(x ~ y, data=data, alternative="greater")

    Wilcoxon rank sum test with continuity correction

data:  x by y 
W = 2448, p-value < 2.2e-16
alternative hypothesis: true location shift is greater than 0 

The W statistic is different but the p-value is the same than for the alternative="less" test with the categories in the original order. With the original data, it could be interpreted as “the location shift from B to A is less than 0” and with the recoded data it becomes “the location shift from A to B is greater than 0” but this is really the same hypothesis (but see Glen_b's comments to the question for the correct interpretation).

In your case, it therefore seems that the test you want is alternative="less" (or, equivalently, alternative="greater" with the recoded data). Does that help?