Solved – the null hypothesis in the Mann-Whitney test

nonparametric, wilcoxon-mann-whitney-test

Let $X_1$ be a random value from distribution 1 and let $X_2$ be a random value from distribution 2. I thought that the null hypothesis for the Mann-Whitney test was $P(X_1 < X_2) = P(X_2 < X_1)$.

If I run simulations of the Mann-Whitney test on data from normal distributions with equal means and equal variances, with $\alpha=0.05$, I get Type I error rates which are very close to 0.05. However, if I make the variances unequal (but leave the means equal), the proportion of simulations in which the null hypothesis is rejected becomes larger than 0.05, which I didn't expect, since $P(X_1 < X_2) = P(X_2 < X_1)$ still holds. This happens when I use wilcox.test in R with any of three settings: exact=TRUE; exact=FALSE with correct=TRUE; or exact=FALSE with correct=FALSE.

Is the null hypothesis something different from what I've written above, or is it just that the test is inaccurate in terms of Type I error if the variances are unequal?

Best Answer

From Hollander & Wolfe pp 106-7,

Let $F$ be the distribution function corresponding to population 1 and $G$ be the distribution function corresponding to population 2. The null hypothesis is: $H_0: F(t)=G(t)$ for every $t$. The null hypothesis asserts that the $X$ variable and the $Y$ variable have the same probability distribution, but the common distribution is not specified.

Strictly speaking this describes the Wilcoxon rank-sum test, but $U=W-\frac{n(n+1)}{2}$, where $W$ is the sum of the ranks of the sample of size $n$, so the two tests are equivalent.
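As a quick check of that identity, the sketch below computes the rank sum $W$ for the first sample by hand and compares $W-\frac{n(n+1)}{2}$ with the statistic that wilcox.test reports. The sample sizes, seed, and distributions are arbitrary choices for illustration; note that R labels its statistic "W" even though it is in the $U$ form.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rnorm(8)                    # sample 1 (n = 8), illustrative data
y <- rnorm(11, mean = 0.5)       # sample 2 (m = 11), illustrative data
n <- length(x)

# Wilcoxon rank-sum statistic: sum of the ranks of x in the pooled sample
W <- sum(rank(c(x, y))[seq_len(n)])

# Mann-Whitney U via the identity U = W - n(n+1)/2
U <- W - n * (n + 1) / 2

# R's wilcox.test reports the U form, although it labels it "W"
U_r <- unname(wilcox.test(x, y)$statistic)

c(W = W, U = U, U_r = U_r)
```

Because the two statistics differ only by a constant that depends on the sample size, they order every possible data set identically and give the same p-value.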

Related Solutions

Solved – Mann-Whitney null hypothesis under unequal variance

The Mann-Whitney test is a special case of a permutation test (the distribution under the null is derived by looking at all the possible permutations of the data) and permutation tests have the null as identical distributions, so that is technically correct.

One way of thinking of the Mann-Whitney test statistic is as a measure of the number of times a randomly chosen value from one group exceeds a randomly chosen value from the other group. So $P(X>Y)=0.5$ also makes sense as a null, and it is technically a property of the equal-distributions null (assuming continuous distributions, where the probability of a tie is 0): if the 2 distributions are the same, then the probability of $X$ being greater than $Y$ is 0.5, since both are drawn from the same distribution.

The stated case of 2 distributions having the same mean but widely different variances matches with the 2nd null hypothesis, but not the 1st of identical distributions. We can do some simulation to see what happens with the p-values in this case (in theory they should be uniformly distributed):

> out <- replicate( 100000, wilcox.test( rnorm(25, 0, 2), rnorm(25,0,10) )$p.value )
> hist(out)
> mean(out < 0.05)
[1] 0.07991
> prop.test( sum(out<0.05), length(out), p=0.05 )

        1-sample proportions test with continuity correction

data:  sum(out < 0.05) out of length(out), null probability 0.05
X-squared = 1882.756, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.05
95 percent confidence interval:
 0.07824054 0.08161183
sample estimates:
      p 
0.07991

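So the test is clearly rejecting more often than the nominal 5%, which means the null hypothesis actually being tested is false here. That is consistent with a null of equal distributions (false in this simulation), but not with a null of prob=0.5 (true in this simulation).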
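For contrast, here is a sketch that runs the equal-sd and unequal-sd scenarios side by side. The replicate count is reduced to keep the run short and the seed is arbitrary; both are choices made for this illustration. With identical distributions the rejection rate should sit at or a little below the nominal 0.05 (the exact p-value is discrete), while the unequal-sd case should come out noticeably higher.

```r
set.seed(123)   # arbitrary seed
n_sim <- 5000   # fewer replicates than above, to keep the run short

# Both groups N(0, sd = 2): identical distributions, so F = G holds
p_equal <- replicate(n_sim, wilcox.test(rnorm(25, 0, 2), rnorm(25, 0, 2))$p.value)

# Equal means, unequal sds (2 vs 10): P(X > Y) = 0.5, but F != G
p_unequal <- replicate(n_sim, wilcox.test(rnorm(25, 0, 2), rnorm(25, 0, 10))$p.value)

# Proportion of rejections at the 5% level in each scenario
c(equal_sd = mean(p_equal < 0.05), unequal_sd = mean(p_unequal < 0.05))
```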

Thinking in terms of probability of X > Y also runs into some interesting problems if you ever compare populations that are based on Efron's Dice.
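To make the Efron's Dice point concrete, here is a small sketch using one standard set of the four dice (the face values are the usual textbook set, supplied here for illustration and not taken from the answer). Each die beats the next one in the cycle with probability 2/3, and the last beats the first, so "tends to be larger than" is not a transitive relation.

```r
# One standard set of Efron's dice (face values are an assumption of this sketch)
dice <- list(
  A = c(4, 4, 4, 4, 0, 0),
  B = c(3, 3, 3, 3, 3, 3),
  C = c(6, 6, 2, 2, 2, 2),
  D = c(5, 5, 5, 1, 1, 1)
)

# Exact P(X > Y) for two fair dice: fraction of the 36 face pairs with x > y
p_greater <- function(x, y) mean(outer(x, y, ">"))

# Each die against the next one in the cycle A -> B -> C -> D -> A
cycle <- c("A", "B", "C", "D", "A")
probs <- sapply(1:4, function(i) p_greater(dice[[cycle[i]]], dice[[cycle[i + 1]]]))
names(probs) <- paste(cycle[1:4], ">", cycle[2:5])
probs   # every entry is 2/3
```

So a procedure that ranks populations pairwise by $P(X>Y)$ can produce a cycle with no "largest" population.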

Solved – Unequal variances t-test or U Mann-Whitney test

The Mann-Whitney doesn't require equal variances unless you're specifically looking for location-shift alternatives.

In particular, it is able to test whether values in the first group tend to be larger than values in the second group, which is quite a general alternative that sounds like it's related to your original question.

Not only can the Mann-Whitney deal with transformed-location shifts very well (e.g. a scale-shift is a location-shift in the logs), it has power against any alternative that makes $P(X>Y)$ differ from $\frac{1}{2}$.
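For a concrete case, take independent normal populations (an assumption made here purely for illustration): $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$. Then $X - Y \sim N(\mu_X - \mu_Y,\ \sigma_X^2 + \sigma_Y^2)$, so

$$P(X > Y) = P(X - Y > 0) = \Phi\!\left(\frac{\mu_X - \mu_Y}{\sqrt{\sigma_X^2 + \sigma_Y^2}}\right),$$

where $\Phi$ is the standard normal distribution function. Any difference in means moves this away from $\frac{1}{2}$, whereas with equal means it is $\Phi(0) = \frac{1}{2}$ whatever the two variances are.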

The Mann-Whitney U-statistic counts the number of times a value in one sample exceeds a value in the other. That's a scaled estimate of the probability that a random value from one population exceeds a random value from the other.
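A minimal sketch of that scaling, with made-up samples: dividing $U$ by the number of pairs gives the sample proportion of pairs in which the value from the first sample exceeds the value from the second. The sample sizes, means, and seed are illustrative assumptions.

```r
set.seed(42)                      # arbitrary seed
x <- rnorm(30, mean = 1, sd = 1)  # illustrative sample from population 1
y <- rnorm(40, mean = 0, sd = 1)  # illustrative sample from population 2

# U counts the pairs with x_i > y_j; R labels this statistic "W"
U <- unname(wilcox.test(x, y)$statistic)

# Scale by the number of pairs to estimate P(X > Y)
p_hat <- U / (length(x) * length(y))

# The same quantity computed directly from all pairwise comparisons
p_direct <- mean(outer(x, y, ">"))

# Population value for these two normals: pnorm((1 - 0) / sqrt(1 + 1))
c(p_hat = p_hat, p_direct = p_direct, theory = pnorm(1 / sqrt(2)))
```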

[Figure: shift in P(X<Y) from 1/2]

There's more detail here.

Also see the discussion here.

As for which is better, well, that really depends on a number of things. If the data are even a little more heavy-tailed than normal, you may be better with the Mann-Whitney, but it depends on the situation - discreteness and skewness can both complicate that situation, and it also depends on the precise alternatives of interest.
