(too long for a comment, so I guess it's an answer)
I'm not sure what makes you assert there's a substantive difference between the two cases. When you use the Mann-Whitney for testing location-shift alternatives, the assumption is of identical distributions aside from the possible location shift. It's not actually necessary to assume identical distributions, though. The Mann-Whitney is, for example, perfectly appropriate for testing scale-shift alternatives, or a host of other alternatives, as long as you can compute the distribution of the test statistic under the null. If your rank-based ANOVA is to have a distribution you can compute under $H_0$, you'll need at least some assumptions for the null case there also.
If your assumptions for both are the same (such as both being applied to shift alternatives) and you compute the null distribution on an ANOVA for 2 groups of ranks correctly, your p-values will be identical to the equivalent two-tailed Mann-Whitney, in the same way that $t^2 = F$ for an ordinary 2 group ANOVA compared to a two-tailed two-sample-t (the version with equal-variance).
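To see the $t^2 = F$ correspondence carried over to ranks, here's a small sketch (the data are made up purely for illustration): the squared equal-variance $t$ statistic computed on the pooled ranks equals the one-way ANOVA $F$ on those same ranks, so a rank ANOVA whose null distribution is computed correctly is equivalent to a two-tailed Mann-Whitney.

```r
# Two made-up samples; the identity t^2 = F holds for any data
set.seed(1)
x <- rnorm(8)
y <- rnorm(11, mean = 1)
g <- factor(rep(c("x", "y"), c(length(x), length(y))))
r <- rank(c(x, y))                         # pooled ranks

t2    <- t.test(r ~ g, var.equal = TRUE)$statistic^2   # squared t on ranks
Fstat <- anova(lm(r ~ g))[1, "F value"]                # ANOVA F on ranks
all.equal(unname(t2), Fstat)               # TRUE: t^2 = F on the ranks
```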
if I had two groups and both had different, non-normal distributions, but I only wanted to test for a difference in location what test would be preferable? I was under the impression I could use a t-test on ranks, or a Welch t-test on ranks. However, if these tests are the similar to a M_W U test then I guess this is not the case.
It's somewhat of a tricky question, because if they're different shapes 'location difference' doesn't have an obvious meaning in the way it does when they're the same shape.
If you define some measure of location difference (like difference in means or difference in medians or median of pairwise differences or difference in minimum or whatever) then you can do something with it - e.g. try to compute a resampling based distribution, like a bootstrap distribution. It's important to be clear about what you are prepared to assume though.
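As a sketch of that resampling idea (with invented data, and taking "difference in medians" as the chosen location measure), a percentile bootstrap interval might look like this:

```r
# Sketch: bootstrap distribution of a chosen location measure (here the
# difference in medians) when the two shapes differ. Data are illustrative.
set.seed(5)
x <- rexp(30)                # skewed shape
y <- rnorm(40, mean = 1)     # symmetric shape
boot_diff <- replicate(2000,
  median(sample(x, replace = TRUE)) - median(sample(y, replace = TRUE)))
ci <- quantile(boot_diff, c(0.025, 0.975))   # percentile interval
ci
```

The choice of measure matters: with different shapes, "difference in medians" and "difference in means" are genuinely different quantities and can even disagree in sign.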
A Mann-Whitney can be used for more general alternatives than a simple location shift. e.g. For continuous distributions, you can write the null in the form:
$P(X>Y) = \frac{1}{2}$
and the alternative as
$P(X>Y) \neq \frac{1}{2}\quad$ (for a two tailed test)
or
$P(X>Y) < \frac{1}{2}\quad$ (or "$>$", in either case as a one tailed test)
If I recall correctly, Conover's Practical Nonparametric Statistics presents them this way, for example.
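This connection is easy to see numerically: for continuous data, R's $W$ statistic counts the pairs $(x_i, y_j)$ with $x_i > y_j$, so $W/(n_1 n_2)$ is exactly the sample estimate of $P(X>Y)$. A quick check with made-up data:

```r
# W counts pairs with x_i > y_j, so W/(n1*n2) estimates P(X > Y);
# under H0 it should be near 1/2. Data are made up for illustration.
set.seed(2)
x <- rnorm(10)
y <- rnorm(12)
W <- wilcox.test(x, y)$statistic
mean(outer(x, y, ">"))              # proportion of pairs with x > y
unname(W) / (length(x) * length(y)) # identical value
```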
The Note in the help on the `wilcox.test` function clearly explains why R's value is smaller than yours:
> **Note**
>
> The literature is not unanimous about the definitions of the Wilcoxon rank sum and Mann-Whitney tests. The two most common definitions correspond to the sum of the ranks of the first sample with the minimum value subtracted or not: R subtracts and S-PLUS does not, giving a value which is larger by m(m+1)/2 for a first sample of size m. (It seems Wilcoxon's original paper used the unadjusted sum of the ranks but subsequent tables subtracted the minimum.)
That is, the definition R uses is $n_1(n_1+1)/2$ smaller than the version you use, where $n_1$ is the number of observations in the first sample.
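You can confirm that offset directly (with made-up, tie-free data): the raw rank sum of the first sample exceeds R's $W$ by exactly $n_1(n_1+1)/2$.

```r
# R's W versus the unadjusted rank sum of the first sample
set.seed(3)
x <- rnorm(6)
y <- rnorm(9)
W        <- wilcox.test(x, y)$statistic          # R's definition
rank_sum <- sum(rank(c(x, y))[1:length(x)])      # raw rank sum of sample 1
rank_sum - unname(W)                             # equals n1*(n1+1)/2 = 21
```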
As for modifying the result, you could assign the output from `wilcox.test` to a variable, say `a`, and then manipulate `a$statistic`, adding the minimum to its value and changing its name. Then when you print `a` (e.g. by typing `a`), it will look the way you want.
To see what I am getting at, try this:
```r
a <- wilcox.test(x, y, correct = FALSE)
str(a)
```
So for example if you do this:
```r
n1 <- length(x)
a$statistic <- a$statistic + n1*(n1+1)/2
names(a$statistic) <- "T.W"
a
```
then you get:
```
        Wilcoxon rank sum test with continuity correction

data:  x and y
T.W = 156.5, p-value = 0.006768
alternative hypothesis: true location shift is not equal to 0
```
It's quite common to refer to the rank sum test (whether shifted by $n_1(n_1+1)/2$ or not) as either $W$ or $w$ or some close variant (e.g. here or here). It also often gets called '$U$' because of Mann & Whitney. There's plenty of precedent for using $W$, so for myself I wouldn't bother with the line that changes the name of the statistic, but if it suits you to do so there's no reason why you shouldn't, either.
Best Answer
When ties are internal to a group, assigning the average rank of course makes no difference compared to, say, breaking ties at random, just as you point out. When there are ties across groups, though, how you deal with them does matter, and there are several choices.
The one you mention - giving the average of the ranks to all tied values - is common, but not the only way it is done.
That approach has the disadvantage that the ranks are no longer simply the integers from 1 to $n$, which changes the distribution of the rank statistic under the null. Even under a normal approximation it affects the variance of the distribution, though for many of the common procedures the tie-adjusted variance is not so onerous to calculate; if ties are not heavy, the approximation will often still be very good.
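For the Mann-Whitney, for instance, the usual tie-adjusted variance is $\mathrm{Var}(W) = \frac{n_1 n_2}{12}\left(N+1 - \frac{\sum_j (t_j^3 - t_j)}{N(N-1)}\right)$, where $t_j$ is the size of the $j$-th tie group. A sketch with invented tied data, which should reproduce what `wilcox.test` reports for its normal approximation:

```r
# Tie-adjusted normal approximation for the Mann-Whitney (illustrative data)
x <- c(1, 2, 2, 3, 5)
y <- c(2, 3, 3, 4, 6, 7)
n1 <- length(x); n2 <- length(y); N <- n1 + n2
W  <- sum(rank(c(x, y))[1:n1]) - n1 * (n1 + 1) / 2   # R's W with midranks
tg <- table(c(x, y))                                  # tie-group sizes
v  <- n1 * n2 / 12 * (N + 1 - sum(tg^3 - tg) / (N * (N - 1)))
z  <- (W - n1 * n2 / 2) / sqrt(v)
p  <- 2 * pnorm(-abs(z))
p   # should match wilcox.test(x, y, exact = FALSE, correct = FALSE)$p.value
```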
Another approach is simply to break ties at random, which has the virtue of simplicity but means two people may come to different conclusions on the same data.
A third approach is to break ties in all possible ways (or to sample from the set of all possible ways if there are too many to enumerate), and then combine the p-values in some way; taking the average is usually what is done. (If all the p-values fall on the same side of the significance level, there's no difficulty.)
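A sketch of that third approach, sampling random tie-breaks and averaging the resulting exact Mann-Whitney p-values (the data are invented, with ties across groups):

```r
# Break cross-group ties at random many times; average the exact p-values
set.seed(6)
x <- c(1, 2, 2, 3, 5)
y <- c(2, 3, 3, 4, 6, 7)
n1 <- length(x)
ps <- replicate(500, {
  r <- rank(c(x, y), ties.method = "random")   # tie-free ranks 1..N
  wilcox.test(r[1:n1], r[-(1:n1)], exact = TRUE)$p.value
})
mean(ps)   # averaged p-value across random tie-breaks
```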
I think the best approach is to go back to the permutation distribution. You lose the convenience of tables, but it's relatively easy to either enumerate (in small samples) or sample from (in larger samples) the permutation distribution of the ranks. This is the "right" answer, really, and not hard to do with suitable software (it's rather easy in R in many of the common cases).
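For example, a sampled permutation p-value for the rank-sum statistic that honours the midranks directly (again with invented tied data):

```r
# Permutation p-value for the rank sum, keeping the midranks as-is
set.seed(4)
x <- c(1, 2, 2, 3, 5)
y <- c(2, 3, 3, 4, 6, 7)
r  <- rank(c(x, y))                   # midranks for the pooled sample
n1 <- length(x)
obs  <- sum(r[1:n1])                  # observed rank sum of sample 1
perm <- replicate(10000, sum(sample(r, n1)))     # rank sums under H0
ev   <- mean(r) * n1                  # null expectation n1*(N+1)/2
p    <- mean(abs(perm - ev) >= abs(obs - ev))    # two-sided p-value
p
```

With small samples you could enumerate all $\binom{N}{n_1}$ group assignments instead of sampling, giving the exact permutation p-value.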