AFAIK, there is no closed form for the distribution. Using R, the naive approach of enumerating the exact distribution works for me up to group sizes of at least 12; that takes less than 1 minute on a Core i5 running Windows 7 64-bit with a current R. For R's own, cleverer algorithm in C, which is used in pwilcox(), you can check the source file src/nmath/wilcox.c.
n1 <- 12 # size group 1
n2 <- 12 # size group 2
N <- n1 + n2 # total number of subjects
Now generate all possible cases for the ranks within group 1. These are all ${N \choose n_{1}}$ different samples of size $n_{1}$ from the numbers $1, \ldots, N$. Then calculate the rank sum (= test statistic) for each of these cases. Tabulate these rank sums to get the probability mass function (pmf) from the relative frequencies; the cumulative sum of these relative frequencies is the cumulative distribution function (cdf).
rankMat <- combn(1:N, n1) # all possible ranks within group 1
LnPl <- colSums(rankMat) # all possible rank sums for group 1
dWRS <- table(LnPl) / choose(N, n1) # relative frequencies of rank sums: pmf
pWRS <- cumsum(dWRS) # cumulative sums: cdf
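For readers who do not use R, the enumeration above can be sketched in Python with the standard library alone. This is only an illustration of the same brute-force idea, assuming no ties; the function names `exact_rank_sum_pmf` and `cdf_from_pmf` are made up for this sketch, and the group sizes are kept small so that it runs instantly.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb


def exact_rank_sum_pmf(n1, n2):
    """Exact null pmf of the rank sum of group 1 (no ties), by enumeration."""
    N = n1 + n2
    # Count how often each rank sum occurs over all C(N, n1) rank subsets.
    counts = Counter(sum(ranks) for ranks in combinations(range(1, N + 1), n1))
    total = comb(N, n1)
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


def cdf_from_pmf(pmf):
    """Cumulative sums of the pmf, keyed by rank sum."""
    cdf, running = {}, Fraction(0)
    for s, p in pmf.items():
        running += p
        cdf[s] = running
    return cdf


pmf = exact_rank_sum_pmf(3, 3)   # small groups, chosen only for speed
cdf = cdf_from_pmf(pmf)
print(pmf[6], cdf[15])           # smallest rank sum, and the cdf at the largest
```

Exact fractions are used instead of floats so that the probabilities sum to exactly 1; for group sizes like 12 and 12 the same code works but has to visit about 2.7 million subsets.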
Compare the exact distribution against the asymptotically correct normal distribution.
muLnPl <- (n1 * (N+1)) / 2 # expected value
varLnPl <- (n1*n2 * (N+1)) / 12 # variance
plot(as.numeric(names(pWRS)), pWRS, main="Wilcoxon RS, N=(12, 12): exact vs. asymptotic",
     type="n", xlab="ln+", ylab="P(Ln+ <= ln+)", cex.lab=1.4)
curve(pnorm(x, mean=muLnPl, sd=sqrt(varLnPl)), lwd=4, n=200, add=TRUE)
points(as.numeric(names(pWRS)), pWRS, pch=16, col="red", cex=0.7)
abline(h=0.95, col="blue")
legend(x="bottomright", legend=c("exact", "asymptotic"),
       pch=c(16, NA), col=c("red", "black"), lty=c(NA, 1), lwd=c(NA, 2))
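The comparison of the exact cdf with the normal approximation can also be checked numerically rather than by eye. The Python sketch below is an assumption-laden illustration, not a port of the R code: it uses smaller groups (6 and 6 instead of 12 and 12) so that brute-force enumeration is instant, applies no continuity correction, and the variable names are invented here.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt
from statistics import NormalDist

n1, n2 = 6, 6            # assumption: smaller than the 12 and 12 used above
N = n1 + n2

# Exact distribution of the rank sum of group 1 by enumeration (no ties).
counts = Counter(sum(r) for r in combinations(range(1, N + 1), n1))
total = comb(N, n1)

mu = n1 * (N + 1) / 2                    # expected rank sum
sigma = sqrt(n1 * n2 * (N + 1) / 12)     # standard deviation of the rank sum
normal = NormalDist(mu, sigma)

rows = []                # (rank sum, exact cdf, normal cdf)
running = 0
for s in sorted(counts):
    running += counts[s]
    rows.append((s, running / total, normal.cdf(s)))

max_gap = max(abs(exact - approx) for _, exact, approx in rows)
print(f"largest |exact - normal| gap: {max_gap:.4f}")
```

Because the exact distribution is a step function, the uncorrected normal curve sits below the exact cdf near the centre by roughly half a probability step; evaluating `normal.cdf(s + 0.5)` instead is the usual continuity correction.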
In scipy.stats, the Mann-Whitney U test compares two populations:
"Computes the Mann-Whitney rank test on samples x and y."
but the Wilcoxon test compares two PAIRED populations:
"The Wilcoxon signed-rank test tests the null hypothesis that two related paired samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero. It is a non-parametric version of the paired T-test."
EDITED / CORRECTED in response to ttnphns' comments.
Note that the t-test does not test whether the distribution of the differences is symmetric about zero, so the Wilcoxon signed-rank test is not truly a non-parametric counterpart of the paired t-test.
The Mann-Whitney test, on the other hand, assumes that all the observations are independent of each other (no basis for pairing here!). Its null hypothesis is that the two distributions are the same, and the alternative is that one is stochastically greater than the other. If we make the additional assumptions that the only difference between the two distributions is their location and that the distributions are continuous, then "stochastically greater than" is equivalent to statements such as "the medians are different", so with the extra assumption(s) you can interpret it that way.
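To make "stochastically greater" concrete, here is a small pure-Python sketch of the U statistic. It is not scipy's implementation; `midranks` and `mann_whitney_u` are names invented for this illustration, tied values share their average rank, and the data are made up.

```python
def midranks(values):
    """Ranks 1..n of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U for x: rank sum of x in the pooled sample minus its smallest value."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


x = [1.1, 2.4, 3.0, 5.2]   # made-up data with one tie across the groups
y = [0.7, 2.4, 2.9]
u_x = mann_whitney_u(x, y)
# U counts the pairs with x_i > y_j, with ties counted as one half, so
# U / (n1 * n2) estimates P(X > Y) + 0.5 * P(X = Y).
print(u_x, u_x / (len(x) * len(y)))
```

A value of U / (n1 * n2) well above 0.5 means that a randomly drawn x tends to exceed a randomly drawn y, which is what the test is sensitive to; turning that into a statement about medians needs the extra location-shift assumption.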
The Mann-Whitney uses a continuity correction by default, but the Wilcoxon doesn't.
The Mann-Whitney handles ties using the midrank, but the Wilcoxon offers three options for handling ties in the paired values (i.e., zero difference between the two elements of the pair.)
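The three ways of treating zero differences can be sketched in pure Python. This is an illustration only, not scipy's code: `signed_rank_sums` is a hypothetical helper, and the option names "wilcox", "pratt" and "zsplit" are borrowed from the `zero_method` argument of scipy.stats.wilcoxon as I understand its documentation (discard zeros; rank zeros but drop their ranks; rank zeros and split their ranks evenly between the two sums).

```python
def midranks(values):
    """Average (mid) ranks, 1-based, for a list that may contain ties."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)          # first sorted position of value v
        count[v] = count.get(v, 0) + 1    # how many times v occurs
    return [first[v] + (count[v] - 1) / 2 for v in values]


def signed_rank_sums(x, y, zero_method="wilcox"):
    """Return (R+, R-): rank sums of the positive and negative differences."""
    if zero_method not in ("wilcox", "pratt", "zsplit"):
        raise ValueError("zero_method must be 'wilcox', 'pratt' or 'zsplit'")
    d = [a - b for a, b in zip(x, y)]
    if zero_method == "wilcox":
        d = [v for v in d if v != 0]      # discard zero differences first
    ranks = midranks([abs(v) for v in d])
    r_plus = sum(r for v, r in zip(d, ranks) if v > 0)
    r_minus = sum(r for v, r in zip(d, ranks) if v < 0)
    if zero_method == "zsplit":
        # Zeros were ranked; give half of their ranks to each side.
        r_zero = sum(r for v, r in zip(d, ranks) if v == 0)
        r_plus += r_zero / 2
        r_minus += r_zero / 2
    # For "pratt" the zeros were ranked too, but their ranks are dropped.
    return r_plus, r_minus


x = [5, 7, 3, 9, 6]        # made-up paired data; the first pair is tied
y = [5, 4, 4, 6, 2]
for method in ("wilcox", "pratt", "zsplit"):
    print(method, signed_rank_sums(x, y, method))
```

With a single zero difference the three options already give different rank sums, which is why the choice matters most for small samples or coarsely measured data.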
It sounds like the Wilcoxon test is the more appropriate for your purposes, since you do have that lack of independence between all observations. However, one might imagine that requests with similar, but not equal, lengths might exhibit similar behavior, whereas the Wilcoxon would assume that if they aren't paired, they are independent. A logistic regression model might serve you better in this case.
Quotes are from the scipy.stats doc pages, which we aren't supposed to link to, apparently.
Best Answer
The Streitberg-Röhmel shift algorithm is described in two manuscripts:
Streitberg B, Röhmel J (1986). "Exact Distributions for Permutation and Rank Tests: An Introduction to Some Recently Published Algorithms." Statistical Software Newsletter, 12(1), 10-17. ISSN 1609-3631.
Streitberg B, Röhmel J (1987). "Exakte Verteilungen für Rang- und Randomisierungstests im allgemeinen c-Stichprobenfall." EDV in Medizin und Biologie, 18(1), 12-19.
Neither is exactly a mainstream journal, and one manuscript is in German, which explains why this algorithm is less well known than the network algorithm by Mehta & Patel underlying their proprietary StatXact software.
The Streitberg-Röhmel shift algorithm and Van de Wiel's split-up algorithm are both implemented in the R package coin for conditional inference. See:
Hothorn T, Hornik K, Van de Wiel MA, Zeileis A (2006). "A Lego System for Conditional Inference". The American Statistician, 60(3), 257-263.
Hothorn T, Hornik K, Van de Wiel MA, Zeileis A (2008). "Implementing a Class of Permutation Tests: The coin Package." Journal of Statistical Software, 28(8), 1-23.
The R code for the Streitberg-Röhmel algorithm is contained in the file coin/R/ExactDistributions.R in the coin source package, available from CRAN.
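To give a flavour of why such algorithms are so much faster than enumeration, here is a Python sketch of a shift-style counting recursion for integer scores. It is my own simplified illustration of the "shift the array by the score and add" idea, not a transcription of the coin code or of the published algorithm; the function name `rank_sum_counts` is hypothetical, and non-negative integer scores are assumed.

```python
from math import comb


def rank_sum_counts(scores, n1):
    """counts[s] = number of size-n1 subsets of `scores` whose sum is s.

    Assumes non-negative integer scores (e.g. the ranks 1..N without ties).
    """
    scores = list(scores)
    max_sum = sum(sorted(scores)[-n1:]) if n1 else 0
    # table[k][s] = number of k-element subsets seen so far with sum s
    table = [[0] * (max_sum + 1) for _ in range(n1 + 1)]
    table[0][0] = 1
    for a in scores:
        # Go from large k to small k so that each score is used at most once.
        for k in range(n1, 0, -1):
            prev, cur = table[k - 1], table[k]
            for s in range(max_sum - a, -1, -1):
                if prev[s]:
                    cur[s + a] += prev[s]   # shift by a, then add
    return table[n1]


counts = rank_sum_counts(range(1, 25), 12)    # group sizes 12 and 12
total = comb(24, 12)
pmf = [c / total for c in counts]             # index = rank sum
print(sum(counts) == total, counts[78], counts[80])
```

The work grows with the number of scores times the group size times the range of possible sums, instead of with the number of subsets, so group sizes far beyond 12 remain cheap as long as the scores are integers (or can be scaled to integers).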