First, the wilcoxon test in scipy.stats does NOT use $W$ as the test statistic; it instead uses $T$, as defined in Siegel's popular book, Non-parametric statistics for the behavioral sciences. And yes, as @whuber correctly pointed out, once you know $T$ and the sample size, $W$ is also determined (@whuber, strictly speaking, not quite: one also needs to know how zero differences are handled).
One can only know how the test is implemented by reading the source code. For scipy, the Wilcoxon test can be found in your_python_package_folde/scipy/stats/morestats.py. Compared to R's wilcox.test, it is very simple. Go over the code, and you will see that it is equivalent to passing correct=FALSE, exact=FALSE, paired=TRUE to wilcox.test in R.
Python:
>>> from scipy import stats
>>> x1=[48, 7, 12, 11, 62, 93, 79, 53, 28, 49, 74, 59, 57, 62, 22, 8, 30, 11, 2, 47]
>>> x2=[20, 13, 41, 61, 93, 11, 28, 61, 26, 91, 95, 5, 80, 45, 88, 99, 50, 96, 69, 93]
>>> stats.wilcoxon(x1, x2) # T and p value, two-sided
(60.0, 0.092963126712486244)
In R:
> x1<-c(48, 7, 12, 11, 62, 93, 79, 53, 28, 49, 74, 59, 57, 62, 22, 8, 30, 11, 2, 47)
> x2<-c(20, 13, 41, 61, 93, 11, 28, 61, 26, 91, 95, 5, 80, 45, 88, 99, 50, 96, 69, 93)
> wilcox.test(x1,x2,correct=FALSE,exact=FALSE,paired=TRUE)
Wilcoxon signed rank test
data: x1 and x2
V = 60, p-value = 0.09296
alternative hypothesis: true location shift is not equal to 0
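As a cross-check, the agreement between scipy's $T$ and R's $V$ can be reproduced by hand. What follows is only a minimal sketch in plain Python, not the library's actual code: it assumes there are no zero differences and no tied magnitudes (true for this data), and all variable names are my own. Also note that newer SciPy releases may pick an exact method for small samples by default, so calling stats.wilcoxon today may not return exactly the p-value shown above.

```python
import math

x1 = [48, 7, 12, 11, 62, 93, 79, 53, 28, 49, 74, 59, 57, 62, 22, 8, 30, 11, 2, 47]
x2 = [20, 13, 41, 61, 93, 11, 28, 61, 26, 91, 95, 5, 80, 45, 88, 99, 50, 96, 69, 93]

# Paired differences; this sketch assumes none are zero and no two share a magnitude.
d = [a - b for a, b in zip(x1, x2)]
n = len(d)

# Rank the magnitudes |d_i| from 1 (smallest) to n (largest).
order = sorted(range(n), key=lambda i: abs(d[i]))
rank = [0] * n
for r, i in enumerate(order, start=1):
    rank[i] = r

w_plus = sum(r for r, di in zip(rank, d) if di > 0)   # sum of positive ranks (R's V)
w_minus = sum(r for r, di in zip(rank, d) if di < 0)  # sum of negative ranks
T = min(w_plus, w_minus)                              # the smaller rank sum (Siegel's T)

# With no zeros, the two rank sums always add up to n(n+1)/2,
# which is why T together with n pins down both of them.
total = n * (n + 1) // 2

# Normal approximation without continuity correction
# (the analogue of correct=FALSE, exact=FALSE in R).
mean_T = n * (n + 1) / 4
sd_T = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (T - mean_T) / sd_T
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z > |z|)

print(w_plus, w_minus, T, round(p_two_sided, 5))  # 60 150 60 0.09296
```

Here the positive rank sum happens to be the smaller of the two, so $T$ and $V$ coincide at 60; had the negative rank sum been smaller, scipy's $T$ and R's $V$ would differ even though the two-sided p-value would be the same.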
Best Answer
For clarity: the sample Walsh averages are the pairwise averages $(x_i+x_j)/2$, $i=1,2,\ldots,n$, $j=1,\ldots,i$. The median of the Walsh averages is the Hodges-Lehmann estimator, also called the pseudo-median.
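To make the definition concrete, here is a small sketch in plain Python. The function names walsh_averages and hodges_lehmann are my own (hypothetical helpers, not from any library), and the three data values are just a tiny illustrative sample.

```python
from statistics import median

def walsh_averages(x):
    """All pairwise averages (x[i] + x[j]) / 2 with j <= i, including i == j."""
    return [(x[i] + x[j]) / 2 for i in range(len(x)) for j in range(i + 1)]

def hodges_lehmann(x):
    """Median of the Walsh averages (the pseudo-median)."""
    return median(walsh_averages(x))

x = [1.0, -2.4, 3.6]
w = walsh_averages(x)
print(len(w))             # n(n+1)/2 averages, here 6
print(sorted(w))          # the Walsh averages in increasing order
print(hodges_lehmann(x))  # their median
```

Note that each observation is averaged with itself as well, so a sample of size $n$ gives $n(n+1)/2$ Walsh averages rather than $n(n-1)/2$.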
Here are some (hopefully useful) hints to get you started. This is basically one of several ways to arrange the calculation methodically, but it makes it easier to see the connections between the two calculations.
Here the observations are from a single sample (in the case of a paired test the observations are the pair-differences, at which point we're dealing with a single sample of pair-differences). Further assume that the $X$'s are continuous (so, for example, there are no tied ranks and no Walsh averages at 0).
Consider without loss of generality that we're comparing to a specified median of zero.
Let $X_i = S_i M_i$ where $M_i=|X_i|$ and $S_i = \mathop{\mathrm{sgn}}(X_i)$ (and similarly for $j$, when we deal with pairs of observations).
Then let $R_i = \mathop{\mathrm{rank}}(M_i)$. We now have some notation in place to describe the basic components of the signed rank test.
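In code, this notation might look like the sketch below. It is only an illustration under the continuity assumption above (no zeros, no tied magnitudes); the helper name sign_magnitude_rank is hypothetical, and the input is deliberately left unsorted to show that the ranks come from the magnitudes alone.

```python
def sign_magnitude_rank(x):
    """Return the lists (S, M, R); assumes no zeros and no tied magnitudes."""
    s = [1 if v > 0 else -1 for v in x]          # S_i = sgn(X_i)
    m = [abs(v) for v in x]                      # M_i = |X_i|
    order = sorted(range(len(x)), key=lambda i: m[i])
    r = [0] * len(x)
    for k, i in enumerate(order, start=1):       # R_i = rank of M_i, 1 = smallest
        r[i] = k
    return s, m, r

x = [3.6, 1.0, -2.4]
s, m, r = sign_magnitude_rank(x)
signed_ranks = [si * ri for si, ri in zip(s, r)]
print(signed_ranks)  # [3, 1, -2]
```

The signed rank statistic is then built from the products $S_i R_i$, for instance by summing only the positive ones.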
Now order the $X$-values from smallest magnitude to largest magnitude (i.e. sort them by the $M$-values).
(It helps to play with a small numerical example. Consider data values $1.0, -2.4, 3.6$, say; these have deliberately been ordered as just described.)
Write an $n \times n$ table with row and column headings being the ordered $X_i$ values (you may like to write a small $(S_i,R_i)$ under the columns).
Inside the table, for values on or above the main diagonal, put a "+" if the Walsh average of the corresponding row- and column- $X$-values is above 0 and "-" if it's below it. For each column, count how many "+" values there are.
Note that looking down a column we're just seeing $X_i$ compared with each other value that is no larger in magnitude than $X_i$ (i.e. with each observation to its left in the ordered list, plus itself).
For the column labelled $X_i$, the count will either be positive or $0$. What determines which of the two things it is? Now note the connection between the column "+" count and $R_i$.
Hopefully you should be able to work your way to an argument from that.
Here's the example I mentioned above. Note that if the $X$'s are 1.0, -2.4, 3.6 then the signed ranks are +1, -2 and +3:
              1.0    -2.4     3.6
   1.0         +       -       +
  -2.4                 -       +
   3.6                         +
  "+" count    1       0       3
It should be clear that if $S_i$ is $-1$ then there are no "+" terms in the column of positive Walsh-averages, but if $S_i=+1$ then there are $R_i$ positive Walsh averages. What you need to do is make this observation a bit more formal and then argue a small step to the needed result from that.
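If you want to convince yourself numerically before writing the argument, something like the following sketch may help. It is a sanity check rather than a proof, it assumes continuous data (no zeros or ties), and the function name column_plus_counts is my own.

```python
import random

def column_plus_counts(x):
    """Sort x by magnitude, then count the positive Walsh averages in each column.

    Column i pairs xs[i] with xs[0], ..., xs[i], i.e. the table entries
    on or above the main diagonal.
    """
    xs = sorted(x, key=abs)
    counts = [sum((xs[i] + xs[j]) / 2 > 0 for j in range(i + 1))
              for i in range(len(xs))]
    return xs, counts

xs, counts = column_plus_counts([1.0, -2.4, 3.6])
print(counts)  # [1, 0, 3]

# Check the claimed pattern on random continuous samples:
# the count in column i is R_i when S_i = +1 and 0 when S_i = -1.
random.seed(0)
for _ in range(200):
    data = [random.gauss(0, 1) for _ in range(8)]
    xs, counts = column_plus_counts(data)
    expected = [rank if v > 0 else 0 for rank, v in enumerate(xs, start=1)]
    assert counts == expected
```

Summing the column counts gives the total number of positive Walsh averages, and summing $R_i$ over the observations with $S_i=+1$ gives the sum of the positive signed ranks, so the check above is exactly the column-by-column version of the identity you are asked to prove.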