Wilcoxon Signed Rank – Why Zero Differences Don't Enter the Computation in the Wilcoxon Signed-Rank Test

hypothesis-testing, paired-comparisons, wilcoxon-signed-rank

The Wilcoxon signed-rank test tells us whether the median difference between paired data could be zero. The test is carried out by computing a statistic, converting it to a z-score, and comparing that to a critical value.

What I find shocking is that we discard all pairs with equal values before computing the statistic.

From Wikipedia, step 2 reads:

Exclude pairs with $|x_{2,i} - x_{1,i}| = 0$. Let $N_r$ be the reduced sample size.

And only $N_r$ is used in the rest of the computation.
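For concreteness, the Wikipedia steps can be sketched in Python (the function name and the use of `scipy.stats.rankdata` are my own choices, not part of the article):

```python
import numpy as np
from scipy.stats import rankdata, norm

def wilcoxon_signed_rank(x1, x2):
    """Signed-rank test following the Wikipedia steps: zeros are discarded."""
    d = np.asarray(x2, float) - np.asarray(x1, float)
    d = d[d != 0]                       # step 2: exclude zero differences
    n_r = len(d)                        # reduced sample size N_r
    ranks = rankdata(np.abs(d))         # rank |d|, averaging tied ranks
    w = np.sum(np.sign(d) * ranks)      # signed-rank statistic
    sigma = np.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6)
    z = w / sigma                       # normal approximation uses N_r only
    p = 2 * norm.sf(abs(z))
    return w, z, p
```

Note that the zeros vanish in the first two lines of the body, and only $N_r$ enters the variance, which is exactly the behaviour the question is about.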

One of the sources cited says:

In most applications of the Wilcoxon procedure, the cases in which
there is zero difference between $X_A$ and $X_B$ are at this point
eliminated from consideration, since they provide no useful
information, and the remaining absolute differences are then ranked
from lowest to highest, with tied ranks included where appropriate.

The author then proceeds to compute in the same manner as in the Wikipedia article.

I tried to look at Wilcoxon's original article, but he does not seem to mention pairs with equal values.

Here is why I think this is madness:

Granted, same-value pairs do not change the value of the statistic, but they do change the z-score. Imagine a sample of $10^{1000}$ pairs in which the second value is higher for $10$ pairs and the values are equal in all the rest. According to the articles above, we should discard those $10^{1000}-10$ pairs since they "provide no useful information" and consider only the remaining $10$ pairs. But those $10^{1000}-10$ pairs do provide useful information. They scream in favor of the null hypothesis.
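A small-scale version of this scenario ($9990$ ties instead of $10^{1000}-10$) can be checked with `scipy.stats.wilcoxon`, whose default `zero_method='wilcox'` implements exactly the discarding described above:

```python
import numpy as np
from scipy.stats import wilcoxon

# 10 pairs where the second value is higher; the remaining 9990 are identical
d = np.concatenate([np.full(10, 1.0), np.zeros(9990)])

# zero_method='wilcox' drops the 9990 zero differences before ranking...
res_all = wilcoxon(d, zero_method="wilcox", method="approx")

# ...so the result is identical to testing the 10 nonzero pairs alone
res_ten = wilcoxon(d[:10], zero_method="wilcox", method="approx")
```

Both calls return the same small p-value: the $9990$ tied pairs have no influence at all on the outcome.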

Please, could you explain how to do the test right?

Best Answer

It has to do with the assumptions of the test for which the distribution of the test statistic under the null is derived.

The variables are assumed to be continuous.

The probability of a tie is therefore 0, and this is what makes it possible to compute the permutation distribution of the test statistic under the null for a given sample size.

Without that assumption being true, you could still do a test, but to get the null distribution of the test statistic you would have to compute it conditional on the pattern of tied values (or, more easily, simulate it).
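That simulation might be sketched as follows (my own construction, not from the answer): condition on the observed absolute differences, keep the zeros fixed at zero, and randomize the signs of the nonzero differences to build the null distribution of the signed-rank statistic.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)

def simulated_p(d, n_sim=100_000):
    """Simulated p-value of the signed-rank statistic, conditional on the
    observed tie pattern: zeros stay zero, nonzero differences get random
    signs under the null of symmetry about zero."""
    d = np.asarray(d, float)
    nz = np.abs(d[d != 0])
    ranks = rankdata(nz)                    # ranks of nonzero |d|, ties averaged
    w_obs = np.sum(np.sign(d[d != 0]) * ranks)
    signs = rng.choice([-1.0, 1.0], size=(n_sim, len(nz)))
    w_null = signs @ ranks                  # simulated null statistics
    return np.mean(np.abs(w_null) >= abs(w_obs))
```

Notice that the zeros contribute nothing to the statistic under any sign assignment, so conditioning on the tie pattern reproduces the reduced-sample test, which is the answer's point.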

The easier alternative is to only consider untied values.

Note further that observing ties is not 'evidence in favor of the null'; it merely reflects a lack of evidence against it. With discrete distributions, a range of non-null alternatives are likely to produce ties, not just the null itself.

The 'correct' thing to do is to not use a test that assumes continuous distributions on data that don't satisfy that assumption. If your data fail it, you have to do something to deal with that failure.

I believe that conditioning on the untied data preserves the required properties for the significance level in a way that including ties in some way would not. We might check by simulation.
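Such a check might look like this (a sketch under an assumed discrete null, with differences uniform on $\{-1, 0, 1\}$ so the distribution is symmetric about zero; the function name and sample sizes are my own choices):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(42)

def rejection_rate(n_pairs=60, n_rep=2000, alpha=0.05):
    """Fraction of null datasets rejected when zero differences are
    discarded (scipy's zero_method='wilcox')."""
    rejections = 0
    for _ in range(n_rep):
        # Discrete null: symmetric about zero, with many exact ties
        d = rng.choice([-1.0, 0.0, 1.0], size=n_pairs)
        if np.all(d == 0):          # degenerate sample: nothing to test
            continue
        p = wilcoxon(d, zero_method="wilcox", method="approx").pvalue
        rejections += p < alpha
    return rejections / n_rep
```

If discarding ties preserves the significance level, the simulated rejection rate should stay close to the nominal `alpha` (up to the usual roughness of the normal approximation with heavily tied ranks).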
