Solved – Help me understand the z-value computed in wilcoxsign_test (R, coin package)

Tags: nonparametric, permutation-test, r, wilcoxon-signed-rank

I'm using the coin package in R to run comparisons on paired data using wilcoxsign_test. I use the exact version of the test, so I assume that the test goes through all permutations of the paired data and returns an accurate p-value.

What is unclear to me is how the z-value returned by default by the function is computed. I'm aware of the usual large sample approximation formulas used to convert Wilcoxon statistics to z scores, but I wonder how the z value is derived in the case of the exact permutation version of the test.

My questions are:

  1. Are all of the w-statistics produced by the permutation enumerated so that the distribution of w is approximately normally distributed and the observed w value is then converted to a z value based on the distribution yielded by the permutation?
  2. As I understand it, the p-value of the exact test is accurate and the z value is some sort of approximation that normally requires a sample size of > 10 (at least when using the standard formulas) and thus not always a very useful statistic. Does the permutation version of the test somehow get around this?
  3. The p-values returned do not always correspond to the p-values I obtain by converting the z values to p-values. Could someone clear this up for me?

Best Answer

  1. Are all of the w-statistics produced by the permutation enumerated so that the distribution of w is approximately normally distributed and the observed w value is then converted to a z value based on the distribution yielded by the permutation?

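I doubt it, since it isn't necessary: the mean and variance of the statistic under the null can be computed without knowing its exact distribution, even in the presence of ties. Under the null each rank is equally likely to carry a positive or a negative sign, so both moments follow directly from the ranks. To know with absolute certainty what the code does, though, you would have to look at it.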
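To make that concrete, here is a minimal base-R sketch. The data vector `d` is hypothetical, and the code does not call coin; whether wilcoxsign_test computes its z in exactly this way (for instance in how it treats zero differences) is an assumption that only the package source can confirm. It takes the sum of positive ranks as the statistic, computes its null mean and variance from the ranks alone, and then checks both against a brute-force enumeration of all sign assignments.

```r
# Hypothetical paired differences (illustrative only; |d| contains ties).
d <- c(1.5, -0.5, 2.0, 0.5, -1.0, 3.0, 2.0, -2.5)

r     <- rank(abs(d))        # midranks of |d|, so ties are handled
w_obs <- sum(r[d > 0])       # observed sum of positive ranks

# Closed-form null moments: each rank is included with probability 1/2,
# independently of the others.
mu   <- sum(r) / 2
sig2 <- sum(r^2) / 4

# Brute-force check: enumerate all 2^n sign assignments (0 = negative, 1 = positive).
signs <- as.matrix(expand.grid(rep(list(c(0, 1)), length(d))))
w_all <- as.vector(signs %*% r)

cat("mean: formula", mu,   " enumeration", mean(w_all), "\n")
cat("var:  formula", sig2, " enumeration", mean((w_all - mu)^2), "\n")

# Standardized statistic: needs only mu and sig2, not the full distribution.
z <- (w_obs - mu) / sqrt(sig2)
cat("z =", z, "\n")
```

The enumeration is there only as a check; the z value itself needs nothing beyond the two closed-form moments.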

  2. As I understand it, the p-value of the exact test is accurate and the z value is some sort of approximation that normally requires a sample size of > 10 (at least when using the standard formulas) and thus not always a very useful statistic.

No. The z value should be exact at any sample size, in the sense that it is a correct computation of the number of standard deviations the test statistic lies from its mean under the null. What is approximate is treating that quantity as standard normal: it doesn't have an exactly normal distribution under the null, so a p-value computed from that z would be approximate.
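The distinction can be seen in a small base-R sketch, again with a hypothetical data vector `d` and without calling coin. It standardizes every possible outcome of the sum of positive ranks, which gives the exact null distribution of z, and then compares the exact two-sided p-value with the one obtained by pretending z is standard normal.

```r
# Hypothetical paired differences (illustrative only).
d <- c(1.5, -0.5, 2.0, 0.5, -1.0, 3.0, 2.0, -2.5)

r     <- rank(abs(d))
mu    <- sum(r) / 2
sigma <- sqrt(sum(r^2) / 4)
z_obs <- (sum(r[d > 0]) - mu) / sigma

# Exact null distribution of z: standardize all 2^n equally likely outcomes.
signs <- as.matrix(expand.grid(rep(list(c(0, 1)), length(d))))
z_all <- (as.vector(signs %*% r) - mu) / sigma

# The standardization itself is exact: mean 0 and variance 1 under the null.
cat("mean of z:", mean(z_all), " variance of z:", mean(z_all^2), "\n")

# Two-sided p-values; the small tolerance guards against floating-point ties.
p_exact  <- mean(abs(z_all) >= abs(z_obs) - 1e-12)
p_normal <- 2 * pnorm(-abs(z_obs))

cat("z =", z_obs, " exact p =", p_exact, " normal-approximation p =", p_normal, "\n")
```

The exact p-value is always a multiple of 1/2^n because the null distribution is discrete, whereas the normal-based value comes from a continuous curve, so the two generally differ.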

  3. The p-values returned do not always correspond to the p-values I obtain by converting the z values to p-values.

Sure --

  • The test statistic isn't exactly normally distributed.

  • If the function samples from the permutation distribution rather than doing exact calculations, there's also some simulation error.
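The second source of discrepancy can be sketched in base R as follows. The data vector `d`, the number of resamples `B`, and the seed are all hypothetical choices, and this is not coin's own resampling code. It compares the exact p-value from full enumeration with a Monte Carlo estimate from randomly drawn sign assignments, and reports the estimate's approximate standard error.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch

# Hypothetical paired differences (illustrative only).
d <- c(1.5, -0.5, 2.0, 0.5, -1.0, 3.0, 2.0, -2.5)

r       <- rank(abs(d))
mu      <- sum(r) / 2
dev_obs <- abs(sum(r[d > 0]) - mu)   # observed distance from the null mean

# Exact two-sided p-value from all 2^n sign assignments.
signs   <- as.matrix(expand.grid(rep(list(c(0, 1)), length(d))))
p_exact <- mean(abs(as.vector(signs %*% r) - mu) >= dev_obs - 1e-12)

# Monte Carlo version: draw B random sign assignments instead of enumerating.
B    <- 10000
w_mc <- replicate(B, sum(r * rbinom(length(r), 1, 0.5)))
p_mc <- mean(abs(w_mc - mu) >= dev_obs - 1e-12)

# Approximate standard error of the Monte Carlo estimate.
se_mc <- sqrt(p_mc * (1 - p_mc) / B)

cat("exact p =", p_exact, " Monte Carlo p =", p_mc, " (s.e. about", se_mc, ")\n")
```

The Monte Carlo estimate lands near the exact value but not on it, and the gap shrinks roughly like 1/sqrt(B) as more resamples are drawn.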