Solved – R: coin::wilcoxsign_test() distribution = “exact”

r, wilcoxon-signed-rank

In a repeated-measures design, I have two measurements of one variable for N=5 subjects. I am interested in assessing whether the distribution of the differences between the two measurements is symmetric around zero. For this reason, I am performing a Wilcoxon signed-rank test with wilcox.test() in R.

However, due to a zero in my differences, I am warned in R that I cannot get an exact p-value. The warning I get is:

Warning message: In wilcox.test.default(x, y, paired = TRUE) : cannot compute exact p-value with zeroes

More specifically, my understanding is that a continuity correction will be applied to my p-value, since my T statistic will be compared against a normal approximation of the T distribution. Is this correct? I also understand that the same happens when there are ties in the difference values.

Moreover, because my sample is so small, I understand that approximating the T distribution with a normal one is not ideal, so I am switching to the wilcoxsign_test() function from the coin package. With the distribution = "exact" argument, it seems to me that wilcoxsign_test() will compare the T statistic computed on my data against the T distribution obtained by permuting all my data. Is this correct?
Moreover, ties (and zeroes?) will be handled according to the Pratt (1959) method (the default). Am I correct?

Finally, does the distribution = "asymptotic" argument correspond to performing a normal approximation?

Best Answer

There are several questions here. I will attempt to assemble some of the comments and my own thoughts into an answer. A caveat is that I'm not entirely familiar with the programming of these functions, so there may be misconceptions in the following answer.

Note that this question pertains to the paired or one-sample tests.

Warning: The warning message from wilcox.test doesn't imply that the test results are invalid. It means just what it says: the test can't compute an exact p-value when there are zero differences. Instead, it removes the zeros and computes the p-value by asymptotic approximation. The following two calls give the same result:

wilcox.test(c(0,1,2,3,4), exact=FALSE)

wilcox.test(c(1,2,3,4), exact=FALSE)
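One way to check this claim is to compare the two p-values directly rather than reading the printed output. This is a minimal sketch using only base R; the variable names are illustrative.

```r
# Same data with and without an explicit zero difference.
# exact = FALSE requests the normal approximation in both calls.
p_with_zero <- wilcox.test(c(0, 1, 2, 3, 4), exact = FALSE)$p.value
p_without_zero <- wilcox.test(c(1, 2, 3, 4), exact = FALSE)$p.value

# If the zero is simply dropped, the two p-values should agree.
same_p <- isTRUE(all.equal(p_with_zero, p_without_zero))

print(c(with_zero = p_with_zero, without_zero = p_without_zero))
print(same_p)
```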

Continuity correction: The continuity correction is applied only when the p-value is computed by asymptotic approximation. So the following give the same result:

wilcox.test(c(1,2,3,4), exact=TRUE, correct=FALSE)

wilcox.test(c(1,2,3,4), exact=TRUE, correct=TRUE)

but the following give different results:

wilcox.test(c(1,2,3,4), exact=FALSE, correct=FALSE)

wilcox.test(c(1,2,3,4), exact=FALSE, correct=TRUE)
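To see where the difference comes from, the normal approximation can be reproduced by hand. The sketch below assumes the textbook mean and variance of the signed-rank statistic with no ties and a 0.5 continuity correction toward the mean; it is meant as an illustration, not as a description of the internals of wilcox.test.

```r
d <- c(1, 2, 3, 4)               # the differences from the example above
n <- length(d)

# Signed-rank statistic: sum of the ranks of the positive differences.
V <- sum(rank(abs(d))[d > 0])

# Null mean and standard deviation of V (no ties, no zeros).
mu <- n * (n + 1) / 4
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)

# z score without and with the continuity correction.
z_plain <- (V - mu) / sigma
z_corr <- (V - mu - sign(V - mu) * 0.5) / sigma

# Two-sided p-values from the normal approximation.
p_plain <- 2 * pnorm(-abs(z_plain))
p_corr <- 2 * pnorm(-abs(z_corr))

print(c(uncorrected = p_plain, corrected = p_corr))
```

The correction shrinks the distance between V and its mean by half a unit, so the corrected p-value is the larger of the two.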

I'm pretty sure the wilcoxsign_test function never applies the continuity correction.

Handling of zero differences: As mentioned above, when there are zero differences, the wilcox.test function removes those zeros, and uses the asymptotic approximation for the p-value.

The wilcoxsign_test function can compute an "exact" p-value when there are zero differences. For example,

wilcoxsign_test(c(0,0,0,0,0)~c(0,1,2,3,4), distribution="exact")

The wilcoxsign_test function can handle zero differences in two ways, Pratt and Wilcoxon, so that the following give different results:

wilcoxsign_test(c(0,0,0,0,0)~c(0,1,2,3,4), distribution="approximate", zero.method="Pratt")

wilcoxsign_test(c(0,0,0,0,0)~c(0,1,2,3,4), distribution="approximate", zero.method="Wilcoxon")

But I don't think these methods differ if distribution="exact" is used. I'm not sure.
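The idea behind an exact p-value can be sketched in base R by enumerating every possible assignment of signs to the ranks. The helper below, exact_signed_rank_p, is hypothetical and is not how coin implements the test; it assumes that the Wilcoxon method drops zeros before ranking, while the Pratt method ranks all absolute differences and then drops the ranks belonging to zeros.

```r
# Hypothetical helper: two-sided exact p-value by full enumeration.
exact_signed_rank_p <- function(d, zero_method = c("Pratt", "Wilcoxon")) {
  zero_method <- match.arg(zero_method)
  if (zero_method == "Wilcoxon") {
    d <- d[d != 0]               # drop zeros, then rank
    r <- rank(abs(d))
  } else {
    r <- rank(abs(d))[d != 0]    # rank everything, then drop zero ranks
    d <- d[d != 0]
  }
  observed <- sum(r[d > 0])      # sum of positive ranks
  # Every rank can carry a positive (1) or negative (0) sign: 2^n cases.
  signs <- as.matrix(expand.grid(rep(list(c(0, 1)), length(r))))
  null_stats <- as.vector(signs %*% r)
  center <- sum(r) / 2           # null mean of the statistic
  mean(abs(null_stats - center) >= abs(observed - center))
}

# Differences implied by c(0,0,0,0,0) ~ c(0,1,2,3,4).
d <- c(0, 0, 0, 0, 0) - c(0, 1, 2, 3, 4)
p_pratt <- exact_signed_rank_p(d, "Pratt")
p_wilcoxon <- exact_signed_rank_p(d, "Wilcoxon")
print(c(Pratt = p_pratt, Wilcoxon = p_wilcoxon))
```

For this particular data set all non-zero differences have the same sign, so under this sketch both methods land on the most extreme of the 16 sign patterns and return the same p-value. That does not settle whether the two methods agree in general.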

With wilcoxsign_test, I think using distribution="asymptotic" and zero.method="Wilcoxon" will result in the same handling as would wilcox.test, so that the following two give the same result:

wilcoxsign_test(c(0,0,0,0,0)~c(0,1,2,3,4), distribution="asymptotic", zero.method="Wilcoxon")

wilcox.test(c(0,1,2,3,4), exact=FALSE, correct=FALSE)

Related Solutions

Solved – How to analyze these data

Sometimes a formal statistical test is overkill. Row by row, the entries in the first column are the largest. Draw a picture to make this apparent: side-by-side boxplots or dotplots would work nicely.

Although this is a post-hoc comparison, if the initial intent had been to compare the first column against the rest for a shift in distribution, the most extreme outcomes would be that either all maxima or all minima occur in the first column (a two-sided test). The probability of this happening, if all columns contained values drawn at random from a common distribution, would be $2 (\frac{1}{6})^7$ = about 0.0007%.

In fact, the first column contains the largest 7 of the 42 values. Again, ex post facto, the chance of such an extreme ordering occurring equals $\frac{2}{42 \choose 7}$ = about 0.000007%.
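The two probabilities can be reproduced with a couple of lines of R. The layout of 6 columns and 7 rows is an assumption inferred from the formulas and the total of 42 values; the variable names are illustrative.

```r
n_cols <- 6   # assumed number of columns
n_rows <- 7   # assumed number of rows

# All row maxima (or all row minima) fall in the first column.
p_all_extremes <- 2 * (1 / n_cols)^n_rows

# One column holds the 7 largest (or 7 smallest) of all 42 values.
p_top_seven <- 2 / choose(n_cols * n_rows, n_rows)

# Express both as percentages, as in the text.
print(100 * c(all_extremes = p_all_extremes, top_seven = p_top_seven))
```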

These results indicate that any reasonably powerful test you choose to conduct will conclude there's a highly significant difference.

In any event, you don't need a p-value; you need to characterize how large the difference is (the right way to do this depends on what the data mean) and you need to seek an explanation for the difference.

Solved – ks.test and ks.boot – exact p-values and ties

There are two points that are being confused. The first is about the words "exact" and "approximate" in a statistical context. "Exact" means that no simplifications are used while the calculations are carried out. An "approximate" p-value is not one that has been rounded to some precision; it is one whose calculation used some simplifications. Both "exact" and "approximate" calculations give precise numerical values; it is only our confidence in them that may differ. The second point is that it is just the formatting of the output that gives you non-precise values. You are actually requesting the same result in different ways.

ks.test(black, red, alternative="l")$p.value
ks.test(black, red, alternative="g")$p.value
ks.test(black, red)$p.value

all give you precise (not rounded) values because you are extracting the stored value directly. In the last case the p-value is so small that it is below machine precision, and thus is listed as 0. But when you just call the function, it gives you human-readable output. While this output is being prepared, the p-values are passed through the format.pval() function. First, check the consistency of ks.test(black, red) and ks.test(black, red, alternative="g"): the p-values are the same in the non-precise format. Now compare

ks.test(black, red, alternative="g")$p.value and

format.pval(ks.test(black, red, alternative="g")$p.value)

Now is it clear how that p-value < 2.2e-16 is produced?
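The same effect can be seen without the original data by formatting an arbitrary tiny number. In this sketch tiny_p is a made-up value standing in for the stored p-value, and digits = 4 is an assumption chosen for illustration.

```r
# A made-up p-value far below machine precision.
tiny_p <- 1e-20

# Printing the stored number keeps the actual value.
print(tiny_p)

# format.pval() replaces anything below its eps argument
# (by default .Machine$double.eps) with a "<" bound.
shown <- format.pval(tiny_p, digits = 4)
print(shown)

# The bound itself is the machine epsilon.
print(.Machine$double.eps)
```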

And finally, about ks.boot(). It uses bootstrapping. While ks.test() obtains the probability of the test statistic from the Kolmogorov distribution (which describes how the test statistic is distributed when the two samples really are drawn from the same distribution), ks.boot() obtains it from an empirical distribution derived under the null hypothesis. That is, the two samples under study are pooled, and two new samples are drawn from the pooled set at random with replacement. These new samples are certainly drawn from the same distribution, and their test statistic is recorded. Repeating this procedure many times yields the empirical distribution of the test statistic under the null hypothesis.

The number of repetitions is reported in the nboots element of the ks.boot() output. You used the default value of 1000, so you simulated 1000 test statistics under the null hypothesis. Your actual test statistic is greater than all 1000 of them, which means that the p-value is at most 0.001; that is the ks.boot.pvalue. Call ks.boot(red,black,nboots=10000) and you'll obtain ks.boot.pvalue=0.0001.

To obtain a reasonable p-value with ks.boot(), nboots should be of a larger order of magnitude than the reciprocal of the expected p-value (i.e. more than $10^{23}$). I recommend you not do this, since it will hang your computer or throw a memory exception. In any case, precise p-values of such a small order have no practical use. They are very sensitive to small changes in the data, so repeated experiments would give widely different p-values. It can be said that the smaller the p-value, the less confidence should be placed in its precise value.
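The resampling scheme described above can be sketched in base R. This is an illustration of the idea under stated assumptions, not the implementation in the Matching package: the two samples are simulated stand-ins for the original data, and ks_stat, nboots and p_boot are names chosen here.

```r
set.seed(1)
x <- rnorm(30)              # simulated stand-in for the first sample
y <- rnorm(30, mean = 2)    # simulated stand-in for the second sample

# KS statistic only; warnings about ties in resamples are suppressed.
ks_stat <- function(a, b) {
  unname(suppressWarnings(ks.test(a, b))$statistic)
}

observed <- ks_stat(x, y)
pooled <- c(x, y)           # under the null, both come from one pool
nboots <- 200               # kept small so the sketch runs quickly

# Draw two new samples with replacement and record their statistic.
boot_stats <- replicate(nboots, {
  a <- sample(pooled, length(x), replace = TRUE)
  b <- sample(pooled, length(y), replace = TRUE)
  ks_stat(a, b)
})

# Share of null statistics at least as large as the observed one.
p_boot <- mean(boot_stats >= observed)
print(p_boot)
```

Because p_boot is a proportion out of nboots draws, it can only move in steps of 1/nboots. A value of 0 therefore only says that the p-value is below roughly 1/nboots, which is why very small p-values cannot be resolved this way.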
