In R, why are results different for 1-tailed wilcoxsign_test, wilcox_test, and wilcox.exact with paired data?

Tags: nonparametric, r, wilcoxon-signed-rank

I am trying to determine why I get different results from these Wilcoxon test implementations in R version 2.15.2. I have paired data with some ties. I have read the responses to "What is the difference between wilcox.test and wilcox_test in R", but some of the p-values differ enormously in my data. Also, I do not understand why the coin package would have two functions (wilcoxsign_test and wilcox_test) to produce the same result. What am I missing?

Here is my example data:

library('exactRankTests') # for wilcox.exact
library('coin')

data <- data.frame(
    y = c(2770.00, 3160.00, 4120.00, 4510.00, 3320.00, 3170.00, 3340.00, 3810.00, 3760.00, 6350.00, 2720.00, 3740.00, 5210.00, 3330.00, 4230.00, 3490.00, 3138.07, 2630.88, 4058.70, 3521.11, 4941.09, 5762.31, 3565.89, 3517.91, 3413.32, 3415.98, 3439.96, 2602.11, 2659.36, 3099.79, 2820.00, 2830.00, 4310.00, 4010.00, 2780.00, 2730.00, 3130.00, 2700.00, 3510.00, 3460.00, 2470.00, 2920.00, 3230.00, 3370.00, 3290.00, 2380.00, 2845.69, 2137.58, 3477.96, 3128.84, 3117.77, 4949.78, 3061.60, 2942.57, 3149.46, 3067.10, 3164.60, 2135.01, 2275.32, 3154.66),
    condition = factor(c(rep('A', 30), rep('B', 30))),
    participant = factor(rep(1:30, times=2))
)

When trying to get exact results, I compared wilcoxsign_test, wilcox_test, and the deprecated wilcox.exact.

wilcoxsign_test(y ~ condition | participant, data=data, alternative='less', 
    distribution = "exact")
##  Exact Wilcoxon-Signed-Rank Test
## 
## data:  y by x (neg, pos) 
##   stratified by block 
## Z = -4.577, p-value = 3.818e-08
## alternative hypothesis: true mu is less than 0

wilcox_test(y ~ condition | participant, data=data, alternative='greater', 
    distribution = "exact")
##  Exact Wilcoxon Mann-Whitney Rank Sum Test
## 
## data:  y by
##   condition (A, B) 
##   stratified by participant 
## Z = 4.15, p-value = 0.05913
## alternative hypothesis: true mu is greater than 0

wilcox.exact(y ~ condition, data=data, alternative='greater', exact=TRUE, 
    paired = TRUE)
##  Exact Wilcoxon signed rank test
## 
## data:  y by condition 
## V = 455, p-value = 3.818e-08
## alternative hypothesis: true mu is greater than 0

Even the approximate methods sometimes differ, as seen below.

wilcox_test(y ~ condition | participant, data = data, alternative = "greater", 
    distribution = "approximate")
## 
##  Approximative Wilcoxon Mann-Whitney Rank Sum Test
## 
## data:  y by
##   condition (A, B) 
##   stratified by participant 
## Z = 4.15, p-value < 2.2e-16
## alternative hypothesis: true mu is greater than 0

wilcoxsign_test(y ~ condition | participant, data = data, alternative = "less", 
    distribution = "approximate")
##  Approximative Wilcoxon-Signed-Rank Test
## 
## data:  y by x (neg, pos) 
##   stratified by block 
## Z = -4.577, p-value < 2.2e-16
## alternative hypothesis: true mu is less than 0

wilcox.exact(y ~ condition, data = data, alternative = "greater", exact = FALSE, 
    paired = TRUE)
##  Asymptotic Wilcoxon signed rank test
## 
## data:  y by condition 
## V = 455, p-value = 2.362e-06
## alternative hypothesis: true mu is greater than 0

wilcox.test(y ~ condition, data = data, alternative = "greater", paired = TRUE)
## Warning: cannot compute exact p-value with ties
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  y by condition 
## V = 455, p-value = 2.481e-06
## alternative hypothesis: true location shift is greater than 0

pairwise.wilcox.test(data$y, data$condition, p.adj = "none", paired = TRUE)
## Warning: cannot compute exact p-value with ties
## 
##  Pairwise comparisons using Wilcoxon signed rank test 
## 
## data:  data$y and data$condition 
## 
##   A    
## B 5e-06
## 
## P value adjustment method: none

As a related question, how can I extract the observed Wilcoxon statistic for wilcoxsign_test? I know you can do the following for wilcox_test, but it does not seem to work for wilcoxsign_test.

w_t <- wilcox_test(y ~ condition | participant, data = data, alternative = "greater", 
distribution = "exact")
statistic(w_t, "linear")
##       
## A 1128

Best Answer

Your call for this test:

wilcox_test(y ~ condition | participant, data = data, alternative = "greater", 
   distribution = "approximate")

produces not a signed rank test (as the other calls do) but a rank sum test. It's a different thing for a different situation (independent samples, not paired). It tells you this in the output:

## Approximative Wilcoxon Mann-Whitney Rank Sum Test

So there's no way that should give a similar p-value to the other cases.
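To make the distinction concrete, here is a minimal base-R sketch on made-up paired data. The vectors `a` and `b` are hypothetical (not the data from the question) and were chosen to have no ties and no zero differences, so the exact tests run without warnings. It computes the signed rank statistic V and the rank sum statistic W by hand and compares them with what `wilcox.test` reports.

```r
# Hypothetical paired measurements (illustration only, not the question's data)
a <- c(12.1, 15.3, 9.8, 14.2, 11.7, 13.9, 10.4, 16.0)
b <- c(11.0, 14.1, 10.1, 12.5, 10.2, 13.0, 9.1, 14.4)

# Signed rank statistic: rank the absolute within-pair differences,
# then sum the ranks that belong to positive differences.
d <- a - b
V_manual <- sum(rank(abs(d))[d > 0])

# Rank sum statistic: rank all observations together, sum the ranks of
# the first sample, and subtract the smallest possible rank sum.
n_a <- length(a)
W_manual <- sum(rank(c(a, b))[seq_len(n_a)]) - n_a * (n_a + 1) / 2

# The same two tests via stats::wilcox.test
signed_rank <- wilcox.test(a, b, paired = TRUE,  alternative = "greater")
rank_sum    <- wilcox.test(a, b, paired = FALSE, alternative = "greater")

print(c(V_manual = V_manual, V_reported = unname(signed_rank$statistic)))
print(c(W_manual = W_manual, W_reported = unname(rank_sum$statistic)))
print(c(p_signed_rank = signed_rank$p.value, p_rank_sum = rank_sum$p.value))
```

The two tests use the pairing information differently, so their statistics and p-values need not be close. The hand computation of V is also one way to obtain the observed signed rank statistic without relying on a package accessor.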

From your comment:

which indicates that it should handle pairwise comparisons as well.

It's possible that's a correct conclusion, but - whatever is needed to make it do that - you clearly didn't achieve it, as the output made clear: you didn't get a signed rank test.

The two exact signed rank tests did produce the same p-value, as one would hope.

With extremely small p-values you should not expect approximate methods to be highly accurate (close to the exact test values). They do all lead one to reject at any sensible significance level, which is about as much as you can ask as far as consistency goes.
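As a rough illustration of how far apart the two can be in the tail, the sketch below compares the exact signed rank tail probability with the plain normal approximation for n = 30 pairs and V = 455. It assumes no ties and no zero differences, which is not quite the situation in the question, so the numbers will not match the outputs above exactly. Note also that in coin, `distribution = "approximate"` refers to Monte Carlo resampling rather than to a normal approximation, so a reported `p-value < 2.2e-16` there most likely just means that no resample was as extreme as the observed statistic.

```r
# Assumed setting: 30 pairs, no ties, no zero differences
n <- 30
V <- 455

# Exact upper tail: P(V >= 455) under the null hypothesis
p_exact <- psignrank(V - 1, n, lower.tail = FALSE)

# Normal approximation without continuity correction
mu    <- n * (n + 1) / 4
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)
z     <- (V - mu) / sigma
p_normal <- pnorm(z, lower.tail = FALSE)

print(c(exact = p_exact, normal = p_normal, ratio = p_normal / p_exact))
```

Both values are tiny in absolute terms, but their ratio is large, which is the sense in which the approximation is not accurate this far out in the tail.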

As for differences between them, you'd have to look to exactly what has been implemented for each test - what approximations are made, what assumptions, and so on.

The last p-value (the pairwise comparison) doesn't seem to be a one-tailed test so it's hardly surprising it's about twice as large as the two above it.
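The doubling can be checked with a small base-R sketch. The data below are hypothetical (no ties, no zero differences, so the exact test is used), and the sketch assumes the default two-sided alternative of `pairwise.wilcox.test`, which passes extra arguments such as `alternative` on to `wilcox.test`.

```r
# Hypothetical paired measurements (illustration only)
a <- c(12.1, 15.3, 9.8, 14.2, 11.7, 13.9, 10.4, 16.0)
b <- c(11.0, 14.1, 10.1, 12.5, 10.2, 13.0, 9.1, 14.4)

# One-sided and two-sided signed rank tests on the same pairs
p_one <- wilcox.test(a, b, paired = TRUE, alternative = "greater")$p.value
p_two <- wilcox.test(a, b, paired = TRUE)$p.value

# pairwise.wilcox.test with its default (two-sided) alternative
vals <- c(a, b)
grp  <- factor(rep(c("A", "B"), each = length(a)))
pw   <- pairwise.wilcox.test(vals, grp, p.adjust.method = "none", paired = TRUE)
p_pairwise <- pw$p.value[1, 1]

print(c(one_sided = p_one, two_sided = p_two, pairwise = p_pairwise))
```

Here the two-sided p-value is twice the one-sided one in the direction of the observed effect, and the pairwise result agrees with the two-sided test.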