Post-hoc tests after Kruskal-Wallis: Dunn's test or Bonferroni-corrected Mann-Whitney tests


I have a variable that is not normally distributed, and I need to check whether its values differ significantly across 5 groups.

I performed a Kruskal-Wallis one-way analysis of variance, which came up significant, and then had to check which groups differed significantly. The groups are ordered: values in the first group are expected to be lower than those in the second, which are expected to be lower than those in the third, and so on. I therefore performed only 4 tests:

Group 1 vs Group 2
Group 2 vs Group 3
Group 3 vs Group 4
Group 4 vs Group 5

I ran this analysis with two different methods. I started with Dunn's multiple comparison test, but nothing came up significant. On the other hand, when I use Mann-Whitney tests with a Bonferroni correction for the number of tests (4), 3 tests come up significant.

What does it mean? Which results should I trust?

Best Answer

You should use a proper post hoc pairwise test like Dunn's test.*

If one proceeds by moving from a rejection of Kruskal-Wallis to performing ordinary pair-wise rank sum tests (with or without multiple comparison adjustments), one runs into two problems:

  1. the ranks that the pair-wise rank sum tests use are not the ranks used by the Kruskal-Wallis test (i.e. you are, in effect, pretending to perform post hoc tests, but are actually using different data than was used in the Kruskal-Wallis test to do so); and

  2. the pair-wise rank sum tests do not use the pooled variance implied by the Kruskal-Wallis null hypothesis, whereas Dunn's test preserves it.
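
Both points can be seen in a small sketch. The code below is an illustration only, not the implementation used by the dunn.test package: the function names are hypothetical, the data are invented, and it assumes the usual form of Dunn's statistic, $z_{ij} = (\bar{R}_i - \bar{R}_j)/\sigma_{ij}$ with $\sigma_{ij}^2 = \left[\frac{N(N+1)}{12} - \frac{\sum_s (t_s^3 - t_s)}{12(N-1)}\right]\left(\frac{1}{n_i} + \frac{1}{n_j}\right)$, where the mean ranks $\bar{R}$ come from ranking all $N$ observations together and $t_s$ are the sizes of the tied runs.

```python
import math
from itertools import groupby


def pooled_ranks(groups):
    """Rank all observations together, using midranks for ties.

    Returns (ranks per group, list of tie-run sizes).
    """
    flat = sorted((value, g) for g, values in enumerate(groups) for value in values)
    ranks = [[] for _ in groups]
    tie_sizes = []
    pos = 0  # number of observations already ranked
    for _, run in groupby(flat, key=lambda pair: pair[0]):
        run = list(run)
        t = len(run)
        midrank = pos + (t + 1) / 2  # average of ranks pos+1 .. pos+t
        for _, g in run:
            ranks[g].append(midrank)
        tie_sizes.append(t)
        pos += t
    return ranks, tie_sizes


def dunn_z(groups, i, j):
    """Dunn's z for groups i and j, using ranks and variance from ALL groups."""
    ranks, ties = pooled_ranks(groups)
    n = sum(len(g) for g in groups)
    tie_term = sum(t**3 - t for t in ties) / (12 * (n - 1))
    pooled_var = n * (n + 1) / 12 - tie_term
    mean_i = sum(ranks[i]) / len(ranks[i])
    mean_j = sum(ranks[j]) / len(ranks[j])
    se = math.sqrt(pooled_var * (1 / len(groups[i]) + 1 / len(groups[j])))
    return (mean_i - mean_j) / se


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Invented data: three groups of three observations each.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Ranks the Kruskal-Wallis test (and Dunn's test) would use for group 2 ...
print(pooled_ranks(data)[0][1])
# ... versus the ranks a stand-alone rank sum test of group 2 vs group 3 would use.
print(pooled_ranks([data[1], data[2]])[0][0])

z = dunn_z(data, 0, 1)
print(z, two_sided_p(z))  # unadjusted p-value; adjust it for multiple comparisons
```

The two printed rank lists differ even though the observations are the same, which is problem 1; `pooled_var` depends on the total sample size and ties across all groups, which is the quantity a pair-wise rank sum test ignores (problem 2).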

Of course, as with any omnibus test (e.g., ANOVA, Cochran's $Q$, etc.), rejecting the Kruskal-Wallis null hypothesis does not guarantee a significant pairwise difference: once the post hoc tests have been adjusted for multiple comparisons, they may reject none of the pairwise null hypotheses at the family-wise error rate or false discovery rate corresponding to the $\alpha$ used for the omnibus test.
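
To make the effect of the adjustment concrete, here is a minimal sketch of two family-wise error rate adjustments, Bonferroni and Holm. The function names are hypothetical and the p-values are made up for illustration; they are not results from the question.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, k in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[k])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[k] = running_max
    return adjusted


# Made-up unadjusted p-values for 4 pairwise comparisons.
raw = [0.02, 0.03, 0.04, 0.20]
alpha = 0.05

for name, adjust in (("Bonferroni", bonferroni), ("Holm", holm)):
    adj = adjust(raw)
    rejected = [p <= alpha for p in adj]
    print(name, adj, rejected)
```

With these made-up values, three comparisons fall below 0.05 before adjustment, but none does after either adjustment, so no pairwise null hypothesis is rejected.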


* Dunn's test is implemented in Stata in the dunntest package (within Stata type net describe dunntest, from(https://alexisdinno.com/stata)), and in R in the dunn.test package.

Caveat: there are a few less well-known post hoc pair-wise tests to follow a rejected Kruskal-Wallis. One is the Conover-Iman test (like Dunn's test, but based on the t distribution rather than the z distribution, and strictly more powerful as a post hoc test), which is implemented for Stata in the conovertest package (within Stata type net describe conovertest, from(https://alexisdinno.com/stata)) and for R in the conover.test package. Another is the family of Dwass-Steel-Critchlow-Fligner tests.