Solved – Kruskal-Wallis and post-hoc analysis in R


Although I know there are several posts on this forum about this topic, none of them was useful in my case.

I have the following data:

         V1  V2
 1.62790698  1
 1.62790698  1
 7.95006570  1
 8.60709593  1
 7.82945736  2
14.18604651  2
 4.65116279  2
 3.87596899  2
 3.90930414  2
 0.39093041  2
 6.18421053  2
 2.82894737  2
15.55929352  2
 6.98065601  2
 0.07751938  3
 4.03100775  3
 4.65116279  3
 7.82945736  3
 9.18686474  3
 8.36591087  3
12.74433151  3
 1.60281470  3
 5.78947368  3
13.81578947  3
 1.57894737  3
 8.48684211  3
 6.98065601  3
 5.88730025  3
12.86795627  3
16.31623213  3

The column on the left represents the measured variable and the column on the right represents the groups. So, there are 3 different groups.

After loading these data into R Commander, I ran Shapiro-Wilk tests and a Bartlett test. Because not all the assumptions required for an ANOVA were met, I decided to perform a Kruskal-Wallis test instead.
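For reference, the same checks can be run at the R console rather than through R Commander. This is only a minimal sketch: it assumes the data are stored in a data frame named Datos (the name used in the calls in this post) with V2 converted to a factor, and shapiro_p and bart are illustrative names.

```r
# Data as listed in the question: 30 measurements (V1) in 3 groups (V2).
Datos <- data.frame(
  V1 = c(1.62790698, 1.62790698, 7.95006570, 8.60709593,
         7.82945736, 14.18604651, 4.65116279, 3.87596899, 3.90930414,
         0.39093041, 6.18421053, 2.82894737, 15.55929352, 6.98065601,
         0.07751938, 4.03100775, 4.65116279, 7.82945736, 9.18686474,
         8.36591087, 12.74433151, 1.60281470, 5.78947368, 13.81578947,
         1.57894737, 8.48684211, 6.98065601, 5.88730025, 12.86795627,
         16.31623213),
  V2 = factor(rep(1:3, times = c(4, 10, 16)))  # group sizes 4, 10, 16
)

# Shapiro-Wilk normality test within each group; keep only the p-values.
shapiro_p <- tapply(Datos$V1, Datos$V2, function(x) shapiro.test(x)$p.value)
print(shapiro_p)

# Bartlett test of equal variances across the three groups.
bart <- bartlett.test(V1 ~ V2, data = Datos)
print(bart)
```

With only 4 observations in group 1, the per-group normality test has very little power, so its p-value should be read with caution.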

> kruskal.test(V1 ~ V2, data=Datos)

    Kruskal-Wallis rank sum test

data:  V1 by V2
Kruskal-Wallis chi-squared = 6.5558, df = 2, p-value = 0.03771

As you can see, there are statistically significant differences among the groups (p < 0.05).

Next, I wanted to run a post-hoc analysis to find out which of my three groups differ from one another. To do this, I installed and loaded the PMCMR package and ran the following code:

posthoc.kruskal.nemenyi.test(x=V1, g=V2, method="Tukey")

With the following results:

> posthoc.kruskal.nemenyi.test(x=V1, g=V2, method="Tukey")

    Pairwise comparisons using Tukey and Kramer (Nemenyi) test  
                   with Tukey-Dist approximation for independent samples 

data:  V1 and V2 

  1     2    
2 0.211 -    
3 1.000 0.098

P value adjustment method: none

However, a warning also appeared:

[50] NOTE: Warning in posthoc.kruskal.nemenyi.test(x = V1, g = V2, method = "Tukey") :
Ties are present, p-values are not corrected.

When I instead run:

posthoc.kruskal.nemenyi.test(x=V1, g=V2, method="Chisq")

I get the following results:

> posthoc.kruskal.nemenyi.test(x=V1, g=V2, method="Chisq")

    Pairwise comparisons using Nemenyi-test with Chi-squared    
                       approximation for independent samples 

data:  V1 and V2 

  1    2   
2 0.24 -   
3 1.00 0.12

P value adjustment method: none

This one also gives a warning:

[51] NOTE: Warning in posthoc.kruskal.nemenyi.test(x = V1, g = V2, method = "Chisq") :
Ties are present. Chi-sq was corrected for ties.
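
Both warnings refer to repeated values in V1, which receive tied (averaged) ranks. A minimal sketch for listing them, assuming the measurements exactly as posted above; tied_values and n_tied are illustrative names:

```r
# Measurements as listed in the question (30 values).
V1 <- c(1.62790698, 1.62790698, 7.95006570, 8.60709593,
        7.82945736, 14.18604651, 4.65116279, 3.87596899, 3.90930414,
        0.39093041, 6.18421053, 2.82894737, 15.55929352, 6.98065601,
        0.07751938, 4.03100775, 4.65116279, 7.82945736, 9.18686474,
        8.36591087, 12.74433151, 1.60281470, 5.78947368, 13.81578947,
        1.57894737, 8.48684211, 6.98065601, 5.88730025, 12.86795627,
        16.31623213)

# Values that occur more than once; each of them produces tied ranks.
tied_values <- sort(unique(V1[duplicated(V1)]))
print(tied_values)

# Number of observations that share their value with another observation.
n_tied <- sum(V1 %in% tied_values)
print(n_tied)
```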

So, my questions are:

  1. If the Kruskal-Wallis p-value is below 0.05, I would expect at least one pairwise comparison to be statistically significant, but that is not the case. Why?
  2. Is the way I proceeded correct?
  3. Is there any other possibility or code (implemented in different libraries) to get to what I wanted?

Best Answer

  1. Heteroskedasticity seems to be the thing you're most worried about. Why go to Kruskal-Wallis rather than just a Welch adjustment? As it happens, your standard deviations are almost constant (3.9, 4.8, 4.7). Why would that very modest change in spread be of concern?

  2. A rejection of the omnibus null doesn't necessarily imply that any of the individual comparisons will be significant.

  3. Formal hypothesis tests of assumptions aren't necessarily useful. We don't necessarily believe any of the assumptions are exactly true; what matters is their impact on your inference, which a p-value in a hypothesis test really doesn't tell you. (You might easily reject the null of constant variance, but if the standard deviations by group don't change by a substantial amount (possibly by a good deal more than you can detect by a test, depending on sample size), it may hardly matter. On the other hand, failure to reject in small samples should be no consolation at all.)
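
The group standard deviations in point 1, and the Welch adjustment it mentions, can be checked in base R. A minimal sketch, assuming the data as posted in the question and a data frame named Datos; oneway.test with var.equal = FALSE is R's Welch-type one-way test, and group_sd and welch are illustrative names:

```r
# Data as listed in the question: 30 measurements (V1) in 3 groups (V2).
Datos <- data.frame(
  V1 = c(1.62790698, 1.62790698, 7.95006570, 8.60709593,
         7.82945736, 14.18604651, 4.65116279, 3.87596899, 3.90930414,
         0.39093041, 6.18421053, 2.82894737, 15.55929352, 6.98065601,
         0.07751938, 4.03100775, 4.65116279, 7.82945736, 9.18686474,
         8.36591087, 12.74433151, 1.60281470, 5.78947368, 13.81578947,
         1.57894737, 8.48684211, 6.98065601, 5.88730025, 12.86795627,
         16.31623213),
  V2 = factor(rep(1:3, times = c(4, 10, 16)))  # group sizes 4, 10, 16
)

# Standard deviation of V1 within each group.
group_sd <- tapply(Datos$V1, Datos$V2, sd)
print(round(group_sd, 2))

# Welch's one-way test: compares group means without assuming equal variances.
welch <- oneway.test(V1 ~ V2, data = Datos, var.equal = FALSE)
print(welch)
```

Looking at group_sd directly, rather than only at the Bartlett p-value, follows the suggestion in point 3 to judge how large the difference in spread actually is.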
