Think of it this way - overall, there's a significant difference, but it's a little hard to say exactly which two are significantly different. Alternatively, consider the chances of having three p-values less than 0.1 (even though they aren't independent of each other) - pretty small, right? So, again overall, we might suspect something significant is in the data, without being able to tell exactly where.
Your small sample sizes don't help; they mean the powers of your tests are very low, and also severely constrain what sort of p-values you can get, as the following example shows:
> g1a <- rnorm(3,0,1)
> g2a <- rnorm(3,2.5,1)
> g3a <- rnorm(3,5,1)
>
> y <- list(g1a,g2a,g3a)
> y
[[1]]
[1] -2.31356435 -0.09903136 -0.42037052
[[2]]
[1] 2.806082 2.799857 3.383844
[[3]]
[1] 6.543636 6.845559 4.838341
> kruskal.test(y)
Kruskal-Wallis rank sum test
data: y
Kruskal-Wallis chi-squared = 7.2, df = 2, p-value = 0.02732
So far, so good. On to the three Wilcoxon tests:
> wilcox.test(g1a,g2a,paired=FALSE,exact=TRUE)
Wilcoxon rank sum test
data: g1a and g2a
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(g2a,g3a,paired=FALSE,exact=TRUE)
Wilcoxon rank sum test
data: g2a and g3a
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(g1a,g3a,paired=FALSE,exact=TRUE)
Wilcoxon rank sum test
data: g1a and g3a
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
All three p-values are 0.1, and we can't get more extreme - W = 0 is already the smallest possible statistic - so evidently we've hit a sample-size-imposed floor on the p-value. With three observations per group there are only choose(6,3) = 20 equally likely rank assignments under the null, so the smallest achievable two-sided exact p-value is 2/20 = 0.1.
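That floor can be checked directly in base R (no assumptions beyond three observations per group):

```r
# With n = m = 3 there are choose(6, 3) = 20 equally likely rank
# assignments under the null, so the most extreme two-sided exact
# p-value is 2/20 = 0.1 -- exactly what all three tests returned.
min_p <- 2 / choose(6, 3)
min_p                 # 0.1
2 * pwilcox(0, 3, 3)  # 0.1, via the null distribution of W
```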
Answer edited to implement encouraging and constructive comment by @Ferdi
I would like to:
- provide an answer with a fully self-contained script
- mention that one can also test more general custom contrasts using the /TEST subcommand
- argue that this is necessary in some cases (i.e. the EMMEANS COMPARE combination is not enough)
I assume a dataset with columns depV, Group, F1, and F2. I implement a 2x2x2 mixed-design ANOVA where depV is the dependent variable, F1 and F2 are within-subject factors, and Group is a between-subject factor. I further assume the F test has revealed that the Group*F2 interaction is significant. I therefore need to use post hoc t-tests to understand what drives the interaction.
MIXED depV BY Group F1 F2
/FIXED=Group F1 F2 Group*F1 Group*F2 F1*F2 Group*F1*F2 | SSTYPE(3)
/METHOD=REML
/RANDOM=INTERCEPT | SUBJECT(Subject) COVTYPE(VC)
/EMMEANS=TABLES(Group*F2) COMPARE(Group) ADJ(Bonferroni)
/TEST(0) = 'depV(F2=1)-depV(F2=0) differs between groups'
Group*F2 1/4 -1/4 -1/4 1/4
Group*F1*F2 1/8 -1/8 1/8 -1/8 -1/8 1/8 -1/8 1/8
/TEST(0) = 'depV(Group1, F2=1)-depV(Group2, F2=1)' Group 1 -1
Group*F1 1/2 1/2 -1/2 -1/2
Group*F2 1 0 -1 0
Group*F1*F2 1/2 0 1/2 0 -1/2 0 -1/2 0.
In particular, the second t-test corresponds to the one performed by the EMMEANS subcommand. The EMMEANS comparison could reveal, for example, that depV was larger in Group 1 in the F2=1 condition.
However, the interaction could also be driven by something else, which the first test checks: whether the difference depV(F2=1)-depV(F2=0) differs between groups. This is a contrast you cannot test with the EMMEANS subcommand (at least I did not find an easy way).
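To make the estimand behind that first contrast concrete, here is a toy sketch in R. The column names and data are made up for illustration; this computes the difference-of-differences itself, not the mixed-model t-test that SPSS performs:

```r
# Toy illustration of the estimand behind the first /TEST contrast.
# Column names (depV, Group, F1, F2) follow the design described above;
# the data are hypothetical.
set.seed(42)
d <- expand.grid(Group = factor(1:2), F1 = factor(0:1), F2 = factor(0:1))
d <- d[rep(seq_len(nrow(d)), each = 5), ]
d$depV <- rnorm(nrow(d))

# Group x F2 cell means, averaging over F1 -- that averaging is what the
# 1/4 and 1/8 weights in the L matrix accomplish.
cm <- tapply(d$depV, list(d$Group, d$F2), mean)

# Difference-of-differences:
# does depV(F2=1) - depV(F2=0) differ between groups?
dod <- (cm["1", "1"] - cm["1", "0"]) - (cm["2", "1"] - cm["2", "0"])
```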
Now, in models with many factors it is a bit tricky to write down the /TEST line, i.e. the sequence of 1/2, 1/4, etc., known as the L matrix. Typically, if you get the error message "the L matrix is not estimable", you are forgetting some elements. One link that explains the recipe is this one: https://stats.idre.ucla.edu/spss/faq/how-can-i-test-contrasts-and-interaction-contrasts-in-a-mixed-model/
Best Answer
"Adj. sig" looks like "q values," where instead of adjusting the rejection criterion $\alpha$ by dividing it by the number of comparisons, they multiple the p value by the number of comparisons. (You get incoherent gibberish when doing so, because you end up with "probabilities" greater than 1, but since you would be very far from rejecting with the unadjusted p values anyway, this is tolerated in practice.)
To recap: compare unadjusted p-values to $\frac{\alpha}{10}$. This gives the same rejection decisions as comparing adjusted p-values to $\alpha$.
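The equivalence of the two decision rules is easy to verify in R with `p.adjust()` (here with four illustrative p-values rather than the ten comparisons above; the values are made up):

```r
# Bonferroni multiplies each p-value by the number of comparisons
# (capped at 1), so "probabilities" above 1 are never actually reported.
alpha <- 0.05
p <- c(0.004, 0.012, 0.30, 0.85)  # illustrative unadjusted p-values
p_adj <- p.adjust(p, method = "bonferroni")
identical(p < alpha / length(p), p_adj < alpha)  # TRUE
```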