Think of it this way - overall, there's a significant difference, but it's a little hard to say exactly which two are significantly different. Alternatively, consider the chances of having three p-values less than 0.1 (even though they aren't independent of each other) - pretty small, right? So, again overall, we might suspect something significant is in the data, without being able to tell exactly where.
Your small sample sizes don't help; they mean the powers of your tests are very low, and also severely constrain what sort of p-values you can get, as the following example shows:
> g1a <- rnorm(3,0,1)
> g2a <- rnorm(3,2.5,1)
> g3a <- rnorm(3,5,1)
>
> y <- list(g1a,g2a,g3a)
> y
[[1]]
[1] -2.31356435 -0.09903136 -0.42037052
[[2]]
[1] 2.806082 2.799857 3.383844
[[3]]
[1] 6.543636 6.845559 4.838341
> kruskal.test(y)
Kruskal-Wallis rank sum test
data: y
Kruskal-Wallis chi-squared = 7.2, df = 2, p-value = 0.02732
So far, so good. On to the three Wilcoxon tests:
> wilcox.test(g1a,g2a,paired=FALSE,exact=TRUE)
Wilcoxon rank sum test
data: g1a and g2a
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(g2a,g3a,paired=FALSE,exact=TRUE)
Wilcoxon rank sum test
data: g2a and g3a
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(g1a,g3a,paired=FALSE,exact=TRUE)
Wilcoxon rank sum test
data: g1a and g3a
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
All three p-values are 0.1, and we can't get more extreme - W = 0 is already the smallest possible statistic - so evidently we've hit a sample-size-imposed floor on the p-value. With three observations per group there are only choose(6,3) = 20 equally likely rank assignments under the null, so the smallest achievable two-sided exact p-value is 2/20 = 0.1.
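That floor can be checked directly in base R (no assumptions beyond three observations per group):

```r
# With n = m = 3 there are choose(6, 3) = 20 equally likely rank
# assignments under the null, so the most extreme two-sided exact
# p-value is 2/20 = 0.1 -- exactly what all three tests returned.
min_p <- 2 / choose(6, 3)
min_p                 # 0.1
2 * pwilcox(0, 3, 3)  # 0.1, via the null distribution of W
```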
Answer edited to implement encouraging and constructive comment by @Ferdi
I would like to:
- provide an answer with a fully self-contained script
- mention that one can also test more general custom contrasts using the /TEST subcommand
- argue that this is necessary in some cases (i.e. the EMMEANS COMPARE combination is not enough)
I assume a dataset with columns depV, Group, F1, and F2. I implement a 2x2x2 mixed-design ANOVA where depV is the dependent variable, F1 and F2 are within-subject factors, and Group is a between-subject factor. I further assume the F test has revealed that the Group*F2 interaction is significant. I therefore need to use post hoc t-tests to understand what drives the interaction.
MIXED depV BY Group F1 F2
/FIXED=Group F1 F2 Group*F1 Group*F2 F1*F2 Group*F1*F2 | SSTYPE(3)
/METHOD=REML
/RANDOM=INTERCEPT | SUBJECT(Subject) COVTYPE(VC)
/EMMEANS=TABLES(Group*F2) COMPARE(Group) ADJ(Bonferroni)
/TEST(0) = 'depV(F2=1)-depV(F2=0) differs between groups'
Group*F2 1/4 -1/4 -1/4 1/4
Group*F1*F2 1/8 -1/8 1/8 -1/8 -1/8 1/8 -1/8 1/8
/TEST(0) = 'depV(Group1, F2=1)-depV(Group2, F2=1)' Group 1 -1
Group*F1 1/2 1/2 -1/2 -1/2
Group*F2 1 0 -1 0
Group*F1*F2 1/2 0 1/2 0 -1/2 0 -1/2 0.
In particular, the second t-test corresponds to the one performed by the EMMEANS subcommand. The EMMEANS comparison could reveal, for example, that depV was larger in Group 1 in the F2=1 condition.
However, the interaction could also be driven by something else, which the first test checks: whether the difference depV(F2=1)-depV(F2=0) differs between groups. This is a contrast you cannot test with the EMMEANS subcommand (at least I did not find an easy way).
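To make the estimand behind that first contrast concrete, here is a toy sketch in R. The column names and data are made up for illustration; this computes the difference-of-differences itself, not the mixed-model t-test that SPSS performs:

```r
# Toy illustration of the estimand behind the first /TEST contrast.
# Column names (depV, Group, F1, F2) follow the design described above;
# the data are hypothetical.
set.seed(42)
d <- expand.grid(Group = factor(1:2), F1 = factor(0:1), F2 = factor(0:1))
d <- d[rep(seq_len(nrow(d)), each = 5), ]
d$depV <- rnorm(nrow(d))

# Group x F2 cell means, averaging over F1 -- that averaging is what the
# 1/4 and 1/8 weights in the L matrix accomplish.
cm <- tapply(d$depV, list(d$Group, d$F2), mean)

# Difference-of-differences:
# does depV(F2=1) - depV(F2=0) differ between groups?
dod <- (cm["1", "1"] - cm["1", "0"]) - (cm["2", "1"] - cm["2", "0"])
```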
Now, in models with many factors it is a bit tricky to write down the /TEST line, i.e. the sequence of 1/2, 1/4, etc., known as the L matrix. Typically, if you get the error message "the L matrix is not estimable", you are forgetting some elements. One link that explains the recipe is this one: https://stats.idre.ucla.edu/spss/faq/how-can-i-test-contrasts-and-interaction-contrasts-in-a-mixed-model/
Best Answer
"Adj. sig" looks like "q values," where instead of adjusting the rejection criterion $\alpha$ by dividing it by the number of comparisons, they multiple the p value by the number of comparisons. (You get incoherent gibberish when doing so, because you end up with "probabilities" greater than 1, but since you would be very far from rejecting with the unadjusted p values anyway, this is tolerated in practice.)
To recap: compare unadjusted p-values to $\frac{\alpha}{10}$. This gives the same rejection decisions as comparing adjusted p-values to $\alpha$.
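The equivalence of the two decision rules is easy to verify in R with `p.adjust()` (here with four illustrative p-values rather than the ten comparisons above; the values are made up):

```r
# Bonferroni multiplies each p-value by the number of comparisons
# (capped at 1), so "probabilities" above 1 are never actually reported.
alpha <- 0.05
p <- c(0.004, 0.012, 0.30, 0.85)  # illustrative unadjusted p-values
p_adj <- p.adjust(p, method = "bonferroni")
identical(p < alpha / length(p), p_adj < alpha)  # TRUE
```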