Solved – Repeated measures ANOVA is significant, but all of the multiple comparisons with Bonferroni correction are not

anova, bonferroni, post-hoc, repeated measures, statistical significance

I conducted a mixed-model ANOVA with time as the within-subjects variable (3 levels) and group as the between-subjects variable (2 levels).
All effects are significant (time, group, and time×group).
To test whether the differences over the three time points are significant in both groups, I conducted two separate repeated measures ANOVAs, one for each group, with time (3 levels) as the within-subjects variable.

I have a problem with the control group: the ANOVA for this group is just barely significant (p = 0.044), but none of the multiple comparisons with Bonferroni correction are significant… Indeed, in the graph I can see that the differences across time are not very big. However, I don't know how to explain this result…
Can you help me?
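
(For reference, a mixed ANOVA of this kind can be specified in R along the following lines; the data frame dat and its column names are hypothetical, not from the original post.)

# Hypothetical long-format data: one row per subject per time point, with
# columns subject (ID factor), group (between-subjects, 2 levels),
# time (within-subjects, 3 levels), and score (the outcome).
m <- aov(score ~ group * time + Error(subject/time), data = dat)
summary(m)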


Edit:
I'll try to explain myself better so that maybe you can understand my situation. These are the descriptive statistics:

Control group (N = 10)

Measure       Mean (SD)
1st measure   2.40 (1.36)
2nd measure   2.87 (1.57)
3rd measure   2.55 (1.48)


Experimental group (N = 10)

Measure       Mean (SD)
1st measure   3.04 (1.14)
2nd measure   6.71 (3.52)
3rd measure   4.64 (1.76)

As you can see, the control group shows a small change over time, whereas the experimental group shows a big one. I ran the mixed 2×3 ANOVA to see whether the two groups were significantly different and whether there was an interaction. Since everything was significant, I wanted to see whether both groups differ significantly over time or only the experimental group.
Thank you!

Best Answer

Even without Bonferroni corrections, a significant ANOVA does not guarantee that any two means are different. For example, a statistically significant ANOVA result can come from two pairs of means each differing somewhat from each other, while no individual comparison between means is significant on its own.
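
Here is a toy simulation illustrating this (not your data; the means, sample sizes, and seed are arbitrary). It counts how often a one-way ANOVA comes out significant while every Bonferroni-corrected pairwise t-test does not:

set.seed(1)
sim <- replicate(2000, {
    g <- factor(rep(1:3, each = 10))
    y <- rnorm(30, mean = rep(c(0, 0.4, 0.8), each = 10))
    f_p <- summary(aov(y ~ g))[[1]]$'Pr(>F)'[1]
    pw_p <- pairwise.t.test(y, g, p.adjust.method = "bonferroni")$p.value
    c(f_sig = f_p < 0.05, pw_sig = any(pw_p < 0.05, na.rm = TRUE))
})
# Proportion of runs with a significant omnibus F but no significant
# Bonferroni-corrected pairwise comparison:
mean(sim["f_sig", ] & !sim["pw_sig", ])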

Consider why you run an ANOVA in the first place. You do it because, if you ran all of the comparisons among the levels of a categorical predictor, you'd run into a multiple comparisons problem. But then you go and run many of those comparisons anyway… why? The significant ANOVA means that the pattern of data you see is meaningful. Describe that pattern, in both a figure and text, and convey what your data mean. If you really wanted to run all of the multiple comparisons, then running the ANOVA was pointless. Also, keep in mind that "all of the comparisons" does not mean just the comparisons between individual means, but all of the patterns and combinations you could test; the ANOVA is sensitive to those too (see the sketch below).
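
For instance, the omnibus F is also sensitive to a pattern like "time 2 differs from the average of times 1 and 3", which no pairwise comparison tests directly. A hypothetical sketch (dat, time, and score are invented names, not from the original post):

# `dat` is a long-format data frame with a 3-level factor `time` and an
# outcome `score`. Test "time 2 vs the average of times 1 and 3" alongside
# "time 1 vs time 3" as planned contrasts:
contrasts(dat$time) <- cbind(mid_vs_ends   = c(-1, 2, -1),
                             first_vs_last = c(-1, 0, 1))
summary.lm(aov(score ~ time, data = dat))  # t-tests for the contrast coefficients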

In your particular case, you would write something like the following. There was a main effect of group, with higher scores in the experimental group, and a main effect of time, with the lowest scores at the first time point, intermediate scores at the last, and the highest scores at the middle time point. However, each of these main effects was qualified by the interaction: the effect of time depended on group, being greater in the experimental group than in the control group.

That's what your ANOVA and summary statistics say. Unless there's something more than that you want to say, there's no point in running the comparisons.

ASIDE: While the following is important, I consider it an aside because the primary question here is interpreting your ANOVA. Your experimental group's time 2 variance is so much higher than the others that you're violating the homogeneity-of-variance assumption of the ANOVA. You could run simulations to see how much that affects alpha or power in your case. I ran a quick one, and it shows that alpha is generally about 0.06 (when you nominally select 0.05) for each test; sample code below:

set.seed(1)    # for reproducibility
nsamp <- 2000  # number of simulated data sets
n <- 10        # subjects per group

# One SD per group-by-time cell (group 1: times 1-3, then group 2: times 1-3),
# taken from the descriptive statistics above and repeated for each subject.
sds <- rep(c(1.36, 1.57, 1.48, 1.14, 3.52, 1.76), n)
x1 <- factor(rep(1:2, times = n, each = 3))  # group
x2 <- factor(rep(1:3, 2 * n))                # time

# Simulate under the null (all true means equal) with the observed cell SDs
# and collect the p-values of the three F tests.
Y <- replicate(nsamp, {
    y <- rnorm(6 * n, 0, sds)
    #y <- rnorm(6 * n)  # swap this line in to see what happens if variances were equal
    m <- aov(y ~ x1 * x2)
    summary(m)[[1]]$'Pr(>F)'
})

# Empirical Type I error rates at the nominal 0.05 level:
sum(Y[1, ] < 0.05) / nsamp  # group
sum(Y[2, ] < 0.05) / nsamp  # time
sum(Y[3, ] < 0.05) / nsamp  # interaction
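
If you swap in the equal-variance line instead, all three rates should come back close to the nominal 0.05, which ties the inflation to the unequal SDs rather than anything else in the design.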