Solved – A mixed ANOVA showed no significant interaction, but a dep. t-test showed a significant effect for only one of the groups

anovainteraction

I have come across something that has got me kind of confused, I hope someone can help me out.

I did a mixed ANOVA with one within subjects factor with two levels 'time' and one between subjects factor 'group'. It was a comparison between a positive group and a negative group on scores pretraining and posttraining.

The results showed a significant main effect but no significant interaction effect. I should have stopped there maybe, but I did separate dependent t-tests for each group. These indicated that the scores of one group had changed significantly but the scores of the other group had not.

So now I don't understand anymore. Wouldn't a significant interaction effect indicate that there was a change in scores and that this change was affected by group (that the decrease in scores was significantly different between the groups)? Since there was no significant interaction shouldn't I find similar changes in both groups (either both significant or both not)?

Well, I hope this turns out to be easier than it seems to me right now, merci d'avance.

Best Answer

It is easier than it seems. Significance is not a cutoff - it's not that things are significant, so there's a difference, or there isn't. Significantly different does not mean "different" and "not significantly different" does not mean "not different".

Imagine a succession of values: B is higher, but not significantly higher than A, C is higher, but not significantly higher than B, etc up to K is higher, but not significantly higher than J. But K is significantly higher than A. That makes sense, as each difference is small - you have the same issue.

This is (part of) the message of the paper by Gelman and Stern, called The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant, which you can find here: http://www.stat.columbia.edu/~gelman/research/published/signif4.pdf .

Related Solutions

Solved – Anova repeated measures is significant, but all the multiple comparisons with Bonferroni correction are not

Even without Bonferroni corrections ANOVA's do not guarantee any two means are different. For example, in a statistically decisive ANOVA result could come from two pairs of means are different from each other while no individual mean comparison is significant.

Consider why you run an ANOVA. You do it because if you did all of the comparisons with a categorical predictor value then you'd run into a multiple comparisons problem. But then you go and do many of the comparisons... why? The ANOVA means that the pattern of data you see is meaningful. Describe the pattern of data, both in a figure and text, and convey what your data mean. If you really wanted to run all of the multiple comparisons then running the ANOVA was pointless. Also, keep in mind that "all of the comparisons" does not mean just those comparisons between individual means but all of the patterns patterns and combinations you could test, the ANOVA is sensitive to them too.

In your particular case, what you would do is write something like the following. There was a main effect of group, with higher scores in the experimental group and a main effect of time with the first time the lowest score, followed by the last time and finally the highest score was at the intermediate time. However, each of these main effects was qualified by an interaction. The effect of time depends on which group you are in, being greater in the experimental than the control group.

That's what your ANOVA and summary statistics say. Unless there's something more than that you want to say there's no point in running comparisons.

ASIDE: While the following is important, I consider it an aside because the primary question here is interpreting your ANOVA. Your experimental group time 2 variance is so much higher than the others that you're violating assumptions of the ANOVA. You could run simulations to see how much that affects alpha or power in your case. I did a quick one and it shows alpha is generally about 0.06 (if you select 0.05) for each test, sample code below:

nsamp <- 2000
n <- 10
sds <- rep(c(1.36, 1.57, 1.48, 1.14, 3.52, 1.78), n)
x1 <- factor(rep(1:2, times = n, each = 3))
x2 <- factor(rep(1:3, 2*n))

Y <- replicate(nsamp, {
    y <- rnorm(6 * n, 0, sds)
    #y <- rnorm(6*n) # comment out the line above and comment in this one to see what would happen if variances were equal
    m <- aov(y ~ x1 * x2)
    sm <- summary(m)
    ps <- sm[[1]]$'Pr(>F)'
    ps
    #min(ps, na.rm = TRUE)
})

sum(Y[1,] < 0.05)/nsamp
sum(Y[2,] < 0.05)/nsamp
sum(Y[3,] < 0.05)/nsamp

Solved – Reporting interaction effect for a 2×3 ANOVA

I like your tutor, but many people will not. The question of effect size (which shows up on a graph) vs. statistical significance (which you get as part of the output from a test) has been discussed a lot; I am strongly on the effect size side, but others differ (and not completely unreasonably, either).

However, it is often nice to be able to put some precise numbers in the text and it is hard to get those from a graph alone.

Best Answer

Related Solutions

Solved – Anova repeated measures is significant, but all the multiple comparisons with Bonferroni correction are not

Solved – Reporting interaction effect for a 2×3 ANOVA

Related Question