Indeed, an omnibus test is not strictly needed in that particular scenario, and multiple testing procedures like Bonferroni or Bonferroni-Holm are not limited to ANOVA/mean-comparison settings. They are often presented as post-hoc tests in textbooks, or associated with ANOVA in statistical software, but if you look up papers on the topic (e.g. Holm, 1979), you will find that they were originally discussed in a much broader context and you certainly can “skip the ANOVA” if you wish.
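To make this concrete, here is a minimal sketch of Holm's step-down adjustment, which needs nothing but a list of p-values from whatever tests you ran (ANOVA or not). The p-values below are hypothetical, just for illustration:

```python
def holm_adjust(pvals):
    """Holm (1979) step-down adjusted p-values.

    Sort the p-values ascending, multiply the k-th smallest (k = 0, 1, ...)
    by (m - k), cap at 1, then enforce monotonicity so an adjusted p-value
    never falls below the one for a smaller raw p-value.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values from three comparisons
print(holm_adjust([0.01, 0.04, 0.03]))
```

Nothing in the procedure refers to group means or an F-test; it only controls the family-wise error rate across whichever hypotheses you chose to test.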
One reason people still run ANOVAs is that pairwise comparisons with something like a Bonferroni adjustment have lower power (sometimes much lower). Tukey's HSD and the omnibus test can have higher power, and even if the pairwise comparisons do not reveal anything, the ANOVA F-test is already a result in itself. If you work with small, haphazardly defined samples and are just looking for some publishable p-value, as many people are, this makes it attractive even if you always intended to do pairwise comparisons as well.
Also, if you really care about any possible difference (as opposed to specific pairwise comparisons or knowing which means differ), then the ANOVA omnibus test is really the test you want. Similarly, multi-way ANOVA procedures conveniently provide tests of main effects and interactions that can be more directly interesting than a bunch of pairwise comparisons (planned contrasts can address the same kind of questions but are more complicated to set up). In psychology for example, omnibus tests are often thought of as the main results of an experiment, with multiple comparisons only regarded as adjuncts.
Finally, many people are happy with this routine (ANOVA followed by post-hoc tests) and simply don't know that the Bonferroni inequalities are very general results that have nothing to do with ANOVA, that you can also run more focused planned comparisons, or do a whole lot of things besides performing tests. It's certainly not easy to realize this if you are working from some of the most popular “cookbooks” in applied disciplines, and that explains many common practices (even if it does not quite justify them).
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6 (2), 65–70.
No, you cannot conclude that none of the comparisons are significant. If the figure represents your data, then I'm guessing that you're going after “present” being higher than “absent” in the gain condition but not in the loss condition. If that's the case, be very cautious, because that's exactly what your interaction tested. And a non-significant interaction already told you that the difference between significant and not significant was not, itself, significant.
You might even only find gain:present higher than loss:absent, but what would that mean? Are all of these tests even sensible or interpretable?
In short, before venturing forth with more tests, seriously consider what they would mean. I suggest that if you really consider the ANOVA result, you'll find it has already tested any meaningful question you have about the data. If you decide to make the comparisons anyway, then the answers to this question address yours as well.
Best Answer
Let's say you have 3 different methods for improving reading scores in grade 5 children: Method 1, Method 2 and Method 3. You randomly assign 20 children to each method and measure the change in reading score achieved by these children from the start to the end of each method's administration.
If, when you set up the study and prior to collecting any data, you decide that you are only interested in seeing how Method 2 compares with Method 3, you are going to conduct a planned comparison between the mean change in reading score between Method 2 and Method 3 once the data become available. This decision should be driven by subject matter considerations. Because the planned comparison involves only two methods, it is a pairwise comparison between two mean changes in reading score.
What if you wanted to test a slightly different hypothesis when you set up the study and prior to collecting any data? Namely, that the Novel methods (i.e., Methods 2 and 3) made a difference relative to the Standard method? Then you would need to test a planned composite (or complex) hypothesis which will enable you to compare the mean change in reading score for the Standard method versus the average value of the mean changes in reading score for Methods 2 and 3.
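Both planned comparisons above are linear contrasts of the group means: the pairwise comparison of Methods 2 and 3 uses weights (0, 1, -1), and the Standard-versus-Novel composite uses weights (1, -1/2, -1/2). A rough sketch of the standard contrast t-test (pooled within-group variance, equal-variance assumption), with made-up reading-score changes purely for illustration:

```python
import math

def contrast_test(groups, weights):
    """t statistic for a linear contrast of group means, using the
    pooled within-group variance (the MSE from the one-way ANOVA)."""
    means = [sum(g) / len(g) for g in groups]
    ns = [len(g) for g in groups]
    ss = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df = sum(ns) - len(groups)          # residual degrees of freedom
    mse = ss / df
    estimate = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w ** 2 / n for w, n in zip(weights, ns)))
    return estimate, estimate / se, df

# Hypothetical changes in reading score (not real data)
standard = [2, 3, 4, 3]
method2 = [5, 6, 5, 4]
method3 = [6, 7, 5, 6]

# Standard vs the average of the two novel methods
est, t, df = contrast_test([standard, method2, method3], [1, -0.5, -0.5])
```

The returned t statistic is referred to a t distribution with `df` degrees of freedom; with weights `[0, 1, -1]` the same function gives the planned pairwise comparison of Methods 2 and 3.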
If, when you set up the study, you don't single out any specific comparisons you are most interested in making concerning the three methods, then you can perform post-hoc comparisons once you collect the data. These will consist of all pairwise comparisons between the three methods. Each comparison will enable you to compare the mean change in reading score between the two methods it considers.
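Enumerating all pairwise post-hoc comparisons is mechanical. A sketch (pooled-variance t statistics; the data and group names are hypothetical, and in practice you would convert each t to a p-value and apply a multiplicity adjustment such as Holm or Tukey's HSD):

```python
import math
from itertools import combinations

def pairwise_t(groups, names):
    """Pooled-variance t statistics for all pairwise comparisons of group means."""
    means = {n: sum(g) / len(g) for n, g in zip(names, groups)}
    ns = {n: len(g) for n, g in zip(names, groups)}
    ss = sum(sum((x - means[n]) ** 2 for x in g) for n, g in zip(names, groups))
    df = sum(ns.values()) - len(groups)
    mse = ss / df
    results = {}
    for a, b in combinations(names, 2):
        se = math.sqrt(mse * (1 / ns[a] + 1 / ns[b]))
        results[(a, b)] = (means[a] - means[b]) / se
    return results, df

# Hypothetical changes in reading score (not real data)
tstats, df = pairwise_t(
    [[2, 3, 4, 3], [5, 6, 5, 4], [6, 7, 5, 6]],
    ["Method 1", "Method 2", "Method 3"],
)
```

With three groups there are three comparisons; the number grows as k(k-1)/2, which is exactly why the multiplicity adjustment matters.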
Now, assume you want to conduct a slightly more complicated study, where you keep track not only of the change in reading score for each child but also their gender (male or female). You will consider a model which relates the change in reading score to Method, Gender and their interaction, since you suspect that the effect of Method might be different for male students compared to female students.
If you find a significant interaction between Method and Gender in your model after fitting it to your data, that simply means that your data provide evidence that the effect of Method is different across genders. But you then have to investigate what that difference means.
If you consider just the female students, you are going to have to compare the mean changes in scores for these students across the three methods to see if you find any evidence they are not all the same. (This is the same situation as you had in the simpler study when you had to perform post-hoc comparisons.) So one possibility is to use post-hoc comparisons to find differences between any pairs of methods for the female students only.
If you consider just the male students, you are going to have to compare the mean changes in scores for these students across the three methods to see if you find any evidence they are not all the same. Again, one possibility is to use post-hoc comparisons to find differences between any pairs of methods for the male students only.
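The mechanics of such a simple-effects analysis are just "subset, then compare": restrict the data to one gender and form the method means that the within-gender pairwise comparisons will use. A small sketch with hypothetical data (gender/method labels and scores are made up for illustration):

```python
# Hypothetical change in reading score, keyed by (gender, method)
data = {
    ("F", "Method 1"): [2, 3, 4],
    ("F", "Method 2"): [5, 6, 5],
    ("F", "Method 3"): [6, 7, 5],
    ("M", "Method 1"): [3, 4, 3],
    ("M", "Method 2"): [4, 5, 4],
    ("M", "Method 3"): [4, 4, 5],
}

def simple_effect_means(data, gender):
    """Method means within one gender -- the cell means that a
    simple-effects analysis compares pairwise."""
    return {m: sum(v) / len(v) for (g, m), v in data.items() if g == gender}

female_means = simple_effect_means(data, "F")
male_means = simple_effect_means(data, "M")
```

From here, each gender gets its own set of pairwise comparisons (with a multiplicity adjustment across them), which is exactly the post-hoc situation from the simpler study, applied twice.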