Solved – Correction for multiple comparisons

anova

I have a 2x3x4 repeated measures ANOVA. I have a significant three-way interaction, and I want to make sure that I am using the correct post hoc comparisons and not violating any key statistical theory.

I have run the statistics in SPSS and have adjusted for multiple comparisons using the Šidák correction, but I want to make sure I understand how many adjustments are being made and whether the p-values are being corrected appropriately.

My post hoc tests have analysed the following:

AxB at each level of C

AxC at each level of B

BxC at each level of A

I am just trying to determine what correction factor is appropriate. For example, BxC at each level of A compares 2 means (the levels of A) 12 times, once per BxC combination (3 x 4 = 12). Here, am I not adjusting because only 2 means are being compared each time, or am I dividing the alpha value by 12 because there are 12 comparisons in total?

Similarly, for AxC at each level of B, I am comparing 3 means (as B has 3 levels, giving 3 pairwise comparisons) at each combination of AxC (2 x 4 = 8 combinations). Here, would I be dividing the alpha value by 3, or by the combinations times the comparisons (hence 24)?

Similarly, for AxB at each level of C, I am comparing 4 means (6 pairwise comparisons) at each combination of AxB (2 x 3 = 6 combinations). Here, would I be dividing the alpha value by 6, or by the combinations times the comparisons (hence 36)?

Alternatively, would I be dividing the alpha value by the sum of all of the above comparisons (12 + 24 + 36 = 72)? A rough sketch of this arithmetic follows below.
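For concreteness, here is a minimal Python sketch of the counting above, using statsmodels (whose `sidak` method is the same one-step Šidák adjustment SPSS applies). The factor sizes follow the 2x3x4 design, but the p-values are made up purely for illustration; under Šidák, the per-comparison alpha for a family of m tests is 1 - (1 - 0.05)^(1/m).

```python
from itertools import combinations

from statsmodels.stats.multitest import multipletests

ALPHA = 0.05

def n_pairs(k):
    """Number of pairwise comparisons among k means: C(k, 2)."""
    return len(list(combinations(range(k), 2)))

# One family per breakdown: pairwise comparisons among the means of one
# factor, repeated at every combination of the other two factors.
families = {
    "A (2 means) at each BxC cell": n_pairs(2) * 3 * 4,  # 1 * 12 = 12
    "B (3 means) at each AxC cell": n_pairs(3) * 2 * 4,  # 3 * 8  = 24
    "C (4 means) at each AxB cell": n_pairs(4) * 2 * 3,  # 6 * 6  = 36
}

for name, m in families.items():
    # Sidak-adjusted per-comparison alpha for a family of m tests.
    alpha_per = 1 - (1 - ALPHA) ** (1 / m)
    print(f"{name}: m = {m}, per-comparison alpha = {alpha_per:.5f}")

# Equivalently, adjust the observed p-values rather than alpha.
# These p-values are hypothetical, just to show the call.
pvals = [0.001, 0.004, 0.020, 0.030, 0.250, 0.600]
reject, p_adj, _, _ = multipletests(pvals, alpha=ALPHA, method="sidak")
print(p_adj.round(4), reject)
```

Whether the right m is 12, 24, 36, or the grand total of 72 depends on which sets of comparisons you treat as a single family, which is a substantive judgement rather than a mechanical one.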

Thanks

Best Answer

I'll be the bunny who points out that not every set of scientific questions requires a 'correction' for multiplicity of tests, and not every approach to statistical inference involves keeping track of Type I errors. (Prepares self for down-votes!)

You wish to be sure that you are "not violating any key statistical theory", but if you make adjustments to p-values (or critical thresholds) then you can be assured that you will be violating the likelihood principle in order to comply with the repeated sampling principle. If you wish to behave as a pure frequentist, then comply with the repeated sampling principle at all costs: adjust away. However, if you wish to deal directly with the evidence in your data, then you cannot be a pure frequentist, because you have to comply with the likelihood principle.

If you are interested in the evidence then it is helpful to know that the evidence itself is unaffected by multiplicity of testing (and by stopping rules). Of course if you wish to make a decision on the basis of the evidence it is perfectly OK to take multiplicity into account.

For your particular application I would imagine that you might be able to answer your substantive question using a hierarchical model instead of a bunch of frequentist error rate-adjusted hypothesis tests. (Most Bayesian methods comply with the likelihood principle.) Here's an example that might help you see the bigger picture: http://www.stat.columbia.edu/~gelman/research/published/multiple2f.pdf
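To make the hierarchical-model suggestion concrete, below is a minimal Python sketch of the partial-pooling idea from the linked paper, using PyMC. Everything in it is an assumption for illustration: the data are simulated, the priors are arbitrary, and it deliberately ignores the within-subject (repeated-measures) structure that a real analysis of this design would need to model.

```python
import numpy as np
import pymc as pm

# Hypothetical long-format data: one row per observation, with integer
# codes for the 2*3*4 = 24 design cells and a response y. These names
# are placeholders, not anything from the original post.
rng = np.random.default_rng(0)
n_cells, n_per_cell = 24, 10
cell = np.repeat(np.arange(n_cells), n_per_cell)
y = rng.normal(loc=rng.normal(0, 1, n_cells)[cell], scale=1.0)

with pm.Model() as model:
    # Partial pooling: all 24 cell means are drawn from a common
    # distribution, so noisy estimates are shrunk toward the grand
    # mean instead of each pairwise test being corrected separately.
    mu = pm.Normal("mu", 0.0, 5.0)        # grand mean
    tau = pm.HalfNormal("tau", 2.0)       # between-cell sd
    cell_mean = pm.Normal("cell_mean", mu, tau, shape=n_cells)
    sigma = pm.HalfNormal("sigma", 2.0)   # within-cell sd
    pm.Normal("y_obs", cell_mean[cell], sigma, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)

# Any contrast of interest is just a derived quantity of the posterior:
post = idata.posterior["cell_mean"]
diff = post.sel(cell_mean_dim_0=0) - post.sel(cell_mean_dim_0=1)
print(float((diff > 0).mean()))  # posterior probability cell 0 > cell 1
```

The point of the shrinkage is that extreme cell means get pulled toward the grand mean by an amount the data determine, which is how the multilevel approach in that paper addresses multiplicity without explicit p-value corrections.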