Solved – Why do planned comparisons and post-hoc tests differ?

anova · hypothesis-testing · multiple-comparisons · post-hoc · tukey-hsd-test

I conducted an independent-measures ANOVA with a three-level factor: I manipulated the sound of my stimulus to be 1: congruent (KON), 2: incongruent (INK), or 3: no sound at all (control condition). One of my planned comparisons, incongruent (INK) vs. congruent (KON), is in essence a post-hoc test, isn't it? In both cases I am comparing the same two levels of my manipulation. Yet the results I get differ.

PLANNED COMPARISONS

                                         Df  Sum Sq   Mean Sq  F value  Pr(>F) 
Manipulation                              2   12.34   6.170    2.473    0.0902 .
  Manipulation: control vs. Experimental  1    9.21   9.206    3.690    0.0580 .
  Manipulation: INK vs. KON               1    3.13   3.133    1.256    0.2655 
 Residuals                                88  219.54   2.495

POST-HOC TESTS

Tukey multiple comparisons of means
95% family-wise confidence level

$Manipulation
          diff       lwr        upr     p adj
KON-INK  0.4506173 -0.508031 1.40926557 0.5040074
NS-INK  -0.4960317 -1.444847 0.45278399 0.4293551
NS-KON  -0.9466490 -1.962296 0.06899754 0.0730128

Pairwise comparisons using t tests with pooled SD

    INK   KON  
KON 0.797 -    
NS  0.648 0.087

P value adjustment method: bonferroni 

So what really confuses me is that the difference between KON and INK has a p-value of 0.2655 in the planned comparison, 0.5040074 in the Tukey HSD, and 0.797 in the pairwise comparison with Bonferroni correction.

Where are these differences coming from? Or where is the flaw in my logic?

Best Answer

They aren't really the same. A planned comparison is something you commit to before seeing your data, and you will run it no matter what the results look like. A post-hoc comparison is more opportunistic: you look at it because, when you examined the data, that particular comparison looked interesting. The point is that something will always look [most] interesting, so you need to account for that opportunism. How much the two approaches differ for the same contrast depends on a few issues, notably how many possible contrasts there are.
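The size of that accounting is visible directly in your own numbers. As a minimal sketch (plain Python, no libraries assumed): a Bonferroni adjustment simply multiplies the raw p-value by the number of comparisons in the family, and three pairwise comparisons times your planned-contrast p of 0.2655 reproduces the 0.797 in your pairwise table almost exactly.

```python
def bonferroni(p_raw, m):
    """Bonferroni-adjusted p-value for one of m comparisons (capped at 1)."""
    return min(1.0, m * p_raw)

# The planned contrast INK vs. KON gave p = 0.2655; the pairwise table
# tested 3 comparisons, and 3 * 0.2655 = 0.7965 ~ the reported 0.797.
p_adj = bonferroni(0.2655, 3)
print(p_adj)
```

So the raw p behind the Bonferroni column is essentially the same number your planned contrast produced; only the multiplicity penalty differs.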

A Tukey test gets classified as 'post-hoc' whether or not it was the original intention, because it looks at all possible pairwise contrasts. One way to think about this is that people could otherwise use 'I'll compare everything' as a 'get out of jail free' card: you simply claim you want to test everything under the sun and then call it all a priori. But by virtue of comparing everything, it is equivalent to having seen your data first. The test accounts for that, and the result is equivalent to a post-hoc result.
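To see where the Tukey number sits between the other two, you can put the same contrast on the studentized-range scale. This is a hedged sketch assuming SciPy (`scipy.stats.studentized_range`, available since SciPy 1.7): a 1-df F is a squared t statistic, and a pairwise mean difference corresponds to q = t·√2. Because your groups are unbalanced, the Tukey procedure here is really the Tukey-Kramer approximation, so this won't reproduce the reported 0.504 exactly; it does show the ordering raw p < Tukey p < Bonferroni p.

```python
import math
from scipy import stats

# From the planned-comparison table: F = 1.256 on 1 and 88 df for INK vs. KON.
F, df_resid, k = 1.256, 88, 3          # k = number of group means Tukey compares
t = math.sqrt(F)                       # a 1-df F is a squared t statistic

# Unadjusted two-sided p: recovers the planned-contrast p of ~0.2655
p_raw = 2 * stats.t.sf(t, df_resid)

# Tukey views the same difference on the studentized-range scale: q = t * sqrt(2)
p_tukey = stats.studentized_range.sf(t * math.sqrt(2), k, df_resid)

# Bonferroni over the 3 pairwise comparisons
p_bonf = min(1.0, 3 * p_raw)

print(p_raw, p_tukey, p_bonf)          # raw < Tukey < Bonferroni
```

Tukey's correction is milder than Bonferroni for all-pairwise families because the studentized-range distribution exploits the correlation among the pairwise differences, which is why its adjusted p lands between the other two.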

Your contrasts are clearly a priori, and they appear to be orthogonal. I think it is appropriate for you to go with the top set (the planned comparisons).
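The orthogonality claim is easy to check. As a sketch with NumPy, writing the two planned comparisons as weight vectors over the levels (KON, INK, NS) — my reading of "control vs. Experimental" and "INK vs. KON" from the question — two contrasts are orthogonal (under equal group sizes; only approximately so when groups are unbalanced) when the dot product of their weight vectors is zero:

```python
import numpy as np

# Hypothetical weight vectors over the three levels (KON, INK, NS):
#   c1: the two experimental conditions vs. the control
#   c2: INK vs. KON
c1 = np.array([1,  1, -2])
c2 = np.array([1, -1,  0])

# Zero dot product => orthogonal contrasts (with equal n per group)
print(int(c1 @ c2))   # 0 -> orthogonal
```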