Solved – Pairwise comparisons after significant interaction results: parametric or non?

Tags: anova, nonparametric, post-hoc

I've run a 2-way ANOVA on growth rate data (grams/day) with the factors year type (good and poor) and site (A and B). Though the data themselves are non-normal and do not have homogeneous variances, the residuals fall pretty nicely along the Q-Q plot and are not heteroskedastic. The test shows a significant interaction between year type and site. It's been suggested to me that, because of this interaction effect, I now must run a series of pairwise comparisons to look for differences, which I assumed I'd need to do anyway.
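For reference, here is roughly what this analysis looks like outside SigmaPlot, as a minimal sketch in Python/statsmodels (the file and column names are placeholders, not my actual data):

```python
# A minimal sketch of the same analysis in Python/statsmodels; the file and
# column names ('growth', 'year_type', 'site') are hypothetical.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("growth.csv")  # hypothetical data file

# 2-way ANOVA with interaction
model = smf.ols("growth ~ C(year_type) * C(site)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # the C(year_type):C(site) row tests the interaction

# ANOVA assumptions are checked on the residuals, not the raw data
sm.qqplot(model.resid, line="s")
```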

My stats program (SigmaPlot 11, which includes the SigmaStat package) automatically runs post-hoc tests for any significant result (in this case it reports: "All Pairwise Multiple Comparison Procedures (Holm-Sidak method): Overall significance level = 0.05"). I assume that this is the correct way to conduct the pairwise comparisons, rather than running them separately?
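For what it's worth, the same kind of Holm-Sidak adjustment can be reproduced outside SigmaPlot, e.g. with statsmodels (the p-values below are invented, just to show the mechanics):

```python
# Holm-Sidak adjustment of a set of raw pairwise p-values (invented numbers,
# not the question's data), using statsmodels.
from statsmodels.stats.multitest import multipletests

raw_p = [0.003, 0.020, 0.041, 0.300]  # hypothetical unadjusted p-values
reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm-sidak")
print(adj_p)   # adjusted p-values to compare against alpha = 0.05
print(reject)  # which comparisons survive the correction
```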

Here's why I ask: using these built-in post-hoc tests, I get significant results in both good and poor years (p<0.001 in each case). However, running these comparisons separately (site A vs. site B within good years, and within poor years) as Mann-Whitney rank-sum tests (since the raw data are non-normal with heterogeneous variances), I get no significant difference between sites within poor years. Separate t-tests, though, agree with the ANOVA results. I assume this is because the Mann-Whitney test is not comparing means, but which post-hoc method should I be using in this case?!
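To make the contrast concrete, here is a small sketch running both tests on the same pair of samples; simulated skewed data stand in for the real poor-year growth rates:

```python
# The t-test compares means; the Mann-Whitney U test compares ranks, so they
# can legitimately disagree on skewed, unequal-variance data. The samples
# here are simulated, standing in for the real poor-year data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
site_a = rng.lognormal(mean=0.0, sigma=1.0, size=30)  # skewed, high spread
site_b = rng.lognormal(mean=0.3, sigma=0.4, size=30)  # skewed, low spread

print(stats.ttest_ind(site_a, site_b, equal_var=False))             # Welch t-test on means
print(stats.mannwhitneyu(site_a, site_b, alternative="two-sided"))  # rank-based test
```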

EDIT: following @whuber's informative comment, I've taken a look at my other 2-way ANOVAs run in a similar manner. I ran another similar test today comparing adult weight. The t-test from the multiple pairwise comparisons after the 2-way ANOVA shows no difference (t=1.547, p=0.122), but a t-test run outside of the ANOVA shows a highly significant difference (t=-4.739, p<0.001). As @chl pointed out, I would expect these two t-tests to have the same t-values; note that they use exactly the same original data. Any idea why this might be, or how I can interpret it?
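One mechanical reason the two t statistics can differ, even before any multiplicity correction, is the denominator: the ANOVA post-hoc test divides the mean difference by the pooled residual MSE from the full 2-way model, while a stand-alone t-test pools variance over the two groups only. A self-contained sketch with made-up data (all names are hypothetical, not my dataset):

```python
# Same mean difference, two different denominators: two-group pooled variance
# vs. residual MSE from the full two-way model. Data and names are made up.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "site": np.repeat(["A", "B"], 40),
    "year_type": np.tile(np.repeat(["good", "poor"], 20), 2),
})
df["weight"] = rng.normal(10, 2, 80) + np.where(df.year_type == "good", 5.0, 0.0)

a = df.loc[df.site == "A", "weight"]
b = df.loc[df.site == "B", "weight"]
diff = a.mean() - b.mean()

# Stand-alone t: variance pooled over the two site groups only, so the year
# effect inflates the denominator
sp2 = ((len(a) - 1) * a.var() + (len(b) - 1) * b.var()) / (len(a) + len(b) - 2)
t_separate = diff / np.sqrt(sp2 * (1 / len(a) + 1 / len(b)))

# ANOVA-style t: residual MSE from the full model, with the year effect removed
mse = smf.ols("weight ~ C(year_type) * C(site)", data=df).fit().mse_resid
t_anova = diff / np.sqrt(mse * (1 / len(a) + 1 / len(b)))

print(t_separate, t_anova)  # different t-values from the same mean difference
```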

Thanks for any suggestions you can provide!!

EDIT #2: Just to update anyone who's interested: I've looked more closely at the numbers behind the test, and it appears the software is not doing what I'm asking. It produces a table called Least Square Means, listing each site, its mean, and the standard error of that mean. The overall site means listed in this table are not correct (they do not match the raw site means). For each site within year type, however, the table does show the correct means. I'm giving up on the main-factor site comparison within this test and sticking with the t-value from the stand-alone t-test for that comparison. I'm still not sure exactly what's going on, but from the comments and answers provided (thanks again!), I feel this is a safe move, and I must move on to other things.
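For later readers: the pattern described in this edit is consistent with least-square means in an unbalanced design, where the LS mean for a site (the average of its cell means) need not equal the raw mean of all that site's observations. A toy demonstration with made-up data and column names:

```python
# In an unbalanced design the least-square mean (average of cell means)
# differs from the raw marginal mean. Data and names are invented.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "site": ["A"] * 40 + ["B"] * 40,
    "year_type": ["good"] * 35 + ["poor"] * 5 + ["good"] * 5 + ["poor"] * 35,
})
df["weight"] = rng.normal(10, 2, 80) + np.where(df.year_type == "good", 5.0, 0.0)

raw_means = df.groupby("site")["weight"].mean()           # what a t-test compares
cell_means = df.groupby(["site", "year_type"])["weight"].mean()
ls_means = cell_means.groupby("site").mean()              # what an LS means table reports

print(raw_means)  # pulled apart by the unequal mix of good/poor years
print(ls_means)   # nearly equal: each site's year types are weighted 50/50
```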
Thanks for all of your help!

Best Answer

If I understand your question correctly, you are wondering why you got different p-values from your t-tests when they are carried out as post-hoc tests versus as separate tests. But did you control the FWER in the second case (because this is what is done with the step-down Holm-Sidak method)? For simple t-tests the t-values won't change, unless you use a different pooling method for computing the variance in the denominator, but the p-values of the unprotected tests will be lower than the corrected ones.

This is easily seen with the Bonferroni adjustment, since we multiply each observed p-value by the number of tests. With step-down methods like Holm-Sidak, the idea is instead to sort the null hypotheses by increasing p-value and correct the alpha level with the Sidak correction factor in a stepwise manner ($\alpha' = 1 - (1 - \alpha)^{1/k}$, with $k$ the number of hypotheses still under consideration, updated after each step). Note that, in contrast to the Bonferroni-Holm method, control of the FWER is only guaranteed when the comparisons are independent. A more detailed description of the different kinds of corrections for multiple comparisons is available here: Pairwise Comparisons in SAS and SPSS.
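A minimal hand-rolled version of that step-down procedure, just to make the mechanics concrete (the p-values are invented):

```python
# Step-down Holm-Sidak: test sorted p-values against a Sidak-corrected alpha
# that is recomputed as hypotheses are rejected. Invented p-values.
alpha = 0.05
p_sorted = sorted([0.003, 0.020, 0.041, 0.300])
m = len(p_sorted)

for i, p in enumerate(p_sorted):
    # Sidak-corrected threshold for this step: m - i hypotheses remain
    alpha_step = 1 - (1 - alpha) ** (1 / (m - i))
    if p <= alpha_step:
        print(f"p = {p:.3f} <= {alpha_step:.4f}: reject, continue")
    else:
        print(f"p = {p:.3f} >  {alpha_step:.4f}: stop, retain the rest")
        break
```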