I have done a three way ANCOVA in R. I am testing how temperature, the development stage and the size of a carcass affect the development rate of maggots.
My response variable is Duration
(a measurement of hours) and my factors are Size
(2 levels = small and large), and Stage
(7 levels = eggs, 1st instar, 2nd instar, 3rd instar, postfeed, pupa and total) and Temperature
(4 levels = 15, 20, 25, 30). Size
and Stage
are fixed factors. However, Temperature
is a continuous factor (I am analysing it as a covariate).
My function is:
Duration1 <-(aov(Duration~Stage*Temperature*Size, ThreeWayDuration))
summary(Duration1)
This gave me:
Df Sum Sq Mean Sq F value Pr(>F)
Stage 6 7206782 1201130 149.059 < 2e-16 ***
Temperature 1 1924926 1924926 238.881 < 2e-16 ***
Size 1 78491 78491 9.741 0.00205 **
Stage:Temperature 6 2500293 416716 51.714 < 2e-16 ***
Stage:Size 6 120539 20090 2.493 0.02367 *
Temperature:Size 1 140090 140090 17.385 4.43e-05 ***
Stage:Temperature:Size 6 184679 30780 3.820 0.00122 **
Residuals 214 1724431 8058
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I want to do post hoc tests. I know I do not need a post hoc test for Size
because it only has two levels.
I have used the TukeyHSD () function for Stage
:
TukeyHSD(Duration1, "Stage")
This gave the diff, lwr, upr, and p adj values for each interaction. However there is a warning message:
Warning messages:
1: In replications(paste("~", xx), data = mf) :
non-factors ignored: Temperature
2: In replications(paste("~", xx), data = mf) :
non-factors ignored: Stage, Temperature
3: In replications(paste("~", xx), data = mf) :
non-factors ignored: Temperature, Size
4: In replications(paste("~", xx), data = mf) :
non-factors ignored: Stage, Temperature, Size
What does this mean? With these warnings has it given me the correct values?
I also understand that the Tukey test only works for categorical factors.
What post hoc test can I use to examine Temperature
in R, as this is a continuous factor?
Many thanks
Best Answer
First of all, you have lots of interactions in play here, so it is very likely highly misleading to just do simple marginal comparisons. That said, the
summary
results are actually a sequential anova table, with terms entered in the order shown in the rows of the table. You might want to tryEach test in this table is conditional on all other terms in the model that do not contain the one in question.
Second, I am still concerned that you get the model right before proceeding to post-hoc stuff. Many people try to find the shortest route to a P value and -- in the process -- base their inferences on a model that doesn't fit the data. If that's the case here, then that's another way to be doing meaningless things. Did you do any residual plots? Did you explore whether the linear trend in
Temperature
is reasonable?More specifics on these matters... It is sometimes the case that a response transformation will improve the residual distribution and also remove or reduce interactions. Your response is measured on a ratio scale, and it may be that a log or square root or even a reciprocal makes for a better-fitting model. Changing the model so that
Temperature
is replaced bypoly(Temperature, 2)
orpoly(temperature, 3)
would make it possible to see if the quadratic or cubic effects are needed.Also, since
Size
is a factor, I hope you coded it as a factor in your model even though it has only two levels. That doesn't change the anova at all, but it makes the results easier to interpret. In what follows, I am assuming thatSize
is of class"factor"
.With two factors and a covariate involved, perhaps
TukeyHSD
can be made to work right, but I will instead shamelessly suggest using my lsmeans package, which is designed for multi-factor situations. LS means are the model's predictions over a regular grid of factor combinations, or marginal averages thereof. See the documentation for details.First off, get an idea visually of what is going on. This can be done with an interaction-plot-style display:
You'll see the temperature trend for each stage and size. Since there are interactions, these lines will be of different slopes. If there is, however, some regularity among these slopes, it may be reasonable to ignore some of the interactions even though they are statistically significant. (Also, if larger predictions tend to differ more than smaller predictions, that's a situation that shows some hope of being ameliorated by a response transformation.) If the lines go all over the place, then you can't ignore the interactions. You might also want to look at other plots (e.g., with
Temperature ~ Stage | Size
) to get different perspectives.Now, to compare sizes or stages, we should (probably) do that separately for each combination of the other two factors. For example:
From the above, you'll actually get 28 tables of means, and 28 tables of comparisons. [Do
test(pairs(Size.lsm), by = NULL, adjust = "mvt")
to get one table with all 28 comparisons and a multi-variate $t$ adjustment for the 28 tests.]You could do pairwise comparisons of the 4 temperatures in an analogous way. However, since this is a quantitative factor, you could just estimate and compare the slopes of the lines in the plot produced earlier, like this:
This is long-winded, but I hope it gives you an idea of how you might proceed. Again, get the model right first before proceeding to any of the post-hoc analyses.