Solved – Post hoc test for factorial ANCOVA in R

ancovaanovapost-hocr

I have done a three way ANCOVA in R. I am testing how temperature, the development stage and the size of a carcass affect the development rate of maggots.

My response variable is Duration (a measurement of hours) and my factors are Size (2 levels = small and large), and Stage (7 levels = eggs, 1st instar, 2nd instar, 3rd instar, postfeed, pupa and total) and Temperature (4 levels = 15, 20, 25, 30). Size and Stage are fixed factors. However, Temperature is a continuous factor (I am analysing it as a covariate).
My function is:

Duration1 <-(aov(Duration~Stage*Temperature*Size, ThreeWayDuration))
summary(Duration1)

This gave me:

                        Df  Sum Sq Mean Sq F value   Pr(>F)    
Stage                    6 7206782 1201130 149.059  < 2e-16 ***
Temperature              1 1924926 1924926 238.881  < 2e-16 ***
Size                     1   78491   78491   9.741  0.00205 ** 
Stage:Temperature        6 2500293  416716  51.714  < 2e-16 ***
Stage:Size               6  120539   20090   2.493  0.02367 *  
Temperature:Size         1  140090  140090  17.385 4.43e-05 ***
Stage:Temperature:Size   6  184679   30780   3.820  0.00122 ** 
Residuals              214 1724431    8058                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I want to do post hoc tests. I know I do not need a post hoc test for Size because it only has two levels.
I have used the TukeyHSD () function for Stage:

TukeyHSD(Duration1, "Stage")

This gave the diff, lwr, upr, and p adj values for each interaction. However there is a warning message:

Warning messages:       
1: In replications(paste("~", xx), data = mf) : 
  non-factors ignored: Temperature        
2: In replications(paste("~", xx), data = mf) :
  non-factors ignored: Stage, Temperature    
3: In replications(paste("~", xx), data = mf) :
  non-factors ignored: Temperature, Size    
4: In replications(paste("~", xx), data = mf) :
  non-factors ignored: Stage, Temperature, Size        

What does this mean? With these warnings has it given me the correct values?

I also understand that the Tukey test only works for categorical factors.
What post hoc test can I use to examine Temperature in R, as this is a continuous factor?
Many thanks

Best Answer

First of all, you have lots of interactions in play here, so it is very likely highly misleading to just do simple marginal comparisons. That said, the summary results are actually a sequential anova table, with terms entered in the order shown in the rows of the table. You might want to try

library("car")
Anova(Duration1)   # note the capital "A" in "Anova"

Each test in this table is conditional on all other terms in the model that do not contain the one in question.

Second, I am still concerned that you get the model right before proceeding to post-hoc stuff. Many people try to find the shortest route to a P value and -- in the process -- base their inferences on a model that doesn't fit the data. If that's the case here, then that's another way to be doing meaningless things. Did you do any residual plots? Did you explore whether the linear trend in Temperature is reasonable?

More specifics on these matters... It is sometimes the case that a response transformation will improve the residual distribution and also remove or reduce interactions. Your response is measured on a ratio scale, and it may be that a log or square root or even a reciprocal makes for a better-fitting model. Changing the model so that Temperature is replaced by poly(Temperature, 2) or poly(temperature, 3) would make it possible to see if the quadratic or cubic effects are needed.

Also, since Size is a factor, I hope you coded it as a factor in your model even though it has only two levels. That doesn't change the anova at all, but it makes the results easier to interpret. In what follows, I am assuming that Size is of class "factor".

With two factors and a covariate involved, perhaps TukeyHSD can be made to work right, but I will instead shamelessly suggest using my lsmeans package, which is designed for multi-factor situations. LS means are the model's predictions over a regular grid of factor combinations, or marginal averages thereof. See the documentation for details.

First off, get an idea visually of what is going on. This can be done with an interaction-plot-style display:

library("lsmeans")
lsmip(Duration1, Stage ~ Temperature | Size, 
      at = list(Temperature = c(15, 20, 25, 30)) )

You'll see the temperature trend for each stage and size. Since there are interactions, these lines will be of different slopes. If there is, however, some regularity among these slopes, it may be reasonable to ignore some of the interactions even though they are statistically significant. (Also, if larger predictions tend to differ more than smaller predictions, that's a situation that shows some hope of being ameliorated by a response transformation.) If the lines go all over the place, then you can't ignore the interactions. You might also want to look at other plots (e.g., with Temperature ~ Stage | Size) to get different perspectives.

Now, to compare sizes or stages, we should (probably) do that separately for each combination of the other two factors. For example:

Size.lsm = lsmeans(Duration1, "Size", by = c("Temperature", "Stage"),
                    at = list(Temperature = c(15, 20, 25, 30)) )
Size.lsm
contrast(Size.lsm, "pairwise")   # or just  pairs(Size.lsm)

From the above, you'll actually get 28 tables of means, and 28 tables of comparisons. [Do test(pairs(Size.lsm), by = NULL, adjust = "mvt") to get one table with all 28 comparisons and a multi-variate $t$ adjustment for the 28 tests.]

You could do pairwise comparisons of the 4 temperatures in an analogous way. However, since this is a quantitative factor, you could just estimate and compare the slopes of the lines in the plot produced earlier, like this:

Temp.lst = lstrends(Duration1, "Stage", by = "Size", var = "Temperature")
Temp.lst
pairs(Temp.lst)
pairs(Temp.lst, by = "Stage")   # compares the two sizes at each stage

This is long-winded, but I hope it gives you an idea of how you might proceed. Again, get the model right first before proceeding to any of the post-hoc analyses.