Solved – Post hoc test other than Tukey for factorial ANOVA data in R

anovapost-hocr

I am looking for help on post-hoc tests of my group data (treatment and stage and interaction) after running a 2 way ANOVA in R. The data shown below is an example only. Actually my data shows significant result after ANOVA test. When I tried post hoc with TukeyHSD it gives pair wise comparisons which is a long list. But I am looking for other test where I can compare treatment mean alone, stage mean alone as well as stage*treatment with letter after mean showing significant or non-significant in group. Could there be other post-hoc test which fulfils my need?

My data looks like:

Treatment   Stage   Chlorophyll
Salt        Green   0.2
Salt        Green   0.3
Salt        Green   0.4
Salt        Pink    0.5
Salt        Pink    0.3
Salt        Pink    0.2
Salt        Red     0.5
Salt        Red     0.6
Salt        Red     0.7
Nitrogen    Green   0.4
Nitrogen    Green   0.6
Nitrogen    Green   0.9
Nitrogen    Pink    0.2
Nitrogen    Pink    0.3
Nitrogen    Pink    0.5
Nitrogen    Red     0.4
Nitrogen    Red     0.2
Nitrogen    Red     0.3
Control     Green   0.5
Control     Green   0.6
Control     Green   0.8
Control     Pink    0.5
Control     Pink    0.4
Control     Pink    0.6
Control     Red     0.2
Control     Red     0.3
Control     Red     0.1

Best Answer

You might want to look at the lsmeans package or the multcomp package. Here is some demonstration R code to show you some of the options you have.

Load both packages. Also, the car library provides some functionality for carrying out analysis of variance that is sometimes helpful.

# Load handy packages.
library(car)
library(lsmeans)
library(multcomp)

Then read the data and check what shape it arrived.

# Read the data.
Example <- read.csv("Example.csv")
# What is the structure of the data.
str(Example)

Looking at the structure of the data, the "Control" value of Treatment is the first level of the factor. That is handy because that is usually the default reference value.

'data.frame':   27 obs. of  3 variables:
 $ Treatment  : Factor w/ 3 levels "Control","Nitrogen",..: 3 3 3 3 3 3 3 3 3 2 ...
 $ Stage      : Factor w/ 3 levels "Green","Pink",..: 1 1 1 2 2 2 3 3 3 1 ...
 $ Chlorophyll: num  0.2 0.3 0.4 0.5 0.3 0.2 0.5 0.6 0.7 0.4 ...

Perform the analysis and a little bit of model-checking.

# Fit the two-way factorial model.
fit <- lm(Chlorophyll ~ Treatment + Stage + Treatment:Stage, Example)
# Look at the model goodness of fit.
plot(fit)
shapiro.test(residuals(fit))

The model fits pretty well. Good job on the fake data set! Now look at the results.

# Perform an analysis of variance.
Anova(fit)

No matter how you look at it, there is a two-way interaction effect in these data.

Anova Table (Type II tests)

Response: Chlorophyll
                 Sum Sq Df F value   Pr(>F)   
Treatment       0.00519  2  0.1273 0.881279   
Stage           0.12741  2  3.1273 0.068283 . 
Treatment:Stage 0.53259  4  6.5364 0.001972 **
Residuals       0.36667 18                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can use the lsmip() function to plot the interaction effects.

# Look at interaction effects via least squares means.
lsmip(fit, Stage ~ Treatment)

If we ignore the interaction with Stage, we can look at all pairwise comparisons of Treatment using the Tukey adjustment for multiple testing. The [[2]] just picks off the second part of the list of results.

# Look at Treatment averaged over levels of Stage.
lsmeans(fit, pairwise ~ Treatment)[[2]]

However, for these data, it would be best to take into account the interaction. We can "slice" the interaction by Stage and compare all levels of Treatment to each other for each Stage.

# Compare sliced least squares means using the Tukey method.
fit.tukey <- lsmeans(fit, pairwise ~ Treatment | Stage)[[2]]
fit.tukey
cld(fit.tukey)

The cld() function provides the usual letter codes. We can see the interaction effect in the differential pattern of significant differences among the groups.

Stage = Green:
 contrast                estimate        SE df t.ratio p.value .group
 Control - Nitrogen -1.188445e-17 0.1165343 18   0.000  1.0000  1    
 Control - Salt      3.333333e-01 0.1165343 18   2.860  0.0268   2   
 Nitrogen - Salt     3.333333e-01 0.1165343 18   2.860  0.0268  12   

Stage = Pink:
 contrast                estimate        SE df t.ratio p.value .group
 Nitrogen - Salt     3.962759e-17 0.1165343 18   0.000  1.0000  1    
 Control - Nitrogen  1.666667e-01 0.1165343 18   1.430  0.3470  1    
 Control - Salt      1.666667e-01 0.1165343 18   1.430  0.3470  1    

Stage = Red:
 contrast                estimate        SE df t.ratio p.value .group
 Control - Salt     -4.000000e-01 0.1165343 18  -3.432  0.0079  1    
 Nitrogen - Salt    -3.000000e-01 0.1165343 18  -2.574  0.0478  12   
 Control - Nitrogen -1.000000e-01 0.1165343 18  -0.858  0.6727   2   

P value adjustment: tukey method for a family of 3 means 
significance level used: alpha = 0.05 

Using Dunnett's test might be better if you don't care about all pairwise comparisons. This code compares against the reference level mentioned above, but you have much more flexibility if needed.

# Compare sliced least squares means via Dunnett's method.
fit.lsm <- lsmeans(fit, "Treatment", by=c("Stage"))
contrast(fit.lsm, "trt.vs.ctrl")

Results are pretty similar to those obtained using the Tukey method in this case.