Solved – Problem understanding the interaction term of mixed effects model

Tags: interaction, lme4-nlme, mixed model, r

I have a question regarding the meaning of an interaction term in a linear mixed effects model. Consider the following experiment (similar to a real one I have): there are three sites, each with two treatments. Some variable is measured over four seasons for each site-treatment combination, and let's say there are 6 replicates for each combination (i.e. n = 144 = 3 sites x 4 seasons x 2 treatments x 6 reps).

So I made a fake dataset in which seasonal variation deliberately has no effect (the data are all normal, with SD=0.5 in the code below). However, the values are different for each site, BUT treatment 2 is consistently larger than treatment 1 at each site (excuse the strange way of making the data, but hey, it illustrates my problem).

So: the main effect for treatment is significant (no problem there), and the main effect for season is non-significant (deliberate). But now: why is the site x treatment effect significant? My understanding was that having site as a random effect controls for added variation between sites, such that if a treatment effect is consistent across sites, even though the absolute values are different, then there should not be a site x treatment interaction. In my mind, a significant interaction would mean that although treatment has an effect, it is not consistent and is site-specific, for example if the blue and red dots swapped around at some sites.

Could someone please clarify where my misunderstanding is of this?

Many thanks in advance!

EDIT: Sorry, what I also wanted to add is that in my real experiment, we know that there are inherent site differences, which is exactly why we want to control for site effects…

library(ggplot2)
library(nlme)
library(emmeans)

sites <- c("Site1","Site2","Site3")
seasons <- c("Aut","Wint","Spr", "Sum")
treatment <- c("Treatment1","Treatment2")
rep <- as.factor(1:6) # let's say there are six replicates for each combination

means <- c(15,30,105,125,50,70)

dat <- expand.grid(Rep=rep, Treatment=treatment, Site=sites, Season=seasons) # consistent differences

L = list()
n = 0

set.seed(123)
for(b in 1:4){
    for(i in 1:6){
        # replicates creation
        L[[1 + n]] <- rnorm(6, means[i], 0.5)
        n <- length(L)} 
}

X <- unlist(L)

dat <- cbind(dat, X)
head(dat)

ggplot(data = dat, aes(x=Season, y=X, fill=Treatment, group=Treatment)) +
    geom_point(size=2, shape=21, colour="black") +
    facet_grid(.~ Site ,  scales = "free")

anova(mod <- lme(X ~ Treatment + Season + Site:Treatment, random=~1|Site, data=dat))

OUTPUT:

               numDF denDF  F-value p-value
(Intercept)        1   133 346098.2  <.0001
Treatment          1   133  54308.8  <.0001
Season             3   133      0.5  0.7008
Treatment:Site     4   133  29089.5  <.0001

Best Answer

I think the way you formulated the model is a bit misleading: you have Site as the grouping factor (i.e., there are random intercepts that depend on which site you are measuring), but then you also add it as a fixed-effect predictor, in interaction with Treatment. I don't think that interaction term makes much sense: if there is site-specific variation in the effect of treatment, and you want to control for those random variations (site effects), then I think the model should have the following form:

> anova(mod <- lme(X ~ Treatment + Season, random=~Treatment|Site, data=dat))
            numDF denDF   F-value p-value
(Intercept)     1   137 110.94052  <.0001
Treatment       1   137 118.08932  <.0001
Season          3   137   1.89962  0.1326

This model has random intercepts (site-specific variation in the intercept) and random slopes for Treatment (site-specific variation in the treatment effect). Crucially, these random effects are assumed to be Gaussian with mean zero. So the fixed-effect estimate of Treatment represents the estimated mean effect for your "population" of sites. The model also estimates the variance of that effect across sites:

> VarCorr(mod)
Site = pdLogChol(Treatment) 
                    Variance     StdDev      Corr  
(Intercept)         2.056268e+03 45.34608932 (Intr)
TreatmentTreatment2 8.538427e+00  2.92205871 0.792 
Residual            7.915393e-03  0.08896849
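
As a side check that is not part of the original answer, it may help to look at the treatment gap that was built into each site. The sketch below assumes the `means` vector from the question is ordered Site1/Treatment1, Site1/Treatment2, Site2/Treatment1, and so on, which is how the question's loop fills the `expand.grid()` rows. The names `m` and `effect` are introduced here for illustration only.

```r
# Generating means copied from the question
means <- c(15, 30, 105, 125, 50, 70)

# Arrange as treatment (rows) x site (columns); matrix() fills column by column,
# so each column holds the two treatment means of one site
m <- matrix(means, nrow = 2,
            dimnames = list(c("Treatment1", "Treatment2"),
                            c("Site1", "Site2", "Site3")))

# Built-in treatment effect at each site
effect <- m["Treatment2", ] - m["Treatment1", ]
print(effect)      # Site1 = 15, Site2 = 20, Site3 = 20

# Spread of the treatment effect across the three sites
print(sd(effect))  # sqrt(25/3), roughly 2.89
```

So the simulated treatment effect is not identical across sites: it is 15 at Site1 and 20 at the other two. With replicate noise as small as in the simulation, a 5-unit discrepancy is large, which is consistent with the significant Treatment:Site term in the question. The standard deviation of the three gaps (about 2.89) is also close to the StdDev of 2.92 that VarCorr() reports for TreatmentTreatment2.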

Related Solutions

Solved – How to perform post-hoc comparison on interaction term with mixed-effects model

I found the package "lsmeans" quite useful, especially when there is an x*z*v interaction. However, the package is available only for newer versions of R. (lsmeans has since been superseded by the emmeans package.)

http://cran.r-project.org/web/packages/lsmeans/vignettes/using-lsmeans.pdf

Solved – Interaction term in a linear mixed effect model in R

Here's what I would do:

First, I would have a look here on how to specify the random term in your model1. I am not quite sure what you are trying to fit. There is also a lot of info on linear mixed effects models here on CV. Click on the lme4-nlme tag, which you also provided. It would also help if you could provide an example dataset, or at least the structure of your data.

Then, you most likely only need one model, which is presumably in the form of:

my_model <- lmer(carbon ~ species + landuse + species : landuse + (1|site), data = mydata)

I specified the random effect to be + (1|site), because you said:

Study sites are included as the random effect in the model.

To get the ANOVA table you can either do:

library(car)
Anova(my_model)

or:

library(afex)
mixed(carbon ~ species + landuse + species : landuse + (1|site), data = mydata)

or instead of running lmer() through the lme4 package, load the lmerTest package and run:

my_model <- lmer(carbon ~ species + landuse + species : landuse + (1|site), data = mydata)
anova(my_model)

This will give you the ANOVA table you probably need eventually. Make sure to have a look at those functions and their arguments (?Anova, ?mixed, ?lmerTest::anova).

I don't quite understand why you would want to exclude species if the interaction is significant and run separate models for all species.

If your main effects are not significant, you could consider dropping them and re-running the model with the interaction only. If one or both main effects are significant, however, I would keep them both in the model and report this together with any significant interaction.

In any case, if you have a significant interaction you should focus on interpreting the interaction and not the main effects since their interpretation could now be misleading. The interpretation of the interaction should start by visualizing it. You could do this for example using the emmip() function in the emmeans package:

library(emmeans)
emmip(my_model, landuse ~ species)

Regarding the adjustment of p-values, you only need to do that if you are following up with post-hoc tests.

This could be done with the emmeans() function (also from the emmeans package):

emmeans(my_model, pairwise ~ species : landuse)
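
For readers without the original data, the workflow above can be tried end to end on simulated data. This is only a sketch under stated assumptions: the data set, its effect sizes, and the helper names `site_shift`, `species_shift` and `landuse_shift` are made up for illustration, and `nlme::lme()` is swapped in for `lmer()` because nlme ships with R. The emmeans step runs only if that package is installed.

```r
library(nlme)  # stands in for lme4/lmerTest in this sketch

set.seed(42)

# Hypothetical balanced design: 4 sites x 3 species x 2 land uses x 5 replicates
mydata <- expand.grid(rep = 1:5,
                      landuse = c("forest", "pasture"),
                      species = c("A", "B", "C"),
                      site = paste0("site", 1:4))

site_shift <- rnorm(4, mean = 0, sd = 2)   # one random intercept shift per site
species_shift <- c(A = 0, B = 1, C = 2)    # made-up species main effects
# Land use only matters for species C, i.e. a species x landuse interaction
landuse_shift <- ifelse(mydata$species == "C" & mydata$landuse == "pasture", 3, 0)

mydata$carbon <- 10 +
  site_shift[as.integer(mydata$site)] +
  unname(species_shift[as.character(mydata$species)]) +
  landuse_shift +
  rnorm(nrow(mydata), sd = 1)

# Same fixed-effects structure as above, with a random intercept per site
my_model <- lme(carbon ~ species * landuse, random = ~ 1 | site, data = mydata)
print(anova(my_model))

if (requireNamespace("emmeans", quietly = TRUE)) {
  # Compare the two land uses separately within each species
  print(emmeans::emmeans(my_model, pairwise ~ landuse | species))
}
```

Conditioning on species with `| species` keeps the follow-up comparisons tied to the interaction, rather than comparing every species-landuse cell against every other.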
