In a linear mixed model, can I include continuous covariates and a random effect if they are measured at the same level?

lme4-nlme, mixed model, multilevel-analysis, random-effects-model, repeated measures

I collected growth data on trees at 15 treated and 15 untreated sites. We have 10 trees per site and each tree has 7 years of data (2100 rows). The treatment was applied after year 5, so we decided to analyse our data as we would a BACI design (BA = before vs. after; CI = control vs. impact). In a dummy variable, we coded the years before the treatment as B and the years after as A. We did the same for the impacted (I) and control (C) sites. In a BACI model, we just need to know if the interaction term BA:CI is significant, but I would like to add other growth predictors along with the interaction term. My basic model looks like this:

mod1<-lmer(growth~BA*CI+(1|site/treeID)+(1|year))

I have two growth predictors that I would like to add to this model, the 2020 diameter value and the percentage of canopy openness. These two variables have one unique value per tree, not one per year per tree. I would like my model to look like this:

mod2<-lmer(growth~BA*CI+diameter+canop_open+(1|site/treeID)+(1|year))

I asked colleagues and some did not see a problem, but some told me that I should not put a random effect for my trees if I want to look at the effect of covariates measured at the tree level. I should either drop the covariates and focus on the BA*CI term with my random effects or keep the covariates but drop the random effect on my trees. Because of the repeated measures structure, I think I have to keep my random effect related to my trees so I would be forced to remove my covariates?

My question is: Can I keep my second model or should I drop the predictors?

Best Answer

"we just need to know if the interaction term BA:CI is significant"

Please try not to be too concerned with statistical significance. Suppose your p-value were 0.04999 and your significance level 0.05. This would be a "significant" result. But what if the p-value were 0.05001, perhaps because one observation was missing? (p-values are a function of the sample size when the null hypothesis is false.) Obviously it would be absurd to make different decisions based on which side of the 0.05 threshold such a result falls.

"some told me that I should not put a random effect for my trees if I want to look at the effect of covariates measured at the tree level."

That does not make sense to me at all. Did they give any justification for such an assertion? There is nothing stopping you from fitting fixed effects that vary at the same level as one of the random effects. The majority of mixed effects/multilevel models have such covariates, and I can't think of anything in the underlying theory that would suggest you can't. Covariates can vary within, or between, the levels of a grouping variable.
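As a rough check of this point, here is a minimal sketch that simulates data with the same layout as the question (30 sites, 10 trees per site, 7 years) and fits the second model. All effect sizes, variances, and the simulated covariate distributions are assumptions made up for illustration; only the variable names and the model formula come from the question.

```r
library(lme4)

set.seed(1)
# Hypothetical layout matching the question: 30 sites x 10 trees x 7 years
dat <- expand.grid(year = 1:7, tree = 1:10, site = 1:30)
dat$treeID <- interaction(dat$site, dat$tree, drop = TRUE)  # unique label per tree
dat$CI <- factor(ifelse(dat$site <= 15, "C", "I"))
dat$BA <- factor(ifelse(dat$year <= 5, "B", "A"), levels = c("B", "A"))

# Tree-level covariates: one value per tree, repeated across that tree's years
n_tree <- nlevels(dat$treeID)
tree_idx <- as.integer(dat$treeID)
diam  <- rnorm(n_tree, mean = 20, sd = 4)     # assumed distribution
canop <- runif(n_tree, min = 10, max = 60)    # assumed distribution
dat$diameter   <- diam[tree_idx]
dat$canop_open <- canop[tree_idx]

# Simulated random intercepts for site, tree, and year (assumed variances)
site_eff <- rnorm(30, sd = 0.5)
tree_eff <- rnorm(n_tree, sd = 0.5)
year_eff <- rnorm(7, sd = 0.3)

dat$growth <- 2 + 0.05 * dat$diameter + 0.02 * dat$canop_open +
  0.8 * (dat$BA == "A" & dat$CI == "I") +
  site_eff[dat$site] + tree_eff[tree_idx] + year_eff[dat$year] +
  rnorm(nrow(dat), sd = 0.5)

# Tree-level covariates and a tree-level random intercept in the same model
mod2 <- lmer(growth ~ BA * CI + diameter + canop_open +
               (1 | site/treeID) + (1 | year), data = dat)
fixef(mod2)    # fixed-effect estimates, including diameter and canop_open
VarCorr(mod2)  # variance components for tree, site, and year
```

The tree random intercept absorbs between-tree variation that the covariates do not explain, so both can be estimated together. The same holds for CI, which varies only between sites while site also has a random intercept.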

"My question is: Can I keep my second model or should I drop the predictors?"

The only reason I can see to drop these covariates, provided that the model converges properly, is if either one of them is caused by, or is a proxy for something caused by, either BA or CI. In that case they would be mediators and should not be included if you want estimates of the total causal effect of BA, CI and their interaction.
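To illustrate the mediator concern, here is a minimal sketch in base R with a deliberately simplified setup: a single binary treatment and a covariate that lies on the causal path from treatment to growth. The variable names and all coefficients are hypothetical and chosen only to make the arithmetic easy to follow.

```r
set.seed(42)
n <- 5000
treat <- rbinom(n, 1, 0.5)                    # hypothetical treatment indicator
# Assumption for illustration: treatment raises canopy openness by 10 units
canop <- 30 + 10 * treat + rnorm(n, sd = 5)
# Growth depends on treatment directly (0.5) and on canopy openness (0.1 per unit)
growth <- 1 + 0.5 * treat + 0.1 * canop + rnorm(n)

# By construction: total effect = direct (0.5) + indirect (10 * 0.1 = 1) = 1.5
total  <- coef(lm(growth ~ treat))["treat"]          # mediator left out
direct <- coef(lm(growth ~ treat + canop))["treat"]  # mediator adjusted for
c(total = unname(total), direct = unname(direct))
```

Leaving the mediator out recovers the total effect, while adjusting for it leaves only the direct part. If canopy openness in the real data is not affected by the treatment, this concern does not apply and the covariate can stay in the model.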

Related Solutions

Solved – Interactions between random effects

Have you tried it? That sounds like it should be fine.

set.seed(101)
## generate fully crossed design:
d <- expand.grid(Year=2000:2010,Site=1:30)
## sample 70% of the site/year comb to induce lack of balance
d <- d[sample(1:nrow(d),size=round(0.7*nrow(d))),]
## now get Poisson-distributed number of obs per site/year
library(plyr)
d <- ddply(d,c("Site","Year"),transform,rep=seq(rpois(1,lambda=10)))
library(lme4)
d$ticks <- simulate(~1+(1|Year)+(1|Site)+(1|Year:Site),
                    family=poisson,newdata=d,
                    newparams=list(beta=2, ## mean(log(ticks))=2
                               theta=c(1,1,1)))[[1]]
mm <- glmer(ticks~1+(1|Year)+(1|Site)+(1|Year:Site),
                    family=poisson,data=d)
summary(mm)

We get out approximately what we put in -- equal variances at each level:

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: ticks ~ 1 + (1 | Year) + (1 | Site) + (1 | Year:Site)
##    Data: d
## 
##      AIC      BIC   logLik deviance df.resid 
##  12487.3  12510.2  -6239.7  12479.3     2267 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9944 -0.6842 -0.0726  0.6010  3.8532 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  Year:Site (Intercept) 1.0818   1.0401  
##  Site      (Intercept) 1.0490   1.0242  
##  Year      (Intercept) 0.9787   0.9893  
## Number of obs: 2271, groups:  Year:Site, 231 Site, 30 Year, 11
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   2.1952     0.3593   6.109    1e-09 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

You may want to include an observation-level random effect to allow for overdispersion (see the "grouse ticks" example in http://rpubs.com/bbolker/glmmchapter).
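A minimal sketch of an observation-level random effect follows. It uses a small simulated data set that is separate from the example above; the data frame `d2`, the factor `obs`, and all variances are assumptions for illustration. The idea is to give every row its own factor level and add a random intercept on that factor.

```r
library(lme4)

set.seed(101)
# Hypothetical count data: 30 sites x 8 replicate counts each
d2 <- expand.grid(rep = 1:8, Site = factor(1:30))
d2$obs <- factor(seq_len(nrow(d2)))   # one level per observation

# Extra-Poisson noise is added on the log scale to induce overdispersion
site_eff <- rnorm(30, sd = 0.5)
eta <- 2 + site_eff[as.integer(d2$Site)] + rnorm(nrow(d2), sd = 0.7)
d2$ticks <- rpois(nrow(d2), lambda = exp(eta))

# Plain Poisson GLMM versus the same model with an observation-level term
m_pois <- glmer(ticks ~ 1 + (1 | Site), family = poisson, data = d2)
m_olre <- glmer(ticks ~ 1 + (1 | Site) + (1 | obs), family = poisson, data = d2)

VarCorr(m_olre)        # the obs variance estimates the extra-Poisson variation
anova(m_pois, m_olre)  # compare fits with and without the observation-level term
```

A clearly non-zero variance for `obs` suggests the counts are more variable than a plain Poisson model allows.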

Solved – Mixed effects model with random and nested effects in lmer

Please read the R GLMM FAQ: http://glmm.wikidot.com/faq especially the section on "should I treat factor xxx as fixed or random?" or the email thread: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q2/003710.html

If I understand your data correctly, you have only 2 species, so estimating a random variance for species is problematic at best. You have 6 total accessions (3+3), which is about the minimum for 1 variance component.
