Solved – Generalized linear mixed model – glmmADMB – date as random effect

generalized linear model, mixed model, r

I have a couple of related questions about using a generalized linear mixed effects model to analyze data from an agricultural field experiment. I have found several posts that are similar to this question, but nothing that quite gets at my points of confusion.

The experiment involved 4 treatments replicated 4 times each: one treatment per field plot, for a total of 16 field plots.

On multiple dates (12) throughout the season I measured a response (counts) and a few covariates at all 16 plots. Within each plot on each date I would make multiple measurements of my response and covariates. So, measurements within a plot on a given date are non-independent (spatial pseudoreplication), and measurements through time for each plot are non-independent (temporal pseudoreplication). OK, so a linear mixed model seems appropriate.

My response data roughly fit a negative binomial distribution. OK, so a generalized linear mixed model seems appropriate.

I want to account for the variation that occurs within a plot and also the variation that occurs on different dates.

Would it be valid for me to model it this way (treating date as a factor)?

#fake data; each plot is sampled at 4 locations on 12 different dates
response=rnbinom(768,size=0.5,mu=2)
treatment=factor(rep(c("one","two","three","four"),times=192))
covariate=rnbinom(768,size=.01,mu=4)
date=factor(rep(1:12,each=64))
plotID=factor(rep(c("P1","P2","P3","P4","P5","P6","P7","P8","P9","P10","P11",
            "P12","P13","P14","P15","P16"),each=4,times=12))

#a model with random effects on the intercept
library(glmmADMB)
model1=glmmadmb(response~covariate+treatment+(1|date\plotID),family="nbinom2")

My concern is that, while plots are independent and randomly distributed, the dates are not. That is, I expect that on a given date the variance structure within a plot will be similar, but not between plots; however, I expect the variance structure to be similar across plots on a given date. Does this create problems for how I modeled the data above?

Am I even correct in thinking that the model would account for the variance that occurs on each date (1|date) and the variation that occurs within each plot on a given date (1|date:plotID)?

I could model date as a continuous fixed effect, but for reasons I won't go into I'd prefer not to do that. Regardless, I have no interest in the effects of time on my response, other than as a source of variation that I'd like to account for.

I hope my thinking about this whole thing doesn't reveal any TOO glaring gaps in understanding 🙂

Thanks in advance for the help, oh wise internet community!!

Best Answer

A nested random-effects formula of the form (1|date/plotID) (forward slash /, not backslash \) denotes "plot nested within date", i.e. responses vary across dates and across plots within dates. This is exactly equivalent to (and in fact is internally expanded to) (1|date)+(1|date:plotID), i.e. random effects of date and of a date-by-plot interaction. Given your crossed experimental design (each plot is measured on multiple dates, multiple plots are measured on each date, and there is more than one observation per date-plot combination), you can also add a (1|plotID) term to the model, which gives each plot a random effect that persists across dates.
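To make the grouping structure concrete, here is a minimal base-R sketch that rebuilds the grouping factors from the question's fake data and counts the levels each random-effect term would index. The names date_plot, f_nested, f_expanded, f_crossed and dat are made up for illustration, and the glmmadmb() call is left as a comment because it assumes glmmADMB is installed and the variables are collected in a data frame.

```r
# Grouping factors laid out as in the question's fake data
date   <- factor(rep(1:12, each = 64))
plotID <- factor(rep(paste0("P", 1:16), each = 4, times = 12))

# Grouping factor behind (1|date:plotID): one level per date-plot combination
date_plot <- interaction(date, plotID, drop = TRUE)

nlevels(date)            # 12 random intercepts for (1|date)
nlevels(date_plot)       # 192 random intercepts for (1|date:plotID)
nlevels(plotID)          # 16 random intercepts for (1|plotID)
all(table(date_plot) == 4)  # TRUE: 4 observations in every date-plot cell

# The same model written three ways (formulas only; nothing is fitted here)
f_nested   <- response ~ covariate + treatment + (1 | date/plotID)
f_expanded <- response ~ covariate + treatment + (1 | date) + (1 | date:plotID)
f_crossed  <- response ~ covariate + treatment + (1 | date) + (1 | plotID) +
  (1 | date:plotID)

# Hypothetical fit, assuming glmmADMB is installed and `dat` holds the columns:
# library(glmmADMB)
# model2 <- glmmadmb(f_crossed, data = dat, family = "nbinom2")
```

Because every date-plot cell holds more than one observation, the date:plotID variance can in principle be separated from the residual (negative binomial) variation.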

You say

My concern is that, while plots are independent and randomly distributed, the dates are not. That is, I expect that on a given date the variance structure within a plot will be similar, but not between plots; however, I expect the variance structure to be similar across plots on a given date. Does this create problems for how I modeled the data above?

I don't quite understand this, specifically what you mean by "variance structure". Most mixed models will assume in this scenario that within-plot variance is the same across plots, that different plots have the same temporal variance, and that different times have the same among-plot variance ... I don't know what the "variance structure between plots" means ... you can relax these assumptions if you absolutely need to, but (1) it can get tricky to program and (2) it's going to get hard to estimate all those separate variances.
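One way to see what those default assumptions amount to is to simulate from such a model. The sketch below is only an illustration in base R: the standard deviations, the baseline mean of 2 and the names (design, sd_date, sd_plot, sd_date_plot) are invented values, not estimates from the question's data. Each random-effect term gets exactly one variance, shared by all dates, all plots, or all date-plot cells.

```r
set.seed(1)
n_date <- 12; n_plot <- 16; n_rep <- 4

# Same layout as the question: 4 measurements per plot on each date
design <- expand.grid(rep = 1:n_rep, plot = 1:n_plot, date = 1:n_date)

# One standard deviation per random-effect term (illustrative values)
sd_date <- 0.5; sd_plot <- 0.3; sd_date_plot <- 0.2

b_date      <- rnorm(n_date, 0, sd_date)   # same variance for every date
b_plot      <- rnorm(n_plot, 0, sd_plot)   # same variance for every plot
b_date_plot <- matrix(rnorm(n_date * n_plot, 0, sd_date_plot), n_date, n_plot)

# Linear predictor on the log scale: baseline plus the three random effects
eta <- log(2) + b_date[design$date] + b_plot[design$plot] +
  b_date_plot[cbind(design$date, design$plot)]

# Negative binomial counts with a single dispersion parameter throughout
design$response <- rnbinom(nrow(design), size = 0.5, mu = exp(eta))
```

Relaxing the assumptions would mean replacing, say, the single sd_date_plot with a separate value per date, which is where the number of variances to estimate grows quickly.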

One final thing to consider is that your experimental design in principle allows you to fit random effects of the form (treatment|plotID), i.e. looking at variation in treatment effects across plots. Schielzeth and Forstmeier 2008 comment that leaving these terms out (i.e. fitting random-intercept rather than random-slope models) may lead to overestimated treatment effects in some cases.

Related Solutions

R – Using lmer for Repeated-Measures Linear Mixed-Effect Model

I think that your approach is correct. Model m1 specifies a separate intercept for each subject. Model m2 adds a separate slope for each subject. Your slope is across days, as subjects only participate in one treatment group. If you write model m2 as follows, it's more obvious that you model a separate intercept and slope for each subject:

m2 <- lmer(Obs ~ Treatment * Day + (1+Day|Subject), mydata)

This is equivalent to:

m2 <- lmer(Obs ~ Treatment + Day + Treatment:Day + (1+Day|Subject), mydata)

I.e. the main effects of treatment, day and the interaction between the two.
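A quick base-R check of that equivalence, looking only at the fixed-effects part of the formula (the names f_star and f_long are made up for this sketch):

```r
# Fixed-effects parts of the two ways of writing m2
f_star <- Obs ~ Treatment * Day
f_long <- Obs ~ Treatment + Day + Treatment:Day

# Both expand to the same terms: two main effects and their interaction
attr(terms(f_star), "term.labels")
attr(terms(f_long), "term.labels")
identical(attr(terms(f_star), "term.labels"),
          attr(terms(f_long), "term.labels"))  # TRUE
```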

I think that you don't need to worry about nesting as long as you don't repeat subject IDs within treatment groups. Which model is correct really depends on your research question. Is there reason to believe that subjects' slopes vary in addition to the treatment effect? You could run both models and compare them with anova(m1,m2) to see if the data supports either one.

I'm not sure what you want to express with model m3. The nesting syntax uses a /, e.g. (1|group/subgroup).

I don't think that you need to worry about autocorrelation with such a small number of time points.

Solved – repeated-measures linear mixed-effect model

Is it correct to consider this as a nested two-level repeated measures ANOVA?

No, it is a linear mixed effects model.

Is it correct to use the following R syntax?

m1 <- lmer(CO2~days*treatment+(days|replicates),mydata)

It is correct if you wish to account for clustering, i.e. repeated measures, within each replicate, and allow the slope of days to vary between replicates.

Is this a random intercept and slope model with replicates nested in treatment?

It is a random intercept and slope model, but observations are nested within replicates.

Is it correct to say that this model accounts for:

the main effect of treatment and time and

the interaction between the two?

Yes, these are modeled as fixed effects.

What would be the difference between m1 and m2?

m2 <- lmer(CO2~days*treatment+(1|replicates),mydata)

m2 is a random intercept model: it accounts for clustering (repeated measures), but the slopes are assumed to be constant for each replicate. m1 allows the slope for days to vary between clusters (replicates in your case).
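The difference can be sketched with a small noise-free simulation in base R. This is only an illustration of the two assumptions, not a fit of either model; the number of replicates, the days, and the intercept and slope values are all invented.

```r
set.seed(42)
days  <- 0:5
n_rep <- 6

# Replicate-specific intercepts, used in both scenarios
int <- rnorm(n_rep, mean = 10, sd = 2)

# m2-style data: every replicate shares one slope for days
y_m2 <- sapply(int, function(a) a + 1.5 * days)

# m1-style data: each replicate also gets its own slope for days
slope <- rnorm(n_rep, mean = 1.5, sd = 0.5)
y_m1  <- mapply(function(a, b) a + b * days, int, slope)

# Per-replicate slopes recovered by separate regressions (columns = replicates)
slopes_m2 <- apply(y_m2, 2, function(y) coef(lm(y ~ days))[2])
slopes_m1 <- apply(y_m1, 2, function(y) coef(lm(y ~ days))[2])

range(slopes_m2)  # all replicates share the slope 1.5
range(slopes_m1)  # slopes differ between replicates
```

If the per-replicate slopes in real data look like the second case, the (days|replicates) term in m1 is the one that can absorb that variation.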
