Mixed Models – Minimum Repeated Measures and Levels per Nested Random Effect

Tags: lme4-nlme, mixed model, nested data, random-effects-model, repeated measures

I often read the guideline that a random factor should have at least 5-6 levels. However, it is not clear to me whether there is (i) a minimum number of levels for a nested factor within a block and (ii) a minimum number of measurements per individual.
For instance, I have a BACI (Before After Control Impact) model that is specified like this in the lme4 package in R:

lmer(y ~ treatment * period + (1| site/block/subject), data = mydata)

I have

2 treatment groups
2 periods (before and after exposure to the treatment/placebo)
64 subjects with 2 measurements (before and after exposure)
16 blocks (8 per treatment)
8 sites (with one treated and one control block).

This means, I have only

2 repeated measures per subject &
2 blocks per site.

Does the small number of blocks per site and repeated measures per subject present a problem when the total number of blocks (16) and subjects (64) is large? Or, more generally:

Q1: Is there a minimum number of levels of a nested random factor within the factor in which it is nested?
Q2: Is there a minimum number of repeated measures per subject?

My personal (layman’s) opinion:

I believe that the small number of blocks per site is not a problem, because this book chapter on models with multiple random effects shows an example of a random factor with 30 levels (samples) but only 3 within each block (batch). I also find this intuitive, because I imagine there are still 30 (and not just 3) values available to estimate the distribution of the effect, even if it has to be estimated relative to the batch. That is just my intuition, however; I have very little understanding of how it actually works. In addition, this article advocates the maximal random-effects structure justified by the design.
Similarly I would believe that it is okay to specify a random factor with only two data points per subject (but multiple levels).

However, I have no real understanding of this, and a colleague who teaches statistics courses told me not to. Since I have never read a guideline about this, I am asking here.

Best Answer

I agree with your reasoning, but it makes it easier to think about when we remember that:

(1| site/block/subject)

is the same as

(1| site) + (1|site:block) + (1|site:block:subject)

So the guideline on the minimum number of levels only matters for the "top" level, which is site in this case. Here we have 8 sites, so that is OK.

Because block and subject enter only in combination with site, the other two grouping terms can never have fewer levels than site. In your design they have 16 (site:block) and 64 (site:block:subject) levels, so all is good here.
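
To make the counting concrete, here is a minimal sketch in base R. It assumes a balanced layout with 4 subjects per block (64 subjects over 16 blocks; the question does not state that the split is even) and uses made-up integer labels. The names design, site_grp, block_grp and subject_grp are illustrative only and are not part of the original model.

```r
# Hypothetical balanced layout matching the counts in the question:
# 8 sites x 2 blocks per site x 4 subjects per block x 2 periods.
design <- expand.grid(
  period  = c("before", "after"),
  subject = 1:4,  # assumed: subject labels restart within each block
  block   = 1:2,  # assumed: block labels restart within each site
  site    = 1:8
)

# Grouping factors implied by (1 | site/block/subject), i.e.
# (1 | site) + (1 | site:block) + (1 | site:block:subject)
site_grp    <- factor(design$site)
block_grp   <- interaction(design$site, design$block, drop = TRUE)
subject_grp <- interaction(design$site, design$block, design$subject,
                           drop = TRUE)

# Number of levels each random-intercept term is estimated from
n_levels <- c(
  site    = nlevels(site_grp),
  block   = nlevels(block_grp),
  subject = nlevels(subject_grp)
)
print(n_levels)
```

The variance of each term is estimated from the number of levels of its own grouping factor across the whole data set, not from the 2 blocks found inside any single site.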

Related Solutions

R – Using lmer for Repeated-Measures Linear Mixed-Effect Model

I think that your approach is correct. Model m1 specifies a separate intercept for each subject. Model m2 adds a separate slope for each subject. Your slope is across days as subjects only participate in one treatment group. If you write model m2 as follows it's more obvious that you model a separate intercept and slope for each subject

m2 <- lmer(Obs ~ Treatment * Day + (1+Day|Subject), mydata)

This is equivalent to:

m2 <- lmer(Obs ~ Treatment + Day + Treatment:Day + (1+Day|Subject), mydata)

I.e. the main effects of treatment, day and the interaction between the two.
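
A quick way to check this equivalence yourself, using only base R's terms() on the two fixed-effects formulas (no data or model fitting needed; f_short and f_long are just illustrative names):

```r
# Fixed-effects part of the two ways of writing model m2
f_short <- Obs ~ Treatment * Day
f_long  <- Obs ~ Treatment + Day + Treatment:Day

# terms() expands the * operator into main effects plus interaction
labels_short <- attr(terms(f_short), "term.labels")
labels_long  <- attr(terms(f_long), "term.labels")

print(labels_short)

# TRUE when both formulas expand to the same set of terms
same_terms <- identical(labels_short, labels_long)
print(same_terms)
```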

I think that you don't need to worry about nesting as long as you don't repeat subject IDs within treatment groups. Which model is correct really depends on your research question. Is there reason to believe that subjects' slopes vary in addition to the treatment effect? You could run both models and compare them with anova(m1,m2) to see if the data supports either one.
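
As a rough sketch of such a comparison, the following uses the sleepstudy data shipped with lme4 as a stand-in, because the asker's mydata is not available. Reaction, Days and Subject are that data set's columns, not the question's variables, and the fixed part is reduced to a single slope since sleepstudy has no treatment factor.

```r
library(lme4)

# Stand-in data: sleepstudy ships with lme4 (18 subjects, 10 days each).
# Random intercept only: one baseline per subject, common slope.
m1 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Random intercept and slope: each subject gets its own Days slope.
m2 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# Likelihood-ratio comparison; by default anova() refits both
# models with ML instead of REML before comparing them.
cmp <- anova(m1, m2)
print(cmp)
```

A small p-value in the second row would suggest that letting the slopes vary by subject improves the fit for that data set; whether the same holds for your data has to be checked on your own models.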

I'm not sure what you want to express with model m3. The nesting syntax uses a /, e.g. (1|group/subgroup).

I don't think that you need to worry about autocorrelation with such a small number of time points.

Solved – repeated-measures linear mixed-effect model

Is it correct to consider this as a nested two-level repeated measures ANOVA?

No, it is a linear mixed effects model.

Is it correct to use the following R syntax?

m1 <- lmer(CO2~days*treatment+(days|replicates),mydata)

It is correct if you wish to account for clustering, i.e. repeated measures, within each replicate, and allow the slope of days to vary between replicates.

Is this a random intercept and slope model with replicates nested in treatment?

It is a random intercept and slope model, but observations are nested within replicates.

Is it correct to say that this model accounts for:

the main effect of treatment and time and

the interaction between the two?

Yes, these are modeled as fixed effects.

What would be the difference between m1 and m2?

m2 <- lmer(CO2~days*treatment+(1|replicates),mydata)

m2 is a random intercept model. It accounts for clustering (repeated measures), but the slope of days is assumed to be the same for each replicate. m1 allows the slope of days to vary between clusters (replicates in your case).
