In mixed models, the treatment of factors as either fixed or random, particularly in conjunction with whether they are crossed, partially crossed or nested, can lead to a lot of confusion. There also appear to be differences in terminology between what is meant by nesting in the ANOVA/designed-experiments world and in the mixed/multilevel-models world.
I don't profess to know all the answers, and my answer won't be complete (and may raise further questions), but I will try to address some of the issues here:
Does it make sense for a fixed effect to be nested within a random one, or how to code repeated measures in R (aov and lmer)?
(the question title)
No, I don't believe this makes sense. When we are dealing with repeated measures, whatever the thing is that the measures are repeated on will usually be random; let's just call it `Subject`, and in `lme4` we will want to include `Subject` on the right side of one or more `|` in the random part of the formula. If we have other random effects, then these are either crossed, partially crossed or nested, and my answer to this question addresses that.
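As a small illustration of the distinction (a toy sketch with made-up factors, not from any of the linked questions), `xtabs` makes the difference between crossed and nested grouping factors visible: a crossed factor fills the whole table, while a nested factor has exactly one non-zero cell per level.

```r
# Toy data (hypothetical): 'item' is crossed with 'subject' (every subject
# sees every item), while 'pup' is nested within 'subject' (each pup
# belongs to exactly one subject).
crossed <- expand.grid(subject = paste0("S", 1:3), item = paste0("I", 1:4))
nested  <- data.frame(subject = rep(paste0("S", 1:3), each = 2),
                      pup     = paste0("P", 1:6))

xtabs(~ subject + item, data = crossed)  # all cells non-zero: crossed
xtabs(~ subject + pup,  data = nested)   # one non-zero cell per pup: nested
```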
The issue with these anova-type designed experiments seems to be how to deal with factors that would normally be thought of as fixed, in a repeated measures situation, and the questions in the body of the OP speak to this:
Why `Error(subject/A)` and not `Error(subject)`?
I do not usually use `aov()`, so I could be missing something, but for me `Error(subject/A)` is very misleading in the case of the linked question. `Error(subject)` in fact leads to exactly the same results.
Is it `(1|subject)` or `(1|subject) + (1|A:subject)` or simply `(1|A:subject)`?
This relates to this question. In this case, all of the following random-effects formulations lead to exactly the same result:

- `(1|subject)`
- `(1|A:subject)`
- `(1|subject) + (1|A:subject)`
- `(1|subject) + (1|A:subject) + (1|B:subject)`
However, this is because the simulated dataset in the question has no variation within anything; it is just created with `Y = rnorm(48)`. If we take a real dataset such as the `cake` dataset in the `lme4` package, we find that this will not generally be the case. From the documentation, here is the experimental setup:
Data on the breakage angle of chocolate cakes made with three different recipes and baked at six different temperatures. This is a split-plot design with the recipes being whole-units and the different temperatures being applied to sub-units (within replicates). The experimental notes suggest that
the replicate numbering represents temporal ordering.
A data frame with 270 observations on the following 5 variables:

- `replicate`: a factor with levels 1 to 15
- `recipe`: a factor with levels A, B and C
- `temperature`: an ordered factor with levels 175 < 185 < 195 < 205 < 215 < 225
- `temp`: numeric value of the baking temperature (degrees F)
- `angle`: a numeric vector giving the angle at which the cake broke
So we have repeated measures within `replicate`, and we are also interested in the fixed factors `recipe` and `temperature` (we can ignore `temp`, since it is just a different coding of `temperature`). We can visualise the situation using `xtabs`:
```
> xtabs(~recipe+replicate,data=cake)
      replicate
recipe 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
     A 6 6 6 6 6 6 6 6 6  6  6  6  6  6  6
     B 6 6 6 6 6 6 6 6 6  6  6  6  6  6  6
     C 6 6 6 6 6 6 6 6 6  6  6  6  6  6  6
```
If `recipe` were a random effect, we would say that these are crossed random effects: in no way does recipe A belong to replicate 1 or to any other replicate.
```
> xtabs(~temp+replicate,data=cake)
    replicate
temp 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
 175 3 3 3 3 3 3 3 3 3  3  3  3  3  3  3
 185 3 3 3 3 3 3 3 3 3  3  3  3  3  3  3
 195 3 3 3 3 3 3 3 3 3  3  3  3  3  3  3
 205 3 3 3 3 3 3 3 3 3  3  3  3  3  3  3
 215 3 3 3 3 3 3 3 3 3  3  3  3  3  3  3
 225 3 3 3 3 3 3 3 3 3  3  3  3  3  3  3
```
Similarly for `temp`.
So the first model we might fit is:
```
> lmm1 <- lmer(angle ~ recipe * temperature + (1|replicate), cake, REML = FALSE)
```
This will treat each `replicate` as the only source of random variation (other than the residual, of course). But there could be random differences between recipes, so we might be tempted to include `recipe` as another (crossed) random effect. That would be ill-advised, however, because we have only 3 levels of `recipe`, so we can't expect the model to estimate the variance component well. Instead, we can use `replicate:recipe` as the grouping variable, which lets us treat each combination of replicate and recipe as a separate level of the grouping factor. So whereas the above model has 15 random intercepts for the levels of `replicate`, we will now have 45 random intercepts, one for each combination:
```r
lmm3 <- lmer(angle ~ recipe * temperature + (1|replicate:recipe), cake, REML = FALSE)
```
Note that we now get (very slightly) different results, indicating that there is some random variability due to recipe, but not a great deal.
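To put a number on "not a great deal", a quick sketch compares the two fits with `AIC` and inspects the grouping structure with `ngrps`; this assumes the `lme4` package is installed (the `cake` data ship with it).

```r
library(lme4)

# Refit both models from above on the cake data
lmm1 <- lmer(angle ~ recipe * temperature + (1|replicate), cake, REML = FALSE)
lmm3 <- lmer(angle ~ recipe * temperature + (1|replicate:recipe), cake, REML = FALSE)

# Both models have the same number of parameters, so the AICs are
# directly comparable
AIC(lmm1, lmm3)

# The grouping structure differs: 15 replicates vs 45 replicate:recipe cells
ngrps(lmm1)
ngrps(lmm3)
```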
We could likewise do the same thing with `temperature`.
Now, going back to your question, you also ask:

Why `(1|subject) + (1|A:subject)` and not `(1|subject) + (0+A|subject)` or even simply `(A|subject)`?
I'm not entirely sure where this (using random slopes) comes from; it doesn't seem to arise in the two linked questions. But my problem with `(1|subject) + (1|A:subject)` is that it is exactly the same as `(1|subject/A)`, which means that `A` is nested within `subject`, which in turn means (to me) that each level of `A` occurs in one and only one level of `subject`, which is clearly not the case here.
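That nesting claim can be checked directly. Here is a minimal base-R sketch with made-up data mimicking a repeated-measures layout: a factor is nested in another only when each of its levels appears under exactly one level of the outer factor (`lme4` also provides `isNested()` for the same check).

```r
# TRUE only if each level of 'inner' occurs under exactly one level of 'outer'
is_nested <- function(inner, outer) {
  all(rowSums(table(inner, outer) > 0) == 1)
}

# Hypothetical repeated-measures layout: every subject sees both levels of A
subject <- rep(paste0("S", 1:4), each = 2)
A       <- rep(c("a1", "a2"), times = 4)

is_nested(A, subject)  # FALSE: A is crossed with subject, not nested
```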
I will probably add to and/or edit this answer after I've thought about it some more, but I wanted to get my initial thoughts down.
It appears that you have a case of a partially crossed, partially nested design because, if I understand correctly, `day` and `cond` are crossed (i.e. neither is nested in the other), while both appear to be nested within `subject`. `measurement` is an ID variable that indexes the measurement occasion on each day and within each condition, and as such it should not be treated as a random factor, because there is only one observation of the dependent variable for each measurement occasion. Even though they are indexed as 1-4 for each day/condition, they are different measurements (that is, measurement 1 for day 1, condition 0 and measurement 1 for day 1, condition 1 are not the same measurement), and therefore there can be no random variation in it. If you specified it as random with the data coded as above, it would be a mistake.
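A toy tabulation (made-up data following the question's variable names) shows why: every subject/day/cond cell contains exactly one row per measurement index, so there is no replication from which to estimate a variance for `measurement`.

```r
# Hypothetical layout: 3 subjects x 2 days x 2 conditions x 4 measurements
d <- expand.grid(subject = paste0("S", 1:3), day = 1:2,
                 cond = 0:1, measurement = 1:4)

# One observation per measurement index within each subject/day/cond cell
table(d$measurement, interaction(d$subject, d$day, d$cond))
```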
If this is the case, then `lme` is unable to fit such a model, and you could use something like `lme4` instead. You could specify the structure in `lme4` as follows:

```
DV ~ 1 + (1|subject) + (1|day) + (1|cond) + (1|subject:day) + (1|subject:cond)
```
If `measurement` is a measure of time within each `day` or `cond` and you expect some temporal effect, then you could include `measurement` as a fixed effect (and also potentially fit random slopes for it, if the data supported such a model).
However, fitting a model with random intercepts for `day` and `cond` would not be a good idea, because you have only 2 levels of each, so you would be asking the software to estimate the variance of a normally distributed variable from only 2 observations, which does not make sense. So a better way forward is to treat `day` and `cond` as fixed effects, and simply fit random intercepts for `subject`:

```
DV ~ day + cond + (1|subject)
```
The fact that `day` and `cond` were randomly assigned is not relevant.
The same comment as above applies to `measurement` again here. That is, you might want to fit

```
DV ~ day + cond + measurement + (1|subject)
```

and again, you could also have random slopes for `day` and/or `cond` and/or `measurement`, if suggested by domain theory and supported by the data.
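As a hedged sketch of what those models might look like in `lme4` (the data here are simulated noise purely so the calls run; nothing about the fits is meaningful):

```r
library(lme4)
set.seed(1)

# Simulated stand-in for the question's data (column names from the question)
dat <- expand.grid(subject = factor(1:20), day = factor(1:2),
                   cond = factor(0:1), measurement = 1:4)
dat$DV <- rnorm(nrow(dat))

# Random intercepts only, as suggested above
m0 <- lmer(DV ~ day + cond + measurement + (1 | subject), data = dat)

# Adding a random slope for cond, if theory and the data support it
# (with pure noise this will typically be a singular fit, which lmer reports)
m1 <- lmer(DV ~ day + cond + measurement + (1 + cond | subject), data = dat)
```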
Of course, now that we have discarded `day` and `cond` as random effects, you can go back to the `nlme` package if you wish (although `lme4` is really the successor to `nlme` for most cases).
Best Answer
Note: I have changed `patient` to `subject` in your question, to be consistent with the data excerpt. I hope that is correct.

For your first question:
The problem with this is that you are specifying a random effect (in particular, random intercepts) for the `roi:subroi` interaction. However, this has only 4 levels, which is insufficient, at least in a frequentist framework such as `lmer`, as its variance will be estimated poorly and numerical problems may also occur. Furthermore, this interaction is not specified as a fixed effect (which would not be possible anyway in the presence of `roi` as a fixed effect, due to the separation of levels between `roi` and `subroi`), so it would force the interaction to have an overall mean of zero (since the random effects have a mean of zero). So it would be better to remove `(1|roi:subroi)`.

Neither model makes sense to me. For one thing, you are losing power by splitting the dataset. For another, with `lmer(activity ~ cond + (1|subject), data=dat)` you already have an estimate for the fixed effect of `cond`, along with its interaction with `roi`, from your original model, so this won't tell you anything new. For `lmer(activity ~ cond*roi + (1|subject), data=dat)`, since there is now only 1 level of `roi`, this will generate an error along the lines of `contrasts can be applied only to factors with 2 or more levels`, and if you remove `roi` from that model then my previous comment also applies. I wonder if you perhaps meant `lmer(activity ~ cond*subroi + (1|subject), data=dat)`, which would make more sense.
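A hedged sketch of that suggested model (the data frame here is simulated just so the call runs; the column names follow the question, and the sub-ROI labels are made up):

```r
library(lme4)
set.seed(1)

# Hypothetical stand-in for the question's data: 2 sub-ROIs per ROI
dat <- expand.grid(subject = factor(1:10), cond = factor(c("c1", "c2")),
                   subroi = factor(c("r1a", "r1b", "r2a", "r2b")))
dat$roi      <- factor(ifelse(dat$subroi %in% c("r1a", "r1b"), "roi1", "roi2"))
dat$activity <- rnorm(nrow(dat))

# cond crossed with subroi as fixed effects, random intercepts for subject
m <- lmer(activity ~ cond * subroi + (1 | subject), data = dat)
```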