Nesting random effect within fixed effect using lmer() of lme4 in R

lme4-nlme, mixed model, r, random-effects-model

Problem
I want to fit a model using the R lme4 lmer function, and I'm not sure how to specify a random effect that is nested within a fixed effect.

Setup
I apply a Treatment (fixed effect) to each subject, who is then prompted to speak a word that uses exactly one of the 4 Mandarin tones (Tone effect, fixed). The subject's response time, RT, is measured as the response variable.

Each tone is elicited 6 times, once with each of 6 different, predetermined words (Word effect, random). No word is shared between tones (each word uses exactly one of the 4 tones), so 24 words are used in total.

Question
Below is the syntax I was thinking of for my model. Is this the right way to specify the described nesting of Word within Tone? In particular, do I need to specify the association Tone:Word as the random variable, or should I simply use Word by itself?

RT ~ 1 + Treatment*Tone + (1+ Treatment*Tone | Tone:Word)

Best Answer

  • I'm assuming that you have multiple subjects, that each subject gets exactly one treatment, and that each subject gets multiple words and tones. (Every subject getting all of the words and all of the tones is the cleanest design; a somewhat unbalanced design will lose power but not mess things up too badly.)
  • There is no way to determine whether the effect of tone varies across words, since each word uses a single tone, but you can tell whether the effect of tone varies across subjects, since each subject gets multiple tones.
  • The opposite situation holds for treatments: each word is observed under multiple treatments, but each subject is observed under a single treatment.
  • Assuming there is a single measurement per Subject-Word combination, you shouldn't include a (1|Subject:Word) term: that variation is already captured by the residual variance term.
  • Since words are unique (no word label is reused across tones), (1|Word) and (1|Tone:Word) define the same groups, so you don't need to code Word as explicitly nested within Tone - see this answer.
  • Since response time is an intrinsically positive variable, you might want to consider log(RT) (for a linear model) or a GLMM with a Gamma response distribution. This depends on your data, though: the conditional distribution might be adequately Normal/homoscedastic.
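The design assumptions in the list above can be checked directly on the data before fitting anything. The following is a minimal base-R sketch on a made-up, fully balanced design; the 20 subjects, the two treatment labels, and all object names are assumptions for illustration, not details from the question.

```r
# Hypothetical design: 4 tones x 6 words per tone = 24 unique words,
# 20 subjects, 2 treatments assigned between subjects (all assumed).
words <- data.frame(
  Word = factor(sprintf("w%02d", 1:24)),
  Tone = factor(rep(paste0("T", 1:4), each = 6))
)
subjects <- data.frame(
  Subject   = factor(sprintf("s%02d", 1:20)),
  Treatment = factor(rep(c("control", "treated"), times = 10))
)
# No shared columns, so merge() returns the cross join:
# every subject responds once to every word (20 * 24 = 480 rows).
design <- merge(subjects, words)

# Each word appears under exactly one tone ...
tones_per_word <- rowSums(table(design$Word, design$Tone) > 0)
# ... so Tone:Word defines the same 24 groups as Word alone.
n_word_groups      <- nlevels(design$Word)
n_tone_word_groups <- nlevels(interaction(design$Tone, design$Word, drop = TRUE))

# Each subject sees all four tones but only one treatment.
tones_per_subject      <- rowSums(table(design$Subject, design$Tone) > 0)
treatments_per_subject <- rowSums(table(design$Subject, design$Treatment) > 0)

# Each word is observed under both treatments.
treatments_per_word <- rowSums(table(design$Word, design$Treatment) > 0)

# One observation per Subject-Word cell, so a (1|Subject:Word) term
# would be indistinguishable from the residual.
max_cell_count <- max(table(design$Subject, design$Word))
```

Running the same tabulations on the real data frame shows which random slopes the design can support: a slope can only vary across a grouping factor whose levels each see more than one level of the predictor.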

Thus I think the maximal model (allowing for all interactions that can be estimated from the design) is

RT ~ Treatment*Tone+(Treatment|Word)+(Tone|Subject)

(you can include the implicit intercept terms, e.g. (1+Treatment|Word), if it's clearer for you, but you'll get the same model either way)
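As a rough illustration of the equivalence between the implicit and explicit intercept forms, here is a sketch that fits both to simulated data with lme4. The sample sizes, effect sizes, and treatment labels are all invented; because the simulated data contain only random intercepts, lmer() will probably report a singular fit, which is expected here and not a problem for the comparison.

```r
library(lme4)

set.seed(1)
# Hypothetical data: 30 subjects, 24 words (6 per tone),
# 2 between-subject treatments. All values are assumptions.
d <- expand.grid(Subject = factor(1:30), Word = factor(1:24))
d$Tone      <- factor((as.integer(d$Word) - 1) %/% 6 + 1)  # words 1-6 -> tone 1, etc.
d$Treatment <- factor(ifelse(as.integer(d$Subject) %% 2 == 0, "treated", "control"))

# Simulated response times (ms): subject and word intercepts plus noise.
subj_eff <- rnorm(30, sd = 50)
word_eff <- rnorm(24, sd = 30)
d$RT <- 700 + subj_eff[as.integer(d$Subject)] +
  word_eff[as.integer(d$Word)] + rnorm(nrow(d), sd = 80)

# Implicit intercepts, as written in the answer.
m_implicit <- lmer(RT ~ Treatment * Tone + (Treatment | Word) + (Tone | Subject),
                   data = d)
# Explicit intercepts: the same model spelled out.
m_explicit <- lmer(RT ~ 1 + Treatment * Tone + (1 + Treatment | Word) +
                     (1 + Tone | Subject), data = d)

# Both fits should agree on fixed effects and log-likelihood.
fixef_match  <- isTRUE(all.equal(fixef(m_implicit), fixef(m_explicit),
                                 tolerance = 1e-6))
loglik_match <- isTRUE(all.equal(as.numeric(logLik(m_implicit)),
                                 as.numeric(logLik(m_explicit)),
                                 tolerance = 1e-6))
```

VarCorr(m_implicit) then shows the estimated 2x2 covariance block for Word and the 4x4 block for Subject.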

Note that there is some controversy about whether you should start with the maximal model or cut the model down to something that is reasonable given the size of the experiment. The number of subjects is important here: the variance-covariance matrix for the tone effects is 4x4 (10 parameters: 4 variances and 6 covariances), so if you had fewer than 50 subjects you might not want to try that. Barr et al. (2013) argue that you should go with the maximal model; Bates et al. argue that you shouldn't.
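The parameter count quoted above follows from the fact that an unstructured k x k covariance matrix has k(k+1)/2 free parameters. A small base-R sketch of that arithmetic is below; the helper name n_cov_params, the assumption of 2 treatment levels, and the intercept-only formula are illustrative choices, not something the answer prescribes.

```r
# Free parameters in an unstructured k x k covariance matrix:
# k variances plus k * (k - 1) / 2 covariances.
n_cov_params <- function(k) k * (k + 1) / 2

# (Tone | Subject): intercept + 3 tone contrasts -> 4 x 4 matrix.
tone_by_subject <- n_cov_params(4)   # 10

# (Treatment | Word): intercept + 1 contrast, ASSUMING 2 treatment levels.
treatment_by_word <- n_cov_params(2) # 3

# Total variance-covariance parameters in the maximal model
# (excluding the residual variance).
maximal_total <- tone_by_subject + treatment_by_word

# One possible cut-down alternative (an assumption, not the answer's
# recommendation): random intercepts only, i.e. 1 + 1 parameters.
maximal <- RT ~ Treatment * Tone + (Treatment | Word) + (Tone | Subject)
reduced <- RT ~ Treatment * Tone + (1 | Word) + (1 | Subject)
reduced_total <- n_cov_params(1) + n_cov_params(1)
```

Comparing these totals with the number of subjects and words gives a quick sense of how much the design is being asked to estimate.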
