As you've described the study, trial is nested within block, but block isn't nested within subject. That is, trial 3 is a different question in blocks 1 and 2, but block 3 is the same set of 8 questions for every subject. Hence, a natural way to structure the random effects would be one random intercept per subject, plus 8N per-trial random intercepts nested into N batches of 8, where N is the number of blocks. Or, if N is small, you could treat block as a fixed effect and have a single batch of 8N per-trial random intercepts (plus the aforementioned per-subject intercepts).
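A minimal sketch of these two structures in `lmer` syntax (the grouping-variable names `Subject`, `Block`, and `Trial`, the data frame `d`, and the placeholder fixed effects are assumptions, not taken from the study):

```r
library(lme4)

# Trial nested within block: per-subject intercepts plus trial
# intercepts nested within block. (1 | Block/Trial) expands to
# (1 | Block) + (1 | Block:Trial), where Block:Trial has one level
# per trial-within-block combination.
m1 <- lmer(outcome ~ 1 + predictors + (1 | Subject) + (1 | Block/Trial),
           data = d)

# With few blocks, treat block as fixed and keep a single batch of
# per-trial random intercepts (plus the per-subject intercepts).
m2 <- lmer(outcome ~ 1 + Block + predictors +
             (1 | Subject) + (1 | Block:Trial),
           data = d)
```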
You asked what the difference is between fancy random-effects structures like these and Cartesian-producting all of a study's dummy variables together to get one big batch of random effects (`new variable`). The difference is that each batch of random effects has its variance estimated separately, and that separate batches are obliged to combine additively. (And, of course, the more random effects you have, the harder it is to estimate each.) To use a simpler example, imagine a model where each subject is a child and you have dummy variables for the child's father and mother. Assume the dataset has a lot of half-siblings in it, so that mother and father effects are distinguishable. If you say
lmer(outcome ~ 1 + fixed effects + (1|Mother) + (1|Father))
then the model is allowed to believe, e.g., that the effects of fathers vary more than the effects of mothers. On the other hand, if you make each mother–father pair its own value of a single dummy variable, and say
lmer(outcome ~ 1 + fixed effects + (1|new variable))
then `new variable` gets only one variance. Also, whereas this model allows for arbitrarily complicated interactions between mother and father, the first model postulates that the effects are purely additive. And if $M$ is the number of mothers and $F$ the number of fathers, the first model has $M + F$ distinct random effects while the second has $MF$.
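As a hedged illustration of the contrast (the column names `Mother` and `Father`, the fixed effect `x`, and the data frame `d` are made up for this sketch), the second specification just crosses the two factors into one grouping variable:

```r
library(lme4)

# Additive model: two variance components, M + F random effects.
m_additive <- lmer(outcome ~ 1 + x + (1 | Mother) + (1 | Father),
                   data = d)

# One big batch: each observed mother-father pair is its own level,
# so a single variance covers up to M * F effects, and arbitrary
# mother-by-father interactions are absorbed into the pair effects.
d$Pair <- interaction(d$Mother, d$Father, drop = TRUE)
m_pairs <- lmer(outcome ~ 1 + x + (1 | Pair), data = d)
```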
Finally, I don't think it's wise to consider `RT` and `Correct` in completely separate models. Shouldn't whether people answer a question correctly be related to how quickly they answer it?
Best Answer
Thanks to @amoeba, and using @BenBolker's brief remark here, we got to the bottom of the problem. The solution is the following.
The random-effects section of `summary(model)` (output not shown) now makes perfect sense: the only correlations that can be calculated are those between the intercept and the slope for a given condition (because they are estimated for the same trials). Correlations between conditions would be meaningless.
Furthermore, `coef(model)$Trial` now shows sensible values (output not shown).

N.B. When specifying random effects for this purpose, it is important to:
1. Put the condition dummy before `|Trial` in each random term. If you don't do that, `lme4` will estimate random effects for all trials, not just for the given condition.
2. Suppress the intercept with `0 +`, which keeps `lme4` from including a general random intercept across all trials.
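Both points can be sketched with `lme4`'s `dummy()` helper (the factor `Condition` with levels `"A"` and `"B"`, the within-trial predictor `x`, and the data frame `d` are illustrative assumptions, not the original variables):

```r
library(lme4)

# One random term per condition: the dummy before "| Trial" restricts
# the term to that condition's trials (point 1), and "0 +" suppresses
# the general random intercept across all trials (point 2). Within
# each term, lme4 still estimates the intercept-slope correlation.
model <- lmer(outcome ~ 1 + Condition * x +
                (0 + dummy(Condition, "A") + dummy(Condition, "A"):x | Trial) +
                (0 + dummy(Condition, "B") + dummy(Condition, "B"):x | Trial),
              data = d)
```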