I recently measured how the meaning of a new word is acquired over repeated exposures (practice: day 1 to day 10) by recording ERPs (EEG) when the word was viewed in different contexts. I also controlled properties of the context, for instance its usefulness for discovering the new word's meaning (high vs. low). I am particularly interested in the effect of practice (days). Because individual ERP recordings are noisy, ERP component values are obtained by averaging over the trials of a particular condition. With the `lmer` function, I applied the following formulas:

```
lmer(ERPindex ~ practice*context + (1|participants), data=base)
```

and

```
lmer(ERPindex ~ practice*context + (1+practice|participants), data=base)
```

I've also seen the equivalent of the following random effects in the literature:

```
lmer(ERPindex ~ practice*context + (practice|participants) +
(practice|participants:context), data=base)
```

What is accomplished by using a random factor of the form `participants:context`? Is there a good source that would allow someone with just a cursory knowledge of matrix algebra to understand precisely what random factors do in linear mixed models and how they should be selected?

## Best Answer

I'm going to describe the model that each of your calls to `lmer()` fits, explain how they differ, and then answer your final question about selecting random effects.

Each of your three models contains fixed effects for `practice`, `context`, and the interaction between the two; the random effects differ between the models.

Your first model, with random effects `(1|participants)`, contains a random intercept shared by observations that have the same value of `participants`. That is, each participant's regression line is shifted up or down by a random amount with mean $0$.

Your second model, with random effects `(1+practice|participants)`, contains a random slope in `practice` in addition to the random intercept. This means that the rate at which individuals learn from practice differs from person to person. An individual with a positive random slope improves with practice more quickly than average, while an individual with a negative random slope learns less quickly than average, or possibly gets worse with practice, depending on the variance of the random effect (this assumes the fixed effect of practice is positive).

Your third model fits a random slope and intercept in `practice` (you have to write `(practice-1|...)` to suppress the intercept), just as the previous model did, but now you've also added a random slope and intercept in the factor `participants:context`. This is a new factor whose levels are every combination of the levels of `participants` and `context`, and the corresponding random effects are shared by observations that have the same value of both `participants` and `context`. To fit this model you need multiple observations with the same values of both `participants` and `context`; otherwise the model is not estimable. In many situations, the groups created by this interaction variable are very sparse and result in very noisy, difficult-to-fit random effects models, so you want to be careful when using an interaction factor as a grouping variable.

Basically (read: without getting too complicated), random effects should be used when you think the grouping variables define "pockets" of inhomogeneity in the data set, or that individuals sharing a level of the grouping factor should be correlated with each other (while individuals that do not should be uncorrelated); the random effects accomplish this. If you think observations that share levels of both `participants` and `context` are more similar than the sum of the two parts would suggest, then including the "interaction" random effect may be appropriate.

Edit: As @Henrik mentions in the comments, a specification like the one in your second model, `(1+practice|participants)`, makes the random slope and random intercept correlated with each other, and that correlation is estimated by the model. To constrain the model so that the random slope and random intercept are uncorrelated (and therefore independent, since they are normally distributed), you'd instead specify the random effects as `(1|participants) + (0+practice|participants)`.
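To make these random-effects structures concrete, here is a small base-R sketch of the data-generating process behind `(1+practice|participants)`; all parameter values and helper names below are invented for illustration, not taken from the question's data:

```r
# A minimal sketch of the data-generating process for
#   ERPindex ~ practice + (1 + practice | participants)
# All parameter values here are invented for illustration.
set.seed(1)
n_participants <- 20
days <- 1:10

beta0 <- 2.0   # population (fixed-effect) intercept
beta1 <- 0.5   # population (fixed-effect) practice slope

u0 <- rnorm(n_participants, mean = 0, sd = 1.0)  # per-participant random intercepts
u1 <- rnorm(n_participants, mean = 0, sd = 0.3)  # per-participant random slopes

sim <- expand.grid(participant = 1:n_participants, practice = days)
sim$ERPindex <- (beta0 + u0[sim$participant]) +
                (beta1 + u1[sim$participant]) * sim$practice +
                rnorm(nrow(sim), sd = 0.5)       # residual noise

# Each participant's line has its own intercept and slope:
per_participant <- sapply(split(sim, sim$participant),
                          function(d) coef(lm(ERPindex ~ practice, data = d)))

# The grouping factor participants:context has one level for every
# combination of participant and context (here 20 x 2 = 40 levels):
sim$context <- rep(c("high", "low"), length.out = nrow(sim))
nlevels(interaction(sim$participant, sim$context))
```

The last two lines also show why the interaction grouping factor can be sparse: its random effects are only estimable when each participant-by-context combination is observed repeatedly.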

The choice between these two should be based on whether you think, for example, that participants with a higher baseline than average (i.e., a positive random intercept) are also likely to have a higher rate of change than average (i.e., a positive random slope). If so, you'd allow the two to be correlated, whereas if not, you'd constrain them to be independent. (Again, this example assumes the fixed-effect slope is positive.)
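That correlation can be sketched in base R; the values of `rho`, `sd0`, and `sd1` below are invented for illustration:

```r
# Sketch of correlated random intercepts and slopes; rho, sd0 and sd1 are
# invented values, not estimates from the question's data.
set.seed(2)
n   <- 10000
rho <- 0.6           # correlation estimated by (1 + practice | participants)
sd0 <- 1.0           # SD of random intercepts
sd1 <- 0.3           # SD of random slopes

z  <- matrix(rnorm(2 * n), ncol = 2)
u0 <- sd0 * z[, 1]                                     # random intercepts
u1 <- sd1 * (rho * z[, 1] + sqrt(1 - rho^2) * z[, 2])  # correlated random slopes

# With rho > 0, participants with above-average baselines (u0 > 0) tend to
# also have above-average rates of change (u1 > 0):
cor(u0, u1)          # close to 0.6

# The specification (1|participants) + (0+practice|participants) instead
# corresponds to rho = 0, i.e. drawing u0 and u1 independently.
```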