The formula one needs to specify for training a multilevel model (using lmer
from lme4
R
library) always gets me. I have read countless textbooks and tutorials, but never properly understood it.
So here's an example from this tutorial that I would like to see formulated in an equation. We are trying to model voice frequency as a function of gender (females have higher pitched voice than males in general) and attitude of the person (whether he/she answered in a polite or informal manner) in different scenarios. Also, as you can see from the subject
column, each person was subjected to measurements several times.
> head(politeness, n=20)
subject gender scenario attitude frequency
1 F1 F 1 pol 213.3
2 F1 F 1 inf 204.5
3 F1 F 2 pol 285.1
4 F1 F 2 inf 259.7
5 F1 F 3 pol 203.9
6 F1 F 3 inf 286.9
7 F1 F 4 pol 250.8
8 F1 F 4 inf 276.8
9 F1 F 5 pol 231.9
10 F1 F 5 inf 252.4
11 F1 F 6 pol 181.2
12 F1 F 6 inf 230.7
13 F1 F 7 inf 216.5
14 F1 F 7 pol 154.8
15 F3 F 1 pol 229.7
16 F3 F 1 inf 237.3
17 F3 F 2 pol 236.8
18 F3 F 2 inf 251.0
19 F3 F 3 pol 267.0
20 F3 F 3 inf 266.0
subject
, gender
and attitude
are factors (with informal
and female
considered as base levels for attitude
and gender
in equations below). Now, one idea is to train a model with differing intercepts for each subject
and scenario
:
politeness.model=lmer(frequency ~ attitude + gender +
(1|subject) + (1|scenario), data=politeness)
If my understanding of the notation is correct, this corresponds to:
$y_i=a^1_{j[i]}+a^2_{k[i]}+\beta\cdot$ attitude
$_{\text{pol}_i} + \gamma\cdot$ gender
$_{\text{male}_i}$
where $i$ denotes $i^{th}$ data point, $j[i]$ denotes group level for subject
and $k[i]$ denotes group level for scenario
for $i^{th}$ data point. attitude
$_\text{pol}$ and gender
$_\text{male}$ are an binary indicators.
To introduce random slopes for attitude, we can write:
politeness.model = lmer(frequency ~ attitude + gender +
(1+attitude|subject) + (1+attitude|scenario), data=politeness)
Again, if my understanding is clear, this corresponds to:
$y_i = a^1_{j[i]} + a^2_{k[i]} + (\beta^1_{j[i]} + \beta^2_{k[i]})\cdot$ attitude
$_{\text{pol}_i} + \gamma\cdot$ gender
$_{\text{male}_i}$
Now, what equation does the following R
command correspond to?
politeness.null = lmer(frequency ~ gender +
(1+attitude|subject) + (1+attitude|scenario), data=politeness)
Best Answer
I would write
as
$$ y_i \sim \beta_0 + \beta_1 \cdot I(\textrm{attitude}=\textrm{pol}) + \beta_2 I(\textrm{gender}=\textrm{male}) + b_{1,j[i]} + b_{2,k[i]} + \epsilon_i \\ b_1 \sim N(0,\sigma^2_1) \\ b_2 \sim N(0,\sigma^2_2) \\ \epsilon \sim N(0,\sigma^2_r) $$ where $\beta$ indicates a fixed-effect coefficient, $b$ indicates a random variable, $I$ is an indicator function (this is basically the same as what you said above, just slightly different notation).
adds among-subject variation in response to
attitude
andscenario
(we could equivalently write the random-effects part as(attitude|subject) + (attitude|scenario)
, i.e. leaving the intercept implicit; this is a matter of taste). Now$$ y_i \sim \beta_0 + \beta_1 \cdot I(\textrm{attitude}=\textrm{pol}) + \beta_2 I(\textrm{gender}=\textrm{male}) + \\ b_{1,j[i]} + b_{3,j[i]} I(\textrm{attitude}=\textrm{pol}) + b_{2,k[i]} + b_{4,k[i]} I(\textrm{attitude}=\textrm{pol}) + \epsilon_i \\ \{b_1,b_3\} \sim \textrm{MVN}({\mathbf 0},\Sigma_1) \\ \{b_2,b_4\} \sim \textrm{MVN}({\mathbf 0},\Sigma_2) \\ \epsilon \sim N(0,\sigma^2_r) $$ where $\Sigma_1$ and $\Sigma_2$ are unstructured variance-covariance matrices, i.e. they are symmetric and positive (semi)definite but have no other constraints: $$ \Sigma_1 = \left( \begin{array}{cc} \sigma^2_1 & \sigma_{13} \\ \sigma_{13} & \sigma^2_3 \end{array} \right) $$ and similarly for $\Sigma_2$.
It might be instructive to group terms as follows: $$ y_i \sim (\beta_0 + b_{1,j[i]} + b_{2,k[i]}) + \\ ( \beta_1 + b_{3,j[i]} + b_{4,k[i]}) \cdot I(\textrm{attitude}=\textrm{pol}) + \beta_2 I(\textrm{gender}=\textrm{male}) + \epsilon_i $$ so you can see which random effects are affecting the intercept and which are affecting the response to attitude.
Now if you leave out the fixed-effect
attitude
term (i.e. set $\beta_1=0$, or drop theattitude
term from the formula) you can see (without rewriting everything) that, because the random effects are assumed to have zero mean, we will be assuming that the average response to attitude across subjects and scenarios will be exactly zero, while there is still variation among subjects and scenarios. I won't say this never makes sense from a statistical point of view, but it rarely does. There are discussions of this issue on the r-sig-mixed-models@r-project.org mailing list from time to time ... (or it may be discussed on StackExchange somewhere -- if not, it would make a good follow-up SE question ...)