Solved – Notation for multilevel modeling

lme4-nlmemultilevel-analysisr

The formula one needs to specify for training a multilevel model (using lmer from lme4 R library) always gets me. I have read countless textbooks and tutorials, but never properly understood it.

So here's an example from this tutorial that I would like to see formulated in an equation. We are trying to model voice frequency as a function of gender (females have higher pitched voice than males in general) and attitude of the person (whether he/she answered in a polite or informal manner) in different scenarios. Also, as you can see from the subject column, each person was subjected to measurements several times.

> head(politeness, n=20)
   subject gender scenario attitude frequency
1       F1      F        1      pol     213.3
2       F1      F        1      inf     204.5
3       F1      F        2      pol     285.1
4       F1      F        2      inf     259.7
5       F1      F        3      pol     203.9
6       F1      F        3      inf     286.9
7       F1      F        4      pol     250.8
8       F1      F        4      inf     276.8
9       F1      F        5      pol     231.9
10      F1      F        5      inf     252.4
11      F1      F        6      pol     181.2
12      F1      F        6      inf     230.7
13      F1      F        7      inf     216.5
14      F1      F        7      pol     154.8
15      F3      F        1      pol     229.7
16      F3      F        1      inf     237.3
17      F3      F        2      pol     236.8
18      F3      F        2      inf     251.0
19      F3      F        3      pol     267.0
20      F3      F        3      inf     266.0

subject, gender and attitude are factors (with informal and female considered as base levels for attitude and gender in equations below). Now, one idea is to train a model with differing intercepts for each subject and scenario:

politeness.model=lmer(frequency ~ attitude + gender + 
 (1|subject) + (1|scenario), data=politeness)

If my understanding of the notation is correct, this corresponds to:

$y_i=a^1_{j[i]}+a^2_{k[i]}+\beta\cdot$ attitude$_{\text{pol}_i} + \gamma\cdot$ gender$_{\text{male}_i}$

where $i$ denotes $i^{th}$ data point, $j[i]$ denotes group level for subject and $k[i]$ denotes group level for scenario for $i^{th}$ data point. attitude$_\text{pol}$ and gender$_\text{male}$ are an binary indicators.

To introduce random slopes for attitude, we can write:

politeness.model = lmer(frequency ~ attitude + gender + 
 (1+attitude|subject) + (1+attitude|scenario), data=politeness)

Again, if my understanding is clear, this corresponds to:

$y_i = a^1_{j[i]} + a^2_{k[i]} + (\beta^1_{j[i]} + \beta^2_{k[i]})\cdot$ attitude$_{\text{pol}_i} + \gamma\cdot$ gender$_{\text{male}_i}$

Now, what equation does the following R command correspond to?

politeness.null = lmer(frequency ~ gender +
 (1+attitude|subject) +  (1+attitude|scenario), data=politeness)

Best Answer

I would write

~ attitude + gender + (1|subject) + (1|scenario)

as

$$ y_i \sim \beta_0 + \beta_1 \cdot I(\textrm{attitude}=\textrm{pol}) + \beta_2 I(\textrm{gender}=\textrm{male}) + b_{1,j[i]} + b_{2,k[i]} + \epsilon_i \\ b_1 \sim N(0,\sigma^2_1) \\ b_2 \sim N(0,\sigma^2_2) \\ \epsilon \sim N(0,\sigma^2_r) $$ where $\beta$ indicates a fixed-effect coefficient, $b$ indicates a random variable, $I$ is an indicator function (this is basically the same as what you said above, just slightly different notation).

~ attitude + gender + (1+attitude|subject) + (1+attitude|scenario)

adds among-subject variation in response to attitude and scenario (we could equivalently write the random-effects part as (attitude|subject) + (attitude|scenario), i.e. leaving the intercept implicit; this is a matter of taste). Now

$$ y_i \sim \beta_0 + \beta_1 \cdot I(\textrm{attitude}=\textrm{pol}) + \beta_2 I(\textrm{gender}=\textrm{male}) + \\ b_{1,j[i]} + b_{3,j[i]} I(\textrm{attitude}=\textrm{pol}) + b_{2,k[i]} + b_{4,k[i]} I(\textrm{attitude}=\textrm{pol}) + \epsilon_i \\ \{b_1,b_3\} \sim \textrm{MVN}({\mathbf 0},\Sigma_1) \\ \{b_2,b_4\} \sim \textrm{MVN}({\mathbf 0},\Sigma_2) \\ \epsilon \sim N(0,\sigma^2_r) $$ where $\Sigma_1$ and $\Sigma_2$ are unstructured variance-covariance matrices, i.e. they are symmetric and positive (semi)definite but have no other constraints: $$ \Sigma_1 = \left( \begin{array}{cc} \sigma^2_1 & \sigma_{13} \\ \sigma_{13} & \sigma^2_3 \end{array} \right) $$ and similarly for $\Sigma_2$.

It might be instructive to group terms as follows: $$ y_i \sim (\beta_0 + b_{1,j[i]} + b_{2,k[i]}) + \\ ( \beta_1 + b_{3,j[i]} + b_{4,k[i]}) \cdot I(\textrm{attitude}=\textrm{pol}) + \beta_2 I(\textrm{gender}=\textrm{male}) + \epsilon_i $$ so you can see which random effects are affecting the intercept and which are affecting the response to attitude.

Now if you leave out the fixed-effect attitude term (i.e. set $\beta_1=0$, or drop the attitude term from the formula) you can see (without rewriting everything) that, because the random effects are assumed to have zero mean, we will be assuming that the average response to attitude across subjects and scenarios will be exactly zero, while there is still variation among subjects and scenarios. I won't say this never makes sense from a statistical point of view, but it rarely does. There are discussions of this issue on the r-sig-mixed-models@r-project.org mailing list from time to time ... (or it may be discussed on StackExchange somewhere -- if not, it would make a good follow-up SE question ...)

Related Question