I am trying to understand how to write mathematically a mixed effect model.
The model is (I want to check the growth rate of rats in a different countries across the variable day adding the interaction of the 2 factor level Country to the growth rate):
lmer(Volume ~ Country * Day + (1 | Rat.ID))
In R I extracted the equation by using the function extract_eq() for broom.mixed package, obtaining:
$$
\begin{aligned}
\operatorname{lVolume}_{i} &\sim N \left(\alpha_{j[i]} + \beta_{1}(\operatorname{Day}), \sigma^2 \right) \\
\alpha_{j} &\sim N \left(\gamma_{0}^{\alpha} + \gamma_{1}^{\alpha}(\operatorname{Country}_{\operatorname{England}}) + \gamma_{2}^{\alpha}(\operatorname{Day} \times \operatorname{Country}_{\operatorname{England}}), \sigma^2_{\alpha_{j}} \right)
\text{, for Rat.ID j = 1,} \dots \text{,J}
\end{aligned}
$$
But I don't think it is correct since it's a bit confusing the $i$ and $j$ and the interaction coefficient and the meaning of the $\gamma$ variable.
So my first attempt was I considering the effect cofficents and the interaction term :
$$ Y = \alpha + \beta_1*Day + \beta_2*Day*Country$$
Update
I tried to understand the model again, and arrived to this:
$$ y_{ij} = \beta_0 + \beta_{0CountryA} + 1I_{Country_i = CountryA}+
\beta_{1CountryA} \times Day_{ij} \times 1I_{Country_i = CountryA}+
\beta_{1CountryB} \times Day_{ij} \times 1I_{Country_i = CountryB}+
\eta_{i0} +e_{ij}$$
where every rat i and day j with observations y (Volume).
Is this correct?
But still, here I don't have a representation of the model in terms of expected values and variances.
Could I have some help/hints to understand this, please?
Thanks
Best Answer
The confusion in interpreting the coefficients reported by the
extract_eq()
function of theequatiomatic
package probably comes from the need of the package to handle a large number of model structures. The symbols reported by theextract_eq()
function can be seem unnecessarily complicated in this simple case, but the function needs to handle a wide variety of models.The main trick here is to recognize that only the rats, represented by $j$ in the
extract_eq()
output, have random effects in your case. Any term that omits $j$ is fundamentally part of the fixed-effect model, even ifextract_eq()
happens to represent it as part of a random effect. Any term independent of $j$ in the mean value presented byextract_eq()
as part of a random effect can be moved out of the random effect and into the fixed-effect part of the model.Let's start with your own "first attempt" model but write out all the terms and adjust the symbols of some coefficients to show how they correspond to the output from
extract_eq()
:$$ Y_i = (\gamma_0^{\alpha} +\alpha_{i}) + \beta_1 D_i + \gamma_1^{\alpha} C_i + \gamma_2^{\alpha} D_i C_i + \epsilon_i$$
where $i$ is an index for the individual observations , $\gamma_0^{\alpha}$ is an overall intercept, $\alpha_{i}$ is the extra intercept to be modeled as a random effect (depending on which rat $j$ is being observed in observation $i$, or $j[i]$), $D$ represents Day, $C$ is Country (0 for the reference country and 1 for the other choice, England), and $\epsilon$ is random error. The $\epsilon$ values are assumed to be distributed normally with the variance $\sigma^2$ reported in the first term by
extract_eq()
.For now, ignore the superscript $\alpha$ on some of the coefficients. That's just for identification of where the coefficients come from (explained below).
In your mixed-effects model, the random effects will all be incorporated into $\alpha_i$. The value for observation $i$ depends on the rat $j$ being observed;
extract_eq()
represents that as $\alpha_{j[i]}$.The mean value in the first term reported by
extract_eq()
$$N \left(\alpha_{j[i]} + \beta_{1}(\operatorname{Day}), \sigma^2 \right) $$
includes a term, $\beta_{1}(\operatorname{Day})$, that is independent of the (rat) random effect indexed by $j$ and can thus be removed into the fixed-effect part of the model, as I did above to give the $\beta_1 D_i$ term. Furthermore, in your case, $\sigma^2$ is independent of $j$ and thus is part of the model for the fixed effects.
Although the coefficients in the output from
extract_eq()
look complicated, that's because they include information about which higher-level coefficients are being modeled (allowing the function to handle more complicated models). For example, the $\alpha$ superscript in the $\gamma_1^{\alpha}$ coefficient forCountry
just means that this $\gamma$ coefficient is part of the model for the overall model intercept $\alpha$; similar interpretation holds for the superscript $\alpha$ in $\gamma_0^{\alpha}$ and $\gamma_2^{\alpha}$.In your particular case, the $\gamma_0^{\alpha}$, $\gamma_1^{\alpha}$ and $\gamma_2^{\alpha}$ coefficients are constants independent of rat $j$, so you can remove their associated terms from the mean value $$\gamma_{0}^{\alpha} + \gamma_{1}^{\alpha}(\operatorname{Country}_{\operatorname{England}}) + \gamma_{2}^{\alpha}(\operatorname{Day} \times \operatorname{Country}_{\operatorname{England}})$$ reported by
extract_eq()
for $\alpha_{j}$ and put them into the fixed-effect part of the model. I did that above to provide the $\gamma_0^{\alpha}$, $\gamma_1^{\alpha} C_i$ and $\gamma_2 D_iC_i$ terms in my proposed model.That leaves only the following for modeling the random effect $\alpha_{j}$:
$$\alpha_j \sim N(0,\sigma_{\alpha_j}^2).$$
That's just a normal distribution with mean 0 and variance $\sigma_{\alpha_j}^2$. So in the model for $Y_i$ that I proposed initially, the only random effect is $\alpha_{i}$. For the rat $j$ evaluated in observation $i$ ($j[i]$), that has the specific value $\alpha_{j[i]}$.
So in the model I initially wrote, the fixed effects are: $\gamma_0^{\alpha}$ for the expected value when $D=0$ (
Day
of 0) and $C=0$ (the referenceCountry
); $\beta_1$ is the extra change perDay
when $C=0$; $\gamma_1^{\alpha}$ is the extra contribution ofCountry
England over the referenceCountry
atDay= 0
; $\gamma_2^{\alpha}$ is the extra contribution of $D$ and $C$ whenDay
isn't 0 andCountry
is England. The random intercept effects $\alpha_j$ specific to individual rats $j$ have a variance $\sigma_{\alpha_j}^2$. Those random effects are thus for hypothetical differences about the overall intercept $\gamma_0^{\alpha}$ among rats atDay
0 in the referenceCountry
. Finally, there is additional variance $\sigma^2$ about all of those individual contributions to the model.