I'm trying to understand how the equations (respecting indices) and the variance-covariance matrices of some mixed models fitted in R with the `lme4` and `nlme` packages are written out.

The first model has a single random effect on the intercept, and this one doesn't seem difficult to me. The models are fitted, for example, as follows in `lme4` and `nlme` respectively:
```r
# Var6 is a qualitative variable.
# Var7 is a qualitative variable.
modA_lmer <- lmer(log(Var1) ~ log(Var2) + log(Var3) +
                    Var4 + Var5 + (1 | Var6), data)
modA_lme  <- lme(log(Var1) ~ log(Var2) + log(Var3) +
                   Var4 + Var5,
                 random = ~ 1 | Var6, data)

modB_lmer <- lmer(log(Var1) ~ log(Var2) + log(Var3) +
                    Var4 + Var5 + (1 | Var7), data)
modB_lme  <- lme(log(Var1) ~ log(Var2) + log(Var3) +
                   Var4 + Var5,
                 random = ~ 1 | Var7, data)

modC_lmer <- lmer(log(Var1) ~ log(Var2) + log(Var3) +
                    Var4 + Var5 + (1 | Var6) + (1 | Var7), data)
modC_lme  <- lme(log(Var1) ~ log(Var2) + log(Var3) +
                   Var4 + Var5,
                 random = list(Var8 = pdIdent(form = ~ 0 + as.factor(Var6)),
                               Var7 = ~ 1), data)
```
I'm not sure (could someone check this, mainly the indices?), but I believe the equations of the three models, respecting the indices, would be given by:
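A sketch in my own notation, with $i$ indexing observations, $j$ the levels of `Var6`, and $k$ the levels of `Var7` (the fixed part is identical in all three models, so it is abbreviated with $\dots$ in B and C):

$$
\begin{aligned}
\text{A:}\quad \log(\operatorname{Var1}_{ij}) &= \beta_0 + \beta_1\log(\operatorname{Var2}_{ij}) + \beta_2\log(\operatorname{Var3}_{ij}) + \beta_3\operatorname{Var4}_{ij} + \beta_4\operatorname{Var5}_{ij} + u_j + \epsilon_{ij}, \\
\text{B:}\quad \log(\operatorname{Var1}_{ik}) &= \beta_0 + \dots + v_k + \epsilon_{ik}, \\
\text{C:}\quad \log(\operatorname{Var1}_{ijk}) &= \beta_0 + \dots + u_j + v_k + \epsilon_{ijk},
\end{aligned}
$$

with $u_j \sim N(0, \sigma^2_u)$, $v_k \sim N(0, \sigma^2_v)$ and $\epsilon \sim N(0, \sigma^2)$.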
I believe that the variance-covariance matrices $D$ of the random effects would be expressed by:
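Something like the following (my notation, with $u$ for the `Var6` effect and $v$ for the `Var7` effect):

$$
D_A = \big(\sigma^2_u\big), \qquad D_B = \big(\sigma^2_v\big), \qquad D_C = \begin{pmatrix} \sigma^2_u & 0 \\ 0 & \sigma^2_v \end{pmatrix}
$$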
OK, assuming that everything described above is correct, now imagine mixed model B in a heteroscedastic case, with a different variance for each level of a dummy variable `Var8`. Fitted with the `lme4` and `nlme` packages, it would look like:
```r
# Var8 is a dummy variable.
modD_lmer <- lmer(log(Var1) ~ log(Var2) + log(Var3) +
                    Var4 + Var5 + (Var8 - 1 || Var6), data)
modD_lme  <- lme(log(Var1) ~ log(Var2) + log(Var3) +
                   Var4 + Var5,
                 random = list(Var6 = pdDiag(~ Var8)), data)
```
In this case, what would the equation of my model look like? I thought of the following situation:

Is this correct? Would the equation be exactly the same, with the only difference being the introduction of an index on the $\sigma^2$ of the error?

The other question is: in this case, what would the variance-covariance matrix look like? Would I have three sigmas on the diagonal of the matrix? I.e.:
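Something like the following sketch (my notation; per level $j$ of `Var6` there is an intercept deviation $u_{0j}$ and a `Var8` deviation $u_{1j}$, each with its own variance, plus the residual variance):

$$
D = \operatorname{Var}\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} = \begin{pmatrix} \sigma^2_{u_0} & 0 \\ 0 & \sigma^2_{u_1} \end{pmatrix}, \qquad \epsilon_{ij} \sim N(0, \sigma^2)
$$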
I found some information in:
Best Answer
There's a lot going on here.
Models A,B,C
Models A, B, and C are fine in both the R code and the formulas; however, I have some notes.
`Var6` and `Var7` are crossed random effects, i.e. their groupings are different/not nested (also see: Crossed vs nested random effects: how do they differ and how are they specified correctly in lme4?), which means that it is not sensible to speak about their joint covariance matrix, because they don't "realize" on the same objects.

`lme` does not inherently support crossed random effects, but you seem to have used the approach from https://stats.stackexchange.com/a/552180/341520 for `modC_lme`, which relies on `Var8` being a second intercept column, i.e. all 1s.

You may also find the `equatiomatic` package useful; it can create a LaTeX formula from most `lmer` objects. See the demonstration below.

$$ \begin{aligned} \operatorname{yC}_{i} &\sim N \left(\alpha_{j[i],k[i]} + \beta_{1}(\operatorname{\log(x1)}) + \beta_{2}(\operatorname{x2}), \sigma^2 \right) \\ \alpha_{j} &\sim N \left(\mu_{\alpha_{j}}, \sigma^2_{\alpha_{j}} \right) \text{, for var7 j = 1,} \dots \text{,J} \\ \alpha_{k} &\sim N \left(\mu_{\alpha_{k}}, \sigma^2_{\alpha_{k}} \right) \text{, for var6 k = 1,} \dots \text{,K} \end{aligned} $$
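A hedged sketch of how such a formula can be produced; the model and data here are simulated stand-ins, not the question's actual data:

```r
# Hedged sketch: printing an lmer model with crossed random intercepts
# as a LaTeX formula via equatiomatic. Data and names are simulated.
library(lme4)
library(equatiomatic)

set.seed(1)
sim <- expand.grid(var6 = factor(1:10), var7 = factor(1:8), rep = 1:3)
sim$y <- rnorm(nrow(sim))

m  <- lmer(y ~ 1 + (1 | var6) + (1 | var7), data = sim)
eq <- extract_eq(m)  # a character string of LaTeX code
cat(eq)
```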
Model D
Here you seem to have gotten confused reading https://rpubs.com/bbolker/factorvar, because your formula describes the simple case of varying variance in $\epsilon_{ijk}$, while your R code tries to solve the difficult case of varying variance in $u_i$. Both are correct; they just don't fit together.

Adjusting the formula to the hard case, I'd propose writing $u_{i,l}\sim \mathcal N(0, \sigma_{1, l})$, $\epsilon_{i,l,k}\sim \mathcal N(0, \sigma_3)$. If instead your formula is correct, we can use the `weights` argument in `lme`, just as Ben Bolker did in your source.

In `rnorm(n, sd = ifelse(var8 == "1", 3, 6))` we defined $\sigma_{3, 1} = 3$ and $\sigma_{3, 2} = 6$, which corresponds to the estimated residual standard deviation $\hat\sigma_{3, 2} = 5.716438$; the parameters under `Variance function` show that this is for group $l = 2$, while group $l = 1$ has roughly half that standard deviation. So it fits great!

`lme4` has no comparable functionality; see https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#spatial-and-temporal-correlation-models-heteroscedasticity-r-side-models. Ben Bolker's GLMM FAQ is frankly always worth reading when dealing with mixed models.

A covariance matrix for the random effects is still not sensible.
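As a sketch of the `weights = varIdent()` approach, in the spirit of the rpubs source; all data and names below are simulated illustrations, not the question's actual variables:

```r
# Hedged sketch of nlme's weights = varIdent() for a residual variance
# that differs by level of a dummy variable. Data are simulated.
library(nlme)

set.seed(101)
n  <- 600
dd <- data.frame(
  var6 = factor(rep(1:30, each = 20)),     # grouping factor
  var8 = factor(rep(0:1, length.out = n))  # dummy variable
)
# Residual sd 3 where var8 == "1" and 6 otherwise, echoing the answer's call
dd$y <- 1 + rnorm(30, sd = 2)[as.integer(dd$var6)] +
  rnorm(n, sd = ifelse(dd$var8 == "1", 3, 6))

fit <- lme(y ~ 1, random = ~ 1 | var6,
           weights = varIdent(form = ~ 1 | var8), data = dd)

# Relative residual sd of the non-reference var8 level (true ratio: 3/6 = 0.5)
ratio <- coef(fit$modelStruct$varStruct, unconstrained = FALSE)
print(ratio)
```

The fitted `Variance function` parameter recovers the ratio of the two residual standard deviations, while `fit$sigma` estimates the reference level's sd.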