I am reading a bit about mixed models but I am unsure over the terms used and quite how they go together.
Pinheiro, on pg 62 of his book 'Mixed-effects models in S and S-Plus', describes the likelihood function.
The first term of the second equation is described as the conditional density of $\mathbf{y_i}$,
and the second the marginal density of $\mathbf{b_i}$.
I have been trying to generate these log-likelihoods (ll) for simple random effect models, as I thought it would help my understanding, but I must be misinterpreting the derivations.
An example of where I have tried to calculate the ll's.
example model
library(lme4)
model <- lmer(angle ~ temp + (1|replicate), data=cake, REML=FALSE)
conditional log-likelihood
I try to calculate the conditional log-likelihood for this model: it seems from $p(y_i \mid b_i, \beta, \sigma^2)$
that I should be able to calculate this by finding the density of the data at the predictions at the unit/replicate level.
sum(dnorm(cake$angle,
predict(model), # predictions at replicate unit (XB + Zb) level
sd=sigma(model),
log=TRUE))
#[1] -801.6044
# Which seems to agree with
cAIC4::cAIC(model)$loglikelihood
# [1] -801.6044
# or should we really be using a multivariate normal density
# but doesn't make a difference as variance is \sigma^2 I_n
dmvnorm(cake$angle, predict(model), diag(sigma(model)^2,270, 270), log=TRUE)
#[1] -801.6044
marginal log-likelihood
I try to calculate the marginal log-likelihood, which lme4
gives as
logLik(model)
#'log Lik.' -834.1132 (df=4)
Taking a similar approach as before it seems from $p(y_i \mid \beta, \theta, \sigma^2)$
that I should be able to calculate this by finding the density of the data at the predictions at the population level, but it is not close.
sum(dnorm(cake$angle,
predict(model, re.form=NA), # predictions at population (XB) level
sd=sigma(model),
log=TRUE))
# [1] -1019.202
So perhaps I need to use the second equation and need to use the conditional model for y and the marginal for b, but still not close.
sum(
dnorm(cake$angle, predict(model), sd=sigma(model), log=TRUE) , # conditional
dnorm(0, ranef(model)$replicate[[1]], # RE predictions
sd=sqrt(VarCorr(model)$replicate), log=TRUE)
)
# [1] -849.6086
Edit: Next go…
For the linear mixed model $Y = X\beta + Zb + \epsilon$, $b_i \sim N(0, \psi)$ and $e_i \sim N(0, \Sigma)$ I thought perhaps the variance for the likelihood calculation should be estimated as var(Y) = $Z\psi Z^T + \Sigma$, but wrong again!
So in code
z = getME(model, "Z")
zt = getME(model, "Zt")
psi = bdiag(replicate(15, VarCorr(model)$replicate, simplify=FALSE))
betw = z%*%psi%*%zt
err = Diagonal(270, sigma(model)^2)
v = betw + err
sum(dnorm(cake$angle,
predict(model, re.form=NA),
sd=sqrt(diag(v)),
log=TRUE))
# [1] -935.652
My question:
- Can you tell me where I went wrong in calculating the marginal likelihood. I am not really needing code to reproduce the ll but more a description of why the approach I tried does not work.
Many thanks.
PS. I did look at the function that generates these values, lme4:::logLik.merMod
and lme4:::devCrit
and see the authors use some difficult / technical approach, which again leads to me needing help why my approach is wrong.
Best Answer
I can reproduce the marginal log-likelihood returned by
lme4:::logLik.merMod
by realising that the marginal distribution of Y is multivariate normal (MVN) (as the marginal distribution of b and the conditional of Y are MVN).So this code will reproduce
where
cake$angle
are the observations,predict(model, re.form=NA)
are the population predictions (calculated using the fixed effect coefficients), andv
is the variance of the marginal model (as shown in question).A couple of comments on the failing efforts in my question.
When calculating the conditional log-likelihood I used the univariate normal density function, whereas I could/should of used the multivariate. In this case it doesn't make a difference as within unit variance is $\sigma^2 I_n$
Trying to calculate the marginal distribution
First try used the predictions at population (XB) level but the incorrect variance (it ignored the between unit error)
Second try was just nonsense I think
Third try uses the univariate rather than multivariate normal distribution.