Solved – Build confidence intervals for random effects in intercept only models

confidence-interval | lme4-nlme | mixed-model

I'm building a Poisson model for a rate between an outcome and an offset (to be precise, between the observed and predicted values of a phenomenon). I want this rate plus confidence intervals (CIs) for each observation group. This is the model:

mod <- glmer(observed ~ 1 + offset(log(preds)) + (1 | group),
             family = poisson(), data = DF)

What should I do to get the rate (not exponentiated) for each group?
I know about the dotplot() function to plot the conditional means and confidence intervals of the groups, but I can't understand how it works and I would like to have the raw estimates to do my own plots.

I tried extracting the conditional mean and standard error for each group using ranef() and then approximating a CI:

re  <- ranef(mod, condVar = TRUE)$group
est <- re[["(Intercept)"]]
# 'postVar' holds the conditional *variances*, so take the square root for a SE
se  <- sqrt(attr(re, "postVar")[1, 1, ])

upr <- est + 1.96 * se
lwr <- est - 1.96 * se
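For reference, recent versions of lme4 expose the same quantities in a tidier long format through the as.data.frame method for ranef objects (a sketch, assuming the model above has been fitted):

```r
# Conditional modes and conditional standard deviations in long format:
# columns grpvar, term, grp, condval, condsd
re_df <- as.data.frame(ranef(mod, condVar = TRUE))
re_df$lwr <- re_df$condval - 1.96 * re_df$condsd
re_df$upr <- re_df$condval + 1.96 * re_df$condsd
```

This is convenient for building your own plots in place of dotplot().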

A second approach I tried was empirical Bayes simulation via arm::sim():

library(arm)
library(dplyr)

s <- sim(mod, n.sims = 1000)

lapply(group_names, function(g_name) {
    q <- quantile(s@ranef$group[, g_name, ], c(0.025, 0.975))
    data.frame(group = g_name, lwr = q[1], upr = q[2],
               est = ranef(mod)$group[g_name, ])
}) %>% bind_rows()

The CIs from the first method are much shorter than those from the second, and both look different from those produced by dotplot().
Furthermore, it is not clear to me whether I should use the ranef() estimates as they are or combine them somehow with the fixef() estimates.
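On the last point: ranef() returns the deviations $b_i$ from the fixed intercept, so the group-specific log rate is $\beta + b_i$. One way to see this (a sketch, assuming the model above) is that coef() returns the per-group sums directly:

```r
# coef() combines fixef() and ranef(): for this intercept-only model the
# (Intercept) column is beta + b_i, i.e. each group's rate on the log scale
group_log_rate <- coef(mod)$group[["(Intercept)"]]

all.equal(group_log_rate,
          fixef(mod)[["(Intercept)"]] + ranef(mod)$group[["(Intercept)"]])
```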

I know how to solve this problem using full Bayesian estimation, e.g. with rstanarm, but I was wondering whether there is a faster approximate solution within the lme4 framework.

Best Answer

When you are interested in predictions conditional on the random effects, in my view it is easier to work with the hierarchical formulation of the mixed model, which has an intrinsically Bayesian flavor. In particular, in your specific case, you are interested in the mean of the Poisson model conditional on the random effects, i.e., $$\mu_i = \exp(\beta + b_i),$$ with $i$ denoting the group, $\beta$ the fixed effect intercept, and $b_i$ the random intercept. You can derive a confidence interval for $\mu_i$ by using the following simulation scheme:

  • Step I: Simulate a value $\theta^*$ from the approximate posterior distribution $\mathcal N(\hat\theta, \hat\Sigma)$, where $\hat\theta$ denotes the maximum likelihood estimates for $\beta$ and $\sigma_b$, with $\sigma_b$ denoting the standard deviation of the random intercepts term $b_i$, and $\hat\Sigma$ the covariance matrix of $\hat\theta$.
  • Step II: Simulate a value $b_i^*$ from the posterior distribution of the random effects $[b_i \mid y_i, \theta^*]$, where $y_i$ denotes the outcome data for group $i$ (note we condition on $\theta^*$ from the previous step).
  • Step III: Calculate $\mu_i^* = \exp(\beta^* + b_i^*)$.

Step I accounts for the sampling variability of the maximum likelihood estimates, and Step II for the variability in the random effects.

Repeating Steps I-III $L$ times, you obtain a Monte Carlo sample for $\mu_i$, from which you can obtain a 95% CI using the 2.5% and 97.5% percentiles.
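In lme4 terms, the scheme above can be sketched roughly as follows. This is only a rough sketch, not the exact procedure: Step I here draws only the fixed intercept (ignoring the uncertainty in $\sigma_b$), and Step II approximates the random-effects posterior by a normal draw around the conditional mode with the conditional variance:

```r
set.seed(1)
L <- 1000

re       <- ranef(mod, condVar = TRUE)$group
b_hat    <- re[["(Intercept)"]]                 # conditional modes
b_var    <- attr(re, "postVar")[1, 1, ]         # conditional variances
beta_hat <- fixef(mod)[["(Intercept)"]]
beta_se  <- sqrt(vcov(mod)[1, 1])

mu_sims <- replicate(L, {
  beta_star <- rnorm(1, beta_hat, beta_se)              # Step I (partial)
  b_star    <- rnorm(length(b_hat), b_hat, sqrt(b_var)) # Step II (normal approx.)
  exp(beta_star + b_star)                               # Step III
})

# one row per group: 2.5% and 97.5% percentiles of mu_i
ci <- t(apply(mu_sims, 1, quantile, probs = c(0.025, 0.975)))
```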

This procedure is implemented in the predict() method for mixed models fitted using the GLMMadaptive package. For an example, check the vignette Methods for MixMod Objects.
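For your model, the GLMMadaptive version would look roughly like this (a sketch based on the package's documentation; check the vignette for the exact arguments to predict()):

```r
library(GLMMadaptive)

# same model as the lme4 fit, with the offset in the fixed-effects formula
fm <- mixed_model(fixed = observed ~ 1 + offset(log(preds)),
                  random = ~ 1 | group, data = DF,
                  family = poisson())

# subject-specific predictions with simulation-based standard errors / CIs,
# following Steps I-III above
pr <- predict(fm, newdata = DF, type = "subject_specific", se.fit = TRUE)
```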
