Solved – R glmer.nb output. How to get $\hat{\theta}$

generalized linear modelmixed modelnegative-binomial-distributionrregression

I would like to obtain estimated $\theta$ from glmer.nb function in lme4 package. In my understanding this function fits the model:
$$
Y_{ij}|\boldsymbol{B}_{i}=\boldsymbol{b}_i \overset{ind.}{\sim} NB\Big(mean=\mu,var=\mu + \frac{\mu^2}{\theta}\Big)
$$
where $NB$ refers to the negative binomial distribution and:
$$
\mu = \exp(\boldsymbol{X}_{ij}^T \boldsymbol{\beta} + \boldsymbol{Z}_{ij}^T \boldsymbol{b}_i)
$$
and
$$
\boldsymbol{B}_i \overset{i.i.d.}{\sim} MVN(mean=\boldsymbol{0},var=\Sigma).
$$
So glmer.nb must be estimating unknown parameters $\boldsymbol{\beta}$, $\Sigma$ and $\theta$ via maximizing its likelihood. The help file of glmer.nb little explains its functionality, and it says "glmer() for Negative Binomial". However, the negative binomial is NOT an exponential family when the parameter $\theta$ is unknown. So $\theta$ must be estimated in some other ways that generalized linear mixed effect models (GLMM) do not take, and the estimated $\theta$ must be obtained in some special ways that GLMM do not take. How can I access to the estimate of $\theta$ which should be one of glmer.nb output?

Best Answer

Looking at the source code, this isn't something that is trivially exposed. Essentially, glmer.nb is (from github sources):

glmer.nb <- function(..., interval = log(th)+c(-3,3), verbose=FALSE)
{
  g0 <- glmer(..., family=poisson)
  th <- est_theta(g0)
  if(verbose) cat("th := est_theta(glmer(..)) =", format(th),"\n")
  g1 <- update(g0, family = negative.binomial(theta=th))
  ## if (is.null(interval)) interval <- log(th)+c(-3,3)
  optTheta(g1, interval=interval, verbose=verbose)
}

Which essentially is:

  1. fit a standard Poisson GLMM
  2. estimate $\theta$ using maximum likelihood estimation given this standard Poisson fit. This uses MASS::theta.ml()
  3. refit the model using this value of $\theta$ and the negative.binomial() family function which needs a fixed $\theta$,
  4. last line does an optimization to find $\theta$ given the updated fit, and that function throws an error if the original estimate from 2. and the optimised one from 4. don't match.

    the last line performs 1-d optimisation on the interval log(theta) + c(-3, 3) via optimize() to find the value of $\theta$ that minimises the deviance of the negative binomial GLMM.

That said, you might try:

lme4:::getNBdisp(fit)

where fit is the model returned by glmer.nb. This is just a guess based on reading the code but it is used by the authors in tests etc.

Related Question