Solved – Theta estimation in negative binomial regression

count-data, negative-binomial-distribution, regression

I would like to estimate the conditional mean of counts assuming a negative binomial distribution.

I can estimate the unconditional mean using MASS's glm.nb function:

library(MASS)  # attaches the quine data used below
fit <- MASS::glm.nb(Days ~ 1, data = quine)
mu <- exp(coef(fit)[1])
size <- fit$theta

And I can verify the parameter estimates, again using the MASS package:

MASS::fitdistr(quine$Days, 'negative binomial')

For the unconditional probability of a specific count, I can use mu and size in dnbinom:

dnbinom(5, size=size, mu=mu)  # ~ .04

But when I model the conditional mean using something like:

fit <- glm.nb(Days ~ Sex + Age, data = quine)
mu <- exp(coef(fit)[1])
size <- fit$theta

I'm not sure how to get the conditional parameter estimates. I know that exp(predict(fit)) gives the conditional mean (mu) for each observation i, but is there a way to get the conditional theta from the model? And what does the theta in this fit mean? It is not equal to the theta from the earlier unconditional fit.

Best Answer

The glm.nb() function in MASS uses the following parameterization:

$$ \begin{aligned} y_i & \sim \mathrm{NB}(\mu_i, \theta) \quad (i = 1, \dots, n)\\ \log(\mu_i) & = x_i^\top \beta \end{aligned} $$

Thus, only the expectation $\mu_i$ depends on the regressors. The size parameter $\theta$ is constant across all $n$ observations.
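For reference, this is the same mean/size parameterization that dnbinom() uses when it is called with mu and size, which is why a single $\theta$ can be passed as size. Written out for a non-negative integer $y$:

$$ P(Y_i = y) = \frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!} \left(\frac{\theta}{\theta + \mu_i}\right)^{\theta} \left(\frac{\mu_i}{\theta + \mu_i}\right)^{y}, \qquad \mathrm{E}(Y_i) = \mu_i, \qquad \mathrm{Var}(Y_i) = \mu_i + \frac{\mu_i^2}{\theta} $$

So $\theta$ describes the overdispersion that remains around the conditional mean. Its estimate typically changes when regressors are added, because part of the variation in the counts is then attributed to differences in $\mu_i$.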

To evaluate the predicted distribution you can still use dnbinom(5, size = size, mu = mu), where size is a single scalar (fit$theta) and mu is the vector of fitted means (for example, fitted(fit)).
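A minimal sketch of that evaluation, assuming the Days ~ Sex + Age model from the question. The object names (mu_i, theta, p5, new_obs) and the covariate profile in new_obs are illustrative choices, not part of the original answer.

```r
library(MASS)  # provides glm.nb() and the quine data

fit <- glm.nb(Days ~ Sex + Age, data = quine)

mu_i  <- fitted(fit)  # conditional means, one per observation (equals exp(predict(fit)))
theta <- fit$theta    # single size parameter shared by all observations

# P(Y_i = 5 | x_i) for every observation in the data
p5 <- dnbinom(5, size = theta, mu = mu_i)

# The same probability for a hypothetical new covariate profile
new_obs <- data.frame(
  Sex = factor("F",  levels = levels(quine$Sex)),
  Age = factor("F2", levels = levels(quine$Age))
)
mu_new <- predict(fit, newdata = new_obs, type = "response")
p5_new <- dnbinom(5, size = theta, mu = mu_new)
```

Here p5 has one entry per row of quine, while theta stays a single number throughout.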

Related Solutions

Solved – the relationship between theta and size in negative binomial distribution

The $theta from a fitted glm.nb() corresponds to the size in dnbinom(). As a simple example, let's replicate the fitted log-likelihood from scratch. Using the quine data from MASS:

library("MASS")
m <- glm.nb(Days ~ ., data = quine)
logLik(m)
## 'log Lik.' -546.5755 (df=8)

And this value of the log-likelihood can be obtained by summing the dnbinom(..., log = TRUE) values:

sum(dnbinom(quine$Days, mu = fitted(m), size = m$theta, log = TRUE))
## [1] -546.5755
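The same sum can also be written out term by term, which makes the role of theta as the size parameter explicit. This is only a sketch: the names y, mu, th, ll_manual and ll_dnbinom are illustrative, and the formula spelled out with lgamma() is the standard negative binomial log-density in the mean/size parameterization.

```r
library(MASS)

m  <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)
y  <- quine$Days
mu <- fitted(m)
th <- m$theta

# Log-density written out by hand, with theta in the role of "size"
ll_manual <- sum(
  lgamma(y + th) - lgamma(th) - lgamma(y + 1) +
    th * log(th / (th + mu)) + y * log(mu / (th + mu))
)

# Same quantity via dnbinom()
ll_dnbinom <- sum(dnbinom(y, mu = mu, size = th, log = TRUE))

all.equal(ll_manual, ll_dnbinom)
```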

Doubling the weight of all observations leaves all parameter estimates (including theta) unchanged:

w <- rep(2, nrow(quine))  # kept outside the data frame so that Days ~ . does not pick it up as a regressor
m2 <- glm.nb(Days ~ ., data = quine, weights = w)
m$theta
## [1] 1.274893
m2$theta
## [1] 1.274893
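A hedged way to check this numerically is to compare both fits with all.equal() instead of reading printed digits. The sketch below assumes a weight vector w stored outside the data frame and spells the regressors out explicitly; the names w, m1 and m2 are illustrative. Since the weights scale the log-likelihood, the reported standard error of theta would be expected to differ between the fits even though the point estimates agree.

```r
library(MASS)

w  <- rep(2, nrow(quine))  # constant weight of 2 for every observation
m1 <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)
m2 <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine, weights = w)

# Point estimates should agree up to the convergence tolerance of the fits
all.equal(coef(m1), coef(m2), tolerance = 1e-4)
all.equal(m1$theta, m2$theta, tolerance = 1e-4)

# Standard errors of theta, shown side by side for comparison
c(unweighted = m1$SE.theta, weighted = m2$SE.theta)
```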

Solved – Negative binomial regression in R allowing for correlation between dispersion & regression coefficients

I haven't found an R package that does this, but I have written code that takes the maximum likelihood estimates from a model fitted with glm.nb and calculates the full variance-covariance matrix from the observed information matrix.

The results appear to match values from SAS. If anyone spots an error, or finds that the output does not match the variance-covariance matrix from SAS or Stata, please add a comment to this answer.

glm.nb.cov <- function(mod) {
  #given a model fitted by glm.nb in MASS, this function returns a variance covariance matrix for the
  #regression coefficients and dispersion parameter, without assuming independence between these
  #note that the model must have been fitted with x=TRUE argument so that design matrix is available

  #formulae based on p23-p24 of http://pointer.esalq.usp.br/departamentos/lce/arquivos/aulas/2011/LCE5868/OverdispersionBook.pdf
  #and http://www.math.mcgill.ca/~dstephens/523/Papers/Lawless-1987-CJS.pdf

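  #fail early if the design matrix was not stored in the fitted object
  if (is.null(mod$x)) stop("refit the model with x = TRUE")
  #p is number of regression coefficients
  p <- dim(vcov(mod))[1]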

  #construct observed information matrix
  obsInfo <- array(0, dim=c(p+1, p+1))

  #first calculate top left part for regression coefficients
  for (i in 1:p) {
    for (j in 1:p) {
      obsInfo[i,j] <- sum( (1+mod$y/mod$theta)*mod$fitted.values*mod$x[,i]*mod$x[,j] / (1+mod$fitted.values/mod$theta)^2  )
    }
  }

  #information for dispersion parameter
  obsInfo[(p+1),(p+1)] <- -sum(trigamma(mod$theta+mod$y) - trigamma(mod$theta) -
                                 1/(mod$fitted.values+mod$theta) + (mod$theta+mod$y)/(mod$theta+mod$fitted.values)^2 - 
                                 1/(mod$fitted.values+mod$theta) + 1/mod$theta)

  #covariance between regression coefficients and dispersion
  for (i in 1:p) {
    obsInfo[(p+1),i] <- -sum(((mod$y-mod$fitted.values) * mod$fitted.values / ( (mod$theta+mod$fitted.values)^2 )) * mod$x[,i] )
    obsInfo[i,(p+1)] <- obsInfo[(p+1),i]
  }

  #return variance covariance matrix
  solve(obsInfo)
}
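For illustration, the same three blocks of the observed information can be assembled with matrix operations instead of loops. This is a sketch, not the answer's own code: the function name nb_full_vcov is hypothetical, the "theta" row and column label is an added convenience, and, like the loop version, it assumes a log link, a model fitted with x = TRUE, and no prior weights.

```r
library(MASS)

# Hypothetical vectorized restatement of the formulas used in glm.nb.cov()
nb_full_vcov <- function(mod) {
  if (is.null(mod$x)) stop("refit the model with x = TRUE")
  X  <- mod$x
  y  <- mod$y
  mu <- mod$fitted.values
  th <- mod$theta

  # Block for the regression coefficients: X' diag(w_bb) X
  w_bb <- mu * (1 + y / th) / (1 + mu / th)^2
  I_bb <- crossprod(X, w_bb * X)

  # Cross terms between regression coefficients and theta
  I_bt <- -crossprod(X, (y - mu) * mu / (th + mu)^2)

  # Term for theta itself
  I_tt <- -sum(trigamma(th + y) - trigamma(th) + 1 / th -
                 2 / (mu + th) + (th + y) / (th + mu)^2)

  # Assemble the (p + 1) x (p + 1) observed information and invert it
  info <- rbind(cbind(I_bb, I_bt), c(I_bt, I_tt))
  V <- solve(info)
  dimnames(V) <- list(c(colnames(X), "theta"), c(colnames(X), "theta"))
  V
}

m <- glm.nb(Days ~ Sex + Age, data = quine, x = TRUE)
V <- nb_full_vcov(m)
sqrt(diag(V))  # standard errors for the coefficients and for theta
```

The last row and column of V hold the covariances between theta and the regression coefficients, which vcov(m) alone does not report.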
