I would like to estimate the conditional mean of counts assuming a negative binomial distribution.
I can estimate the unconditional mean using MASS's glm.nb function:
fit <- MASS::glm.nb(Days ~ 1, data = quine)
mu <- exp(coef(fit)[1])
size <- fit$theta
And I can verify the parameter estimates, again using the MASS package:
MASS::fitdistr(quine$Days, 'negative binomial')
For the unconditional probability of a specific count, I can use mu and size in dnbinom
dnbinom(5, size=size, mu=mu) # ~ .04
But when I model the conditional mean using something like:
fit <- glm.nb(Days ~ Sex + Age, data = quine)
mu <- exp(coef(fit)[1])
size <- fit$theta
I'm not sure how to get the conditional parameter estimates. I know that exp(predict(fit)) will give me the conditional mean (mu) for each i, but is there a way to get the conditional theta from the model? And what does the theta in the fit mean? It is not equal to the unconditional fit's theta from the earlier unconditional estimate.
Best Answer
The
glm.nb()
function inMASS
uses the following parameterization: $$ \begin{eqnarray*} y_i & \sim & \mathrm{NB}(\mu_i, \theta) \quad (i = 1, \dots, n)\\ \log(\mu_i) & = & x_i^\top \beta \end{eqnarray*} $$ Thus, only the expectation $\mu_i$ depends on the regressors. The size parameter $\theta$ is constant for all $n$ observations.To evaluate the predicted distribution you can still use
dnbinom(5, size = size, mu = mu)
wheresize
is just a scalar value andmu
a vector.