I'm going to answer this using a Poisson model, which is precisely a negative binomial model without overdispersion, because the math is simpler. The Poisson model gives the probability of observing a particular non-negative integer count $y_i$ as
$$P(y_i|X) = \dfrac{\exp(-\lambda_i)\lambda_i ^{y_i}}{y_i!}$$
The conditional mean of this distribution is $\lambda_i$:
$$E[y_i|x_i] = \lambda_i = \exp(x_i\beta)$$
$$\log \lambda_i = x_i\beta$$
The conditional variance of the Poisson model is also $\lambda_i$, while the variance of the negative binomial model is $\lambda_i + \alpha \lambda_i^2$. This is the only practical difference between the two models for the purposes of this answer.
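As a quick sanity check, here is a minimal R sketch comparing the two variance functions by simulation (the parameter values are arbitrary; R's `rnbinom()` uses `size` $= 1/\alpha$):

```r
## Poisson: Var = lambda. Negative binomial (NB2): Var = lambda + alpha*lambda^2.
set.seed(1)
lambda <- 4; alpha <- 0.5
var(rpois(1e6, lambda))                         # ~ lambda = 4
var(rnbinom(1e6, mu = lambda, size = 1/alpha))  # ~ lambda + alpha*lambda^2 = 12
```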
This is effectively a log-linear model, so the marginal effect of $x$ on the expected count is
$$\dfrac{\partial E[y|x]}{\partial x} = \dfrac{\partial\lambda_i}{\partial x} = \beta\exp(x_i\beta) = \beta\lambda_i,$$
and a one-unit increase in $x$ multiplies the expected count by $\exp(\beta)$. So if you have a negative $\beta$ on a dummy variable $x$, you can say that, on average, switching $x$ on changes the expected count by $(\exp(\beta)-1)\cdot 100$ percent, which is approximately $\beta\cdot 100$ percent when $|\beta|$ is small.
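To make this concrete, here is a small R sketch; the data (`quine` from `MASS`) and the model are purely illustrative choices:

```r
library(MASS)

## Illustrative negative binomial fit; Sex is a dummy (F/M) in quine.
fit <- glm.nb(Days ~ Sex + Age, data = quine)
b <- coef(fit)["SexM"]

exp(b)               # multiplicative change in E[y] when the dummy switches on
(exp(b) - 1) * 100   # exact percent change; roughly 100*b when |b| is small
```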
The negative binomial distribution parametrized by mean and size is given by
$$ \DeclareMathOperator{\P}{\mathbb{P}}
\P (X=k) = \binom{k+m-1}{k}\left( \frac{m}{m+\mu} \right)^m \left( \frac{\mu}{m+\mu} \right)^k
$$
for the outcome $k$ a nonnegative integer, with $\mu>0$ the mean and $m>0$ the size. I will do the calculations in Maple.
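As a sanity check, this pmf agrees with R's `dnbinom()` when `size = m` and `mu` is the mean (a sketch with arbitrary parameter values):

```r
## The pmf above, term by term, against dnbinom(k, size = m, mu = mu).
mu <- 3; m <- 1.5; k <- 0:10
pmf <- choose(k + m - 1, k) * (m / (m + mu))^m * (mu / (m + mu))^k
all.equal(pmf, dnbinom(k, size = m, mu = mu))  # TRUE
```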
The Fisher information matrix (of size $2\times 2$) has components
$I_{\mu\mu}, I_{\mu m} \text{ and } I_{m m}$ given by
$$ \DeclareMathOperator{\E}{\mathbb{E}}
I_{ij}=-\E\left\{ \frac{\partial^2}{\partial \theta_i \partial\theta_j}\log f(X;\theta)|\theta \right\}
$$ where $\theta=(\mu, m)$. Then we find (Maple code at the end of the post)
$$
I_{\mu\mu}=\frac{m}{(m+\mu)\mu}
$$
The off-diagonal term is simplest: it reduces to zero! That is the beauty of the mean parametrization
$$
I_{\mu m}=0
$$ showing that $\mu$ and $m$ are orthogonal parameters.
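Both entries can be spot-checked numerically, e.g. in R with finite differences (a sketch; the parameter values, step size, and truncation point are arbitrary choices):

```r
## Spot-check I_mumu = m/((m+mu)*mu) and I_mum = 0 by finite differences.
mu <- 3; m <- 1.5; h <- 1e-3
k <- 0:10000                            # truncated support, tail mass negligible
p <- dnbinom(k, size = m, mu = mu)      # P(X = k)
ll <- function(mu, m) dnbinom(k, size = m, mu = mu, log = TRUE)

## I_mumu = -E[ d^2 log f / d mu^2 ]
-sum(p * (ll(mu + h, m) - 2 * ll(mu, m) + ll(mu - h, m)) / h^2)
m / ((m + mu) * mu)                     # should agree

## I_mum = -E[ d^2 log f / (d mu d m) ]: approximately 0
-sum(p * (ll(mu + h, m + h) - ll(mu + h, m - h) -
          ll(mu - h, m + h) + ll(mu - h, m - h)) / (4 * h^2))
```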
For the last term, the result involves the trigamma function, written $\Psi(1,\cdot)$ (the second derivative of the log of the gamma function), and is a somewhat complex infinite series which must be evaluated numerically:
$$
I_{mm}=\sum_{k=0}^\infty \binom{k+m-1}{k}\left\{ -m^{m-1}\mu^k (m+\mu)^{-m-2-k} \left( m(m+\mu)^2 \Psi(1,k+m) -m(m+\mu)^2 \Psi(1,m) +mk+\mu^2 \right)
\right\}
$$
A concise form can be derived either by simplifying the expression that Maple gives above:
\begin{align*}
I_{mm} =& -\sum_{k=0}^\infty\binom{k+m-1}{k}\left(\frac{m}{m+\mu}\right)^m \left(\frac{\mu}{m+\mu}\right)^k\{\frac{1}{(m+\mu)^2m}\left(m(m+\mu)^2\Psi(1,k+m)-m(m+\mu)^2\Psi(1,m)+m k+\mu^2\right)\}\\
=& -\mathbb{E}\left(\frac{1}{(m+\mu)^2m}\left(m(m+\mu)^2\Psi(1,X+m)-m(m+\mu)^2\Psi(1,m)+m X+\mu^2\right)\right)\\
=& -\mathbb{E}\left(\frac{1}{(m+\mu)^2m}\{m(m+\mu)^2(\Psi(1,X+m) - \Psi(1,m))+m X +\mu^2\}\right)\\
=& -\mathbb{E}\left(\Psi(1,X+m) - \Psi(1,m)\right) - \frac{\mu}{m(m+\mu)}
\end{align*}
where $X$ follows a negative binomial distribution with mean $\mu$ and size $m$.
Or directly from the definition of Fisher information:
\begin{align*}
I_{mm} =& - \mathbb{E}\frac{\partial^2}{\partial m^2}\ln \mathbb{P}(X;\mu,m) \\
=& - \mathbb{E}\frac{\partial}{\partial m} \{\Psi(X+ m) - \Psi( m) + \ln\frac{ m}{ m+\mu} + \frac{\mu -X }{ m+ \mu}\}\\
=& - \mathbb{E} \{\frac{\partial}{\partial m}(\Psi(X+ m) - \Psi( m)) + \frac{1}{ m}-\frac{1}{ m+\mu}-\frac{\mu - X}{( m+\mu)^2}\}\\
=& -\mathbb{E}\frac{\partial}{\partial m}\left(\Psi(X+ m) - \Psi( m)\right) -\frac{\mu}{ m( m+\mu)} \\
=& -\mathbb{E}\left(\Psi(1,X+ m) - \Psi(1, m)\right) -\frac{\mu}{ m( m+\mu)}
\end{align*}
where $\Psi(\cdot)$ is the digamma function (the first derivative of the log of the gamma function).
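The final expression is straightforward to evaluate numerically, e.g. in R, where $\Psi(1,\cdot)$ is `trigamma()` (a sketch; parameter values are arbitrary and the series is truncated):

```r
## I_mm = -E[Psi(1, X+m) - Psi(1, m)] - mu/(m*(m+mu)), checked against a
## finite-difference version of -E[d^2 log f / dm^2].
mu <- 3; m <- 1.5
k <- 0:10000
p <- dnbinom(k, size = m, mu = mu)

I_mm <- -sum(p * (trigamma(k + m) - trigamma(m))) - mu / (m * (m + mu))

h <- 1e-3
ll <- function(m) dnbinom(k, size = m, mu = mu, log = TRUE)
I_fd <- -sum(p * (ll(m + h) - 2 * ll(m) + ll(m - h)) / h^2)

c(series = I_mm, finite_diff = I_fd)   # the two should agree closely
```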
Below is some Maple code (with its output linearized as comments):
```
f := binomial(k+m-1,k)*(m/(m+mu))^m * (mu/(m+mu))^k;
#   f := binomial(k+m-1, k) * (m/(m+mu))^m * (mu/(m+mu))^k

lf := ln( binomial(k+m-1,k) ) + m*ln( m/(m+mu) ) + k*ln( mu/(m+mu) ) assuming m>0, mu>0;
#   lf := ln(binomial(k+m-1, k)) + m*ln(m/(m+mu)) + k*ln(mu/(m+mu))

simplify( -sum(f*diff(lf,mu,mu), k=0..infinity) ) assuming m>0, mu>0;
#   m/((m+mu)*mu)

simplify( -sum(f*diff(lf,mu,m), k=0..infinity) ) assuming m>0, mu>0;
#   0

simplify( -sum(f*diff(lf,m,m), k=0..infinity) ) assuming m>0, mu>0;
#   -sum( m^(m-1) * (m+mu)^(-m-2-k) * mu^k * binomial(k+m-1, k)
#         * ( m*(m+mu)^2*Psi(1,k+m) - m*(m+mu)^2*Psi(1,m) + m*k + mu^2 ),
#         k = 0..infinity )
```
Best Answer

The `$theta` from a fitted `glm.nb()` corresponds to the `size` in `dnbinom()`. As a simple example, let's replicate the fitted log-likelihood from scratch, using the `quine` data from `MASS`:
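A minimal sketch (the original code is not shown, so the model formula here is an illustrative assumption):

```r
library("MASS")

## Illustrative negative binomial fit to the quine data.
fit <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine)
fit$theta    # estimated theta
logLik(fit)  # fitted log-likelihood
```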
And this value of the log-likelihood can be obtained by summing the `dnbinom(..., log = TRUE)` values:
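A sketch along those lines, with `theta` supplied as the `size` and the fitted means as `mu`:

```r
## theta plays the role of size; mu is the fitted mean of each observation.
sum(dnbinom(quine$Days, size = fit$theta, mu = fitted(fit), log = TRUE))
## should match logLik(fit) above
```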
Doubling the weight of all observations leaves all parameter estimates (including `theta`) unchanged:
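A sketch, reusing the illustrative formula from above:

```r
## Same fit with every observation's prior weight doubled.
fit2 <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine,
               weights = rep(2, nrow(quine)))
all.equal(coef(fit), coef(fit2))    # coefficients unchanged
all.equal(fit$theta, fit2$theta)    # theta unchanged too
```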