Probability – How to Estimate Predicted Probability from a Negative Binomial Regression Equation

count-data, gamma-distribution, negative-binomial-distribution, probability

I'm trying to estimate the predicted probability that an observation takes a particular integer value, $y$, after fitting a negative binomial regression model. Long's "Regression Models for Categorical and Limited Dependent Variables" gives this predicted probability as (p. 237):

$$
\hat{\text{Pr}}(y \mid x) = \frac{ \Gamma(y + \hat{a}^{-1}) }{ y!\Gamma(\hat{a}^{-1}) } \left( \frac{\hat{a}^{-1}}{\hat{a}^{-1}+\hat{\mu}}\right)^{\hat{a}^{-1}} \left( \frac{\hat{\mu}}{\hat{a}^{-1}+\hat{\mu}} \right)^y
$$

where $\hat{\mu}$ is the predicted mean of the variable, $\hat{a}$ is the dispersion estimate, and $\Gamma$ is the Gamma function. My problem is that the statistical software I use takes both a shape and a scale parameter for the $\Gamma$ distribution, so I am confused about how to actually compute the predicted probability for any particular integer $y$.

In the above equation, what does Long expect me to supply as the shape and the scale for the $\Gamma$ function?

Best Answer

I'm going to take a guess at where the problem you're having has arisen, and you can correct me if I have it wrong.

You define a fitted negative binomial and say:

"and $\Gamma$ is the Gamma function"

Right, that all makes sense. Then comes:

"the statistical software I use takes both a shape and a scale parameter for the $\Gamma$ distribution"

And this is where I think your confusion arises. The Gamma* distribution and the Gamma function are different things (though there's a connection between them!)

* Before anyone jumps on my bad English, I'm using "Gamma" function for $\Gamma(.)$ to distinguish it from the "gamma" function, $\gamma(.)$, and by extension, the same convention for the distribution, which is also often denoted $\Gamma$. (This seems to be fairly common usage. Even if it doesn't strictly follow English rules, I think it's necessary to reduce potential confusion, especially in this question, where there's already confusion over different meanings of the word.)

The Gamma function is a function of a single argument as in your formula above.

The usual form of the Gamma distribution has two parameters. There's no inconsistency there, they're quite different objects.

The Gamma function:

$$\Gamma (t)=\int _{0}^{\infty }x^{{t-1}}e^{{-x}}\,{{\rm {d}}}x.$$

Note that this is just a function (though one that arises frequently).
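As a quick illustration that the Gamma function is an ordinary one-argument function, here is a minimal Python sketch using only the standard library (the choice of Python is mine; the variable names are just for illustration). It checks the familiar facts $\Gamma(n) = (n-1)!$ and $\Gamma(1/2) = \sqrt{\pi}$.

```python
import math

# Gamma(n) equals (n - 1)! for positive integers n.
for n in range(1, 7):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))

# It is also defined between the integers, e.g. Gamma(1/2) = sqrt(pi).
half = math.gamma(0.5)
assert math.isclose(half, math.sqrt(math.pi))

# Gamma grows very fast, so for large arguments it is safer to work
# on the log scale: lgamma returns log(Gamma(x)) without overflowing.
log_gamma_200 = math.lgamma(200.0)
```

Note that neither call takes a shape or a scale: there is only the single argument.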

Two related functions are the upper incomplete Gamma function and the lower incomplete gamma function:

$\Gamma (s,x)=\int _{x}^{{\infty }}t^{{s-1}}\,e^{{-t}}\,{{\rm {d}}}t,$ and

$\gamma (s,x)=\int _{0}^{x}t^{{s-1}}\,e^{{-t}}\,{{\rm {d}}}t,$

where $\Gamma(t) = \lim_{x\rightarrow\infty} \gamma (t,x)$.


The Gamma distribution

Consider constructing a cdf as follows:

$$F(x;k) = \gamma (k,x)/\Gamma(k)$$

This has the value $0$ at $x=0$ and approaches $1$ as $x\rightarrow\infty$.

The density can be obtained by differentiation, giving:

$$f(x;k)={\frac {x^{{k-1}}e^{{-{x}}}}{\Gamma (k)}}\quad {\text{ for }}x>0{\text{ and }}k>0.$$
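The relationship between this cdf and density can be checked numerically. The sketch below is an illustration under my own assumptions: it uses a standard power series for the lower incomplete gamma function, and the function names and the fixed number of series terms are hypothetical choices, not anything from a particular package.

```python
import math

def lower_incomplete_gamma(s, x, terms=200):
    """Series: gamma(s, x) = x^s e^(-x) * sum_{n>=0} x^n / (s (s+1) ... (s+n))."""
    if x <= 0:
        return 0.0
    term = 1.0 / s          # n = 0 term
    total = term
    for n in range(1, terms):
        term *= x / (s + n)  # ratio of consecutive series terms
        total += term
    return math.exp(s * math.log(x) - x) * total

def gamma_cdf(x, k):
    """F(x; k) = gamma(k, x) / Gamma(k): one-parameter (unit scale) cdf."""
    return lower_incomplete_gamma(k, x) / math.gamma(k)

def gamma_pdf(x, k):
    """f(x; k) = x^(k-1) e^(-x) / Gamma(k) for x > 0."""
    return x ** (k - 1) * math.exp(-x) / math.gamma(k)

# A central difference of the cdf should approximate the density.
k, x, h = 3.0, 2.0, 1e-5
numeric_density = (gamma_cdf(x + h, k) - gamma_cdf(x - h, k)) / (2 * h)
```

The fixed 200-term series is adequate for moderate $x$; a production routine would switch methods for large $x$.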

So far this only has one parameter, the shape. We get the second parameter by adding a scaling factor; if $Z$ has the above one-parameter-Gamma form, let $X=\lambda Z$, and $X$ will be a two-parameter Gamma.
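To spell out the scaling step (a short worked derivation, assuming $\lambda > 0$ and writing $f_Z$ for the one-parameter density above): since $P(X \le x) = P(Z \le x/\lambda)$, differentiating with respect to $x$ gives

$$f_X(x) = \frac{1}{\lambda}\, f_Z\!\left(\frac{x}{\lambda}\right) = \frac{1}{\lambda} \cdot \frac{(x/\lambda)^{k-1} e^{-x/\lambda}}{\Gamma(k)} = \frac{x^{k-1} e^{-x/\lambda}}{\lambda^{k}\,\Gamma(k)} \quad \text{for } x>0,$$

so $\lambda$ acts as a scale parameter, and $1/\lambda$ as a rate.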

Beware: there are two common forms for the two-parameter Gamma distribution*, the 'rate form' and the 'scale form'. Both forms are given at the Wikipedia page on the Gamma distribution.

* There's a third, slightly less common parameterization of the two-parameter Gamma, used with GLMs: the shape-mean form.

Here's the scale form for the density:

$f(x;k,\theta )={\frac {x^{{k-1}}e^{{-{\frac {x}{\theta }}}}}{\theta ^{k}\Gamma (k)}}\quad {\text{ for }}x>0{\text{ and }}k,\theta >0.$

Here's the rate form (with $\beta = 1/\theta$):

$f(x;k,\beta )={\frac {\beta^{k}x^{{k-1}}e^{{-{x\beta }}}}{\Gamma (k)}}\quad {\text{ for }}x>0{\text{ and }}k,\beta >0.$
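To make the difference between the two forms concrete, here is a small Python sketch that transcribes both densities directly from the formulas above (the function names are my own, chosen for illustration). The two agree exactly when $\beta = 1/\theta$, and disagree if a scale is passed where a rate is expected.

```python
import math

def gamma_pdf_scale(x, k, theta):
    """Scale form: x^(k-1) e^(-x/theta) / (theta^k Gamma(k))."""
    return x ** (k - 1) * math.exp(-x / theta) / (theta ** k * math.gamma(k))

def gamma_pdf_rate(x, k, beta):
    """Rate form: beta^k x^(k-1) e^(-x beta) / Gamma(k)."""
    return beta ** k * x ** (k - 1) * math.exp(-x * beta) / math.gamma(k)

x, k, theta = 1.7, 3.0, 2.0

same = gamma_pdf_rate(x, k, 1.0 / theta)   # rate = 1 / scale: matches
mixed_up = gamma_pdf_rate(x, k, theta)     # scale passed as rate: does not match
reference = gamma_pdf_scale(x, k, theta)
```

Here `same` equals `reference` up to floating-point error, while `mixed_up` is a different number altogether.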

Here's a Gamma density with shape parameter 3 (and unit scale or rate):

[Figure: Gamma density with shape parameter 3]

When you use functions in a program for the cdf, pdf, quantile function and for random number generation, you have to make sure you're using the same convention (scale or rate) as the program!

In R, for example, you can work with the Gamma distribution via functions like dgamma, pgamma, qgamma and rgamma. By default these use the shape and rate parameterization; to get the scale parameterization you need to pass scale as a named argument. For the Gamma function, on the other hand, you'd call gamma, which takes a single argument.
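Coming back to the formula in the question: every $\Gamma$ in it is the one-argument Gamma function, so no shape or scale is needed. As a minimal sketch in Python (standard library only; the function name `nb_prob` and the fitted values are hypothetical, for illustration), the probability can be computed on the log scale with `math.lgamma`, using $y! = \Gamma(y+1)$:

```python
import math

def nb_prob(y, mu, alpha):
    """Pr(y | x) from the formula in the question.

    mu is the fitted mean (must be > 0), alpha the dispersion estimate (> 0).
    """
    inv_a = 1.0 / alpha
    log_p = (
        math.lgamma(y + inv_a)      # log Gamma(y + 1/alpha)
        - math.lgamma(y + 1.0)      # log y!, since y! = Gamma(y + 1)
        - math.lgamma(inv_a)        # log Gamma(1/alpha)
        + inv_a * math.log(inv_a / (inv_a + mu))
        + y * math.log(mu / (inv_a + mu))
    )
    return math.exp(log_p)

# Hypothetical fitted values, for illustration only.
mu_hat, alpha_hat = 2.5, 0.8
probs = [nb_prob(y, mu_hat, alpha_hat) for y in range(200)]
total = sum(probs)                                  # should be very close to 1
mean = sum(y * p for y, p in enumerate(probs))      # should be very close to mu_hat
```

Working with logs avoids overflow in the Gamma terms when $y$ or $1/\hat{a}$ is large. In R the same idea would use lgamma (or gamma) in place of `math.lgamma`.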
