Negative-Binomial vs Poisson for Count Data

binomial distribution, poisson distribution

Say that I am modelling the number of upvotes for a Facebook post under two treatments. Which distribution would be more appropriate for modelling these counts: a negative binomial or a Poisson?

What would be the trade-offs here?

Best Answer

For $Y \sim \text{Poisson}(\lambda)$ we have $\text{E}(Y) = \lambda$ and $\text{Var}(Y) = \lambda$, so a Poisson model is only appropriate when the variance of the counts is at least well approximated by their mean. In real applications we often find that the variance of count data is larger than the mean.
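As a rough check of the mean-variance relationship, one can compare the sample variance to the sample mean. The sketch below is only an illustration: the function name `dispersion_index` and the upvote counts are made up, not taken from the question. A ratio near 1 is consistent with a Poisson; a ratio well above 1 suggests the variance exceeds the mean.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean.

    Under a Poisson model this ratio should be close to 1.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical upvote counts for one treatment (invented for illustration).
upvotes = [0, 1, 1, 2, 3, 3, 4, 7, 12, 27]

ratio = dispersion_index(upvotes)
print(f"variance / mean = {ratio:.2f}")
```

This is a descriptive diagnostic rather than a formal test, and with small samples the ratio can be noisy.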

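If, on the other hand, $Y \sim \text{negative binomial}(r, p)$, parameterized so that $Y$ counts the successes (each with probability $p$) observed before the $r$-th failure, we have $\text{E}(Y) = rp / (1 - p)$, which is less than $\text{Var}(Y) = rp / (1 - p)^2$ for every $0 < p < 1$. The negative binomial is therefore often used to account for what's called "overdispersion." The only real downside is that it has one more parameter than the Poisson, so the model is slightly less parsimonious, but that arguably isn't much of a downside.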
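To make the formulas concrete, here is a minimal sketch that evaluates the mean and variance for given $(r, p)$ and inverts them by matching moments. The function names are hypothetical, and the method-of-moments inversion is my own addition for illustration (in practice one would usually fit by maximum likelihood). It assumes the same parameterization as above, so the variance-to-mean ratio is $1 / (1 - p)$.

```python
def negbin_moments(r, p):
    """Mean and variance of a negative binomial with E(Y) = r p / (1 - p)."""
    if r <= 0 or not 0 < p < 1:
        raise ValueError("need r > 0 and 0 < p < 1")
    mean = r * p / (1 - p)
    var = r * p / (1 - p) ** 2
    return mean, var


def negbin_from_moments(mean, var):
    """Solve for (r, p) given a mean and variance; requires var > mean > 0."""
    if not var > mean > 0:
        raise ValueError("moment matching needs var > mean > 0 (overdispersion)")
    p = 1 - mean / var              # from var / mean = 1 / (1 - p)
    r = mean ** 2 / (var - mean)    # from mean = r p / (1 - p)
    return r, p


m, v = negbin_moments(r=3, p=0.5)
print(m, v, v / m)                  # the variance exceeds the mean
print(negbin_from_moments(m, v))    # recovers the (r, p) we started from
```

Note that `negbin_from_moments` fails when the variance does not exceed the mean, which reflects the fact that this family cannot represent equidispersed or underdispersed counts.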