Solved – Mixture of Poisson and negative binomial

maximum likelihood, negative-binomial-distribution, poisson distribution

I'm trying to fit a Poisson and a negative binomial distribution to my data and compare the two. The problem is that the Poisson fails to capture the overdispersion, while the negative binomial seems to overfit and also has difficulty estimating the dispersion parameter. I was wondering whether a weighted combination of the two would yield better results, and if so, how we would estimate the parameters and the weighting factor.

Best Answer

Since your question concerns the Poisson and negative binomial distributions, I assume you are working with count data, for example the number of goals each player scored during a game, and that you also recorded a covariate such as each player's training hours over the last year.

Suppose you want to model these counts using training hours as an explanatory variable, on the assumption that players who train more score more goals. You would then expect this variable to explain some of the variation in goals scored during a game.

However, the Poisson model assumes that the variance of the counts equals their mean (conditional on the predictors). Very often the observed variance is larger than the Poisson allows, and you have to deal with overdispersion (including additional predictors can reduce the unexplained variance). Negative binomial models can be used in that case.
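To make the mean-versus-variance point concrete, here is a minimal sketch of a dispersion check using only the Python standard library. The function name `dispersion_summary` and the `goals` data are made up for illustration, and the dispersion estimate assumes the common NB2 parameterisation, Var(Y) = mu + alpha * mu^2, solved by the method of moments. It is a rough diagnostic, not a substitute for a maximum likelihood fit.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Compare sample mean and variance of count data (hypothetical helper).

    A variance-to-mean ratio well above 1 suggests overdispersion
    relative to the Poisson. alpha_mom is a method-of-moments estimate
    of the NB2 dispersion parameter, assuming Var(Y) = mu + alpha * mu**2.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the ratio is undefined")
    v = variance(counts)  # sample variance, n - 1 denominator
    # Clip at zero: the moment estimate is negative under underdispersion
    alpha = max((v - m) / m**2, 0.0)
    return {"mean": m, "variance": v, "ratio": v / m, "alpha_mom": alpha}


# Made-up goals per player, for illustration only
goals = [0, 0, 0, 1, 0, 2, 5, 0, 1, 7, 0, 3, 0, 0, 4]
print(dispersion_summary(goals))
```

If the ratio is close to 1, the Poisson may be adequate; a large ratio driven mainly by a spike at zero points toward the zero-handling models discussed next rather than toward a larger dispersion parameter.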

Even more important, the zeros in the data may come from a different "data generating process" than the other counts: perhaps 20% of the kids did not play at all during the game and therefore never had a chance to score. This also produces a lot of unexplained variation, which makes the Poisson even less useful. In that case, model the zeros separately, with either a zero-inflated model or a hurdle model.
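A zero-inflated Poisson is itself a weighted combination: a point mass at zero with weight pi and a Poisson(lambda) with weight 1 - pi, so P(Y = 0) = pi + (1 - pi) * exp(-lambda). The weight and the rate are estimated jointly by maximum likelihood. The sketch below does this with a simple EM loop for the intercept-only case (no covariates such as training hours). The names `fit_zip_em` and `zip_loglik` and the example counts are hypothetical, and this is one possible way to fit the model, not necessarily what a given statistics package does internally.

```python
import math


def fit_zip_em(counts, tol=1e-12, max_iter=10_000):
    """Fit an intercept-only zero-inflated Poisson by EM; returns (pi, lam)."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    if total == 0:
        raise ValueError("all counts are zero; lambda is not identifiable")
    # Starting values: treat half of the observed zeros as structural
    pi = 0.5 * n_zero / n
    lam = total / (n - n * pi)
    for _ in range(max_iter):
        # E-step: probability that an observed zero is a structural zero
        p_zero = pi + (1.0 - pi) * math.exp(-lam)
        expected_structural = n_zero * pi / p_zero
        # M-step: update the mixing weight and the Poisson rate
        new_pi = expected_structural / n
        new_lam = total / (n - expected_structural)
        converged = abs(new_pi - pi) < tol and abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


def zip_loglik(counts, pi, lam):
    """Log-likelihood of a zero-inflated Poisson (pi = 0 gives plain Poisson)."""
    ll = 0.0
    for y in counts:
        log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
        if y == 0:
            ll += math.log(pi + (1.0 - pi) * math.exp(log_pois))
        else:
            ll += math.log(1.0 - pi) + log_pois
    return ll


# Made-up counts with an excess of zeros, for illustration only
goals = [0] * 40 + [1] * 15 + [2] * 20 + [3] * 15 + [4] * 7 + [5] * 3
pi_hat, lam_hat = fit_zip_em(goals)
poisson_rate = sum(goals) / len(goals)  # Poisson MLE is the sample mean
print("pi:", pi_hat, "lambda:", lam_hat)
print("ZIP log-likelihood:    ", zip_loglik(goals, pi_hat, lam_hat))
print("Poisson log-likelihood:", zip_loglik(goals, 0.0, poisson_rate))
```

Because the plain Poisson is the special case pi = 0, the two log-likelihoods can be compared directly on the same data. A hurdle model differs in that all zeros come from one process and the positive counts follow a zero-truncated distribution, and a zero-inflated negative binomial would add a dispersion parameter on top of this sketch.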

This summary is explained in more detail in this online class: http://www.youtube.com/watch?v=T6cv8n9xBGQ