Solved – Any relation between Power Law and Negative Binomial distribution

distributions, modeling, probability, r

In a social experiment that I was conducting, I was trying to count the number of people each user contacted in a period of 10 days. The population size was 100 for the experiment. Based on the values that I calculated, I fit a negative binomial distribution to the data (the Q-Q plot is given below).

Conventional wisdom says that most networks among humans follow a power law distribution. I am guessing that my population size is too small to draw a firm conclusion about anything, but is there any kind of relation between a negative binomial distribution and a power law distribution? I am asking because I read a few days ago that the Normal distribution and the Gamma distribution (whose discrete analogue is the negative binomial) have a special role, in that many other distributions can be derived from the Gamma distribution. I am wondering if this is also true of the power law distribution. I am a beginner in statistics, so kindly point me in the right direction if I am off track.

[Figure: Q-Q plot of the negative binomial fit]

Best Answer

There are many power-law distributions, so you have a lot of possible models. You might start by trying to fit a log-series distribution, which is a limiting case of the negative binomial.
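
To see the limiting-case claim numerically, here is a minimal Python sketch using only the standard library. It assumes the parametrization P(X = k) = Γ(k + r) / (k! Γ(r)) · (1 − p)^r · p^k for the negative binomial, conditions on k ≥ 1 (zero truncation, since the log-series has no mass at 0), and compares the result with the log-series pmf −p^k / (k ln(1 − p)) as r shrinks toward 0. The function names are illustrative and do not come from any library.

```python
import math

def nbinom_pmf(k, r, p):
    """Negative binomial pmf for k = 0, 1, ... (assumed parametrization, see text)."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log1p(-p) + k * math.log(p))
    return math.exp(log_pmf)

def zero_truncated_nbinom_pmf(k, r, p):
    """Negative binomial pmf conditioned on k >= 1."""
    # 1 - (1 - p)^r, computed stably for very small r
    prob_positive = -math.expm1(r * math.log1p(-p))
    return nbinom_pmf(k, r, p) / prob_positive

def logseries_pmf(k, p):
    """Log-series pmf for k = 1, 2, ..."""
    return -p ** k / (k * math.log1p(-p))

def max_abs_gap(r, p, kmax=50):
    """Largest pointwise difference between the two pmfs over k = 1..kmax."""
    return max(abs(zero_truncated_nbinom_pmf(k, r, p) - logseries_pmf(k, p))
               for k in range(1, kmax + 1))

# The gap should shrink as r approaches 0 (p = 0.7 is an arbitrary choice).
for r in (1.0, 0.1, 0.001, 1e-6):
    print(r, max_abs_gap(r, 0.7))
```

So a negative binomial fit with a very small estimated size parameter r is a hint that the one-parameter log-series may describe the non-zero counts about as well.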

Don't think a priori that you have a mixture distribution as suggested by whuber until you've estimated model parameters and done at least a goodness-of-fit test. Long-tail distributions, like power-law, log-series, Zipf, etc., typically have what look like outliers in the right-hand tail; their separation from the bulk of the observations is just an artifact of (relatively) small sample size. Mixtures are a pain in the butt to estimate, since the components overlap in some regions. You can often avoid that sort of problem by stepping up your modeling one level with something like Poisson regression, assuming you have some covariate (predictor) data about each user; this basically does the mixing for you.
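
As a rough illustration of "estimate the parameter, then check the fit", the sketch below fits a log-series distribution and computes a Pearson chi-square statistic, using only the Python standard library. It relies on the fact that for the log-series the maximum-likelihood estimate of p matches the theoretical mean −p / ((1 − p) ln(1 − p)) to the sample mean. The `sample` values are made up purely for demonstration (they are not the asker's data), the function names are hypothetical, and the sketch assumes users with zero contacts are excluded or modeled separately, because the log-series starts at 1.

```python
import math
from collections import Counter

def logseries_pmf(k, p):
    """Log-series pmf for k = 1, 2, ..."""
    return -p ** k / (k * math.log1p(-p))

def logseries_mean(p):
    """Theoretical mean of the log-series distribution."""
    return -p / ((1.0 - p) * math.log1p(-p))

def fit_logseries(sample, tol=1e-12):
    """Estimate p by solving logseries_mean(p) = sample mean with bisection."""
    if min(sample) < 1:
        raise ValueError("log-series support starts at 1")
    target = sum(sample) / len(sample)
    if target <= 1.0:
        raise ValueError("sample mean must exceed 1")
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if logseries_mean(mid) < target:  # the mean is increasing in p
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pearson_chi2(sample, p, kmax):
    """Pearson statistic over cells 1..kmax-1 plus one pooled tail cell (k >= kmax)."""
    n = len(sample)
    observed = Counter(min(k, kmax) for k in sample)
    probs = [logseries_pmf(k, p) for k in range(1, kmax)]
    probs.append(1.0 - sum(probs))  # pooled tail probability
    return sum((observed.get(k, 0) - n * q) ** 2 / (n * q)
               for k, q in zip(range(1, kmax + 1), probs))

# Made-up contact counts for 100 users, for illustration only.
sample = [1] * 58 + [2] * 20 + [3] * 9 + [4] * 5 + [5] * 3 + [7] * 2 + [10] * 2 + [15]
p_hat = fit_logseries(sample)
stat = pearson_chi2(sample, p_hat, kmax=6)
print(p_hat, stat)
```

The statistic would then be compared, approximately, with a chi-square distribution on kmax − 2 degrees of freedom (kmax cells, minus one, minus one estimated parameter). Pooling the tail into one cell keeps the expected counts from getting too small, which matters with only 100 users.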

The Johnson, Kemp, and Kotz reference given at the end of the referenced Wikipedia article has everything you'd ever want to know about all these distributions, including many methods of parameter estimation.
