Solved – If not a Poisson, then what distribution is this?

Tags: distributions, mean, poisson distribution, r, sample

I have a data set containing the number of actions performed by individuals over the course of 7 days. The specific action shouldn't be relevant for this question. Here are some descriptive statistics for the data set:
$$
\begin{array}{|l|r|}
\hline
\text{Range} & 0\text{–}772 \\ \hline
\text{Mean} & 18.2 \\ \hline
\text{Variance} & 2791 \\ \hline
\text{Number of observations} & 696 \\ \hline
\end{array}
$$

Here is a histogram of the data:
[Image: action histogram]

Judging from the source of the data, I figured it would fit a Poisson distribution. However, the mean (18.2) is nowhere near the variance (2791), and most of the histogram's mass sits at the far left, near zero. Additionally, I ran the goodfit test (from the vcd package) in R and got:

    > library(vcd)  # goodfit() comes from the vcd package
    > gf <- goodfit(actions, type = "poisson", method = "MinChisq")
    > summary(gf)
    Goodness-of-fit test for poisson distribution
                      X^2  df P(> X^2)
    Pearson 2.937599e+248 771        0

The maximum likelihood method (method = "ML") also yielded a p-value of 0. Assuming the null hypothesis is that the data follow a Poisson distribution (the documentation doesn't specify this), the goodfit test says we should reject the null hypothesis and conclude that the data do not follow a Poisson distribution.
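As a quick sanity check on the mean-versus-variance observation, the dispersion ratio can be computed directly from the summary statistics quoted above. This is only a rough sketch: it uses the classical index-of-dispersion test, which is a different check from goodfit, and it assumes the reported variance is the usual sample variance. The variable names are illustrative.

```r
# Summary statistics quoted in the question
n        <- 696
mean_obs <- 18.2
var_obs  <- 2791

# Under a Poisson model the variance equals the mean,
# so this ratio should be close to 1
dispersion <- var_obs / mean_obs

# Index-of-dispersion test: under the Poisson null, (n - 1) * s^2 / mean
# is approximately chi-squared with n - 1 degrees of freedom
stat  <- (n - 1) * dispersion
p_val <- pchisq(stat, df = n - 1, lower.tail = FALSE)

print(dispersion)  # roughly 153, versus about 1 for Poisson data
print(p_val)
```

A ratio this far above 1 points the same way as the goodfit result: the counts are much more variable than a Poisson model allows.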

Is that analysis correct? If so, what distribution do you think will fit this data?

My ultimate goal is to compare the mean number of actions between two samples to see whether the means differ; is checking the distribution even necessary? My understanding is that the typical tests (z-, t-, and $\chi^2$-tests) don't work for Poisson distributions. What test should I use if the data is indeed Poisson-distributed?

Best Answer

If the variance is greater than the mean, the data are said to be over-dispersed. A natural model for this is the negative binomial distribution, which can also be seen as a Poisson distribution whose rate parameter $\lambda$ itself follows a Gamma distribution. A first and easy step could be to fit a negative binomial distribution.
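A minimal sketch of that first step is given here, using `fitdistr()` from the MASS package. The original data are not available, so the counts are simulated as a Gamma-Poisson mixture with the same sample size and mean as in the question. The Gamma shape of 0.3 and the seed are assumptions chosen only for illustration.

```r
library(MASS)  # fitdistr(); MASS ships with standard R installations

set.seed(1)
n     <- 696    # sample size reported in the question
mu    <- 18.2   # mean reported in the question
shape <- 0.3    # assumed Gamma shape, for illustration only

# Gamma-Poisson mixture: each individual gets their own rate lambda,
# then a Poisson count is drawn with that rate.
# Marginally, the counts follow a negative binomial distribution.
lambda  <- rgamma(n, shape = shape, scale = mu / shape)
actions <- rpois(n, lambda)

# Fit both candidate distributions by maximum likelihood
fit_nb   <- fitdistr(actions, "negative binomial")
fit_pois <- fitdistr(actions, "Poisson")

print(fit_nb)          # estimates of size (the Gamma shape) and mu
print(AIC(fit_pois))   # a lower AIC indicates the better fit
print(AIC(fit_nb))
```

With real data, `actions` would simply be the observed count vector. If the vcd package from the question is already loaded, `goodfit(actions, type = "nbinomial")` is the analogous goodness-of-fit call.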
