I have a dataset that seems to contain a lot of zeros. I have already fit a Poisson regression model as well as a negative binomial model, and I would like to fit zero-inflated and hurdle models as well.
Before I do, I would like to run a test to check whether my data really are zero-inflated. What tests are available for determining whether data are zero-inflated?
Best Answer
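The score test (referenced in the comments by Ben Bolker) is performed by first calculating the rate estimate $\hat{\lambda}= \bar{x}$, the sample mean. Then count the number of observed zeros, denoted $n_0$, and the total number of observations $n$. Calculate $\tilde{p}_0=\exp[-\hat{\lambda}]$, the probability of a zero under a Poisson distribution with rate $\hat{\lambda}$. The test statistic is then

$$\frac{(n_0 - n\tilde{p}_0 )^2}{n\tilde{p}_0(1-\tilde{p}_0) - n\bar{x}\tilde{p}_0^2}.$$

Under the null hypothesis that the data are Poisson with no zero inflation, this statistic asymptotically follows a $\chi^2_1$ distribution, so the p-value can be looked up in tables or computed with statistical software. Note that the numerator compares the observed number of zeros, $n_0$, with the number expected under the Poisson model, $n\tilde{p}_0$.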
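For readers not working in R, the steps above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions: the function name `zero_inflation_score_test` and the example data are made up here, the input is assumed to be a plain list of non-negative integer counts with no covariates, and the sample mean is assumed to be positive. The p-value uses the identity $P(\chi^2_1 > s) = \operatorname{erfc}(\sqrt{s/2})$, so only the standard library is needed.

```python
import math


def zero_inflation_score_test(counts):
    """Score test for excess zeros against a Poisson model without covariates.

    Hypothetical helper: `counts` is assumed to be a non-empty sequence of
    non-negative integers whose mean is greater than zero.
    Returns (test statistic, p-value).
    """
    n = len(counts)
    lam_hat = sum(counts) / n                 # rate estimate = sample mean
    n0 = sum(1 for x in counts if x == 0)     # observed number of zeros
    p0 = math.exp(-lam_hat)                   # Poisson probability of a zero

    numerator = (n0 - n * p0) ** 2
    denominator = n * p0 * (1 - p0) - n * lam_hat * p0 ** 2
    stat = numerator / denominator

    # Upper tail of a chi-square distribution with 1 degree of freedom:
    # P(Z^2 > s) = erfc(sqrt(s / 2)) for a standard normal Z.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Made-up example: 50 zeros and 50 twos, so the mean is 1 but half the data are zeros.
example_counts = [0] * 50 + [2] * 50
stat, p_value = zero_inflation_score_test(example_counts)
print(f"statistic = {stat:.3f}, p-value = {p_value:.2e}")
```

A large statistic (small p-value) suggests more zeros than a Poisson distribution with the same mean would produce. Because the sketch uses the overall mean as $\hat{\lambda}$, it does not account for covariates in a regression model.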
Here is some R code that will do this: