Criterion is based upon (informed) model comparisons. You are trying to account for over-dispersion.
Poisson: var(x) ~ mu
Negative binomial: var(x) > mu
With "extra" zeros:
ZIP: var(x) ~ mu (in the count component)
ZINB: var(x) > mu (in the count component)
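As a rough illustration of these variance-to-mean relationships, here is a small simulation sketch in base R. The sample size, mean, dispersion and zero-inflation probability are arbitrary assumptions chosen for the illustration, not values from any real data set. Note that once structural zeros are mixed in, the marginal variance of a zero-inflated Poisson exceeds its marginal mean; it is only the count component that is equidispersed.

```r
set.seed(1)
n  <- 1e5   # assumed sample size
mu <- 3     # assumed mean of the count process

# Poisson: variance equals the mean
x_pois <- rpois(n, lambda = mu)

# Negative binomial: variance is mu + mu^2 / size, so the ratio is 1 + mu / size
x_nb <- rnbinom(n, mu = mu, size = 1.5)

# Zero-inflated Poisson: a structural zero with probability 0.3, else Poisson
structural_zero <- rbinom(n, size = 1, prob = 0.3) == 1
x_zip <- ifelse(structural_zero, 0L, rpois(n, lambda = mu))

# Dispersion index: variance divided by mean
disp <- c(
  poisson        = var(x_pois) / mean(x_pois),
  negbin         = var(x_nb) / mean(x_nb),
  zip_all        = var(x_zip) / mean(x_zip),
  zip_count_part = var(x_zip[!structural_zero]) / mean(x_zip[!structural_zero])
)
print(round(disp, 2))
```

A ratio near 1 is consistent with a Poisson; a ratio clearly above 1 suggests over-dispersion, which can come from the count process, from extra zeros, or from both.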
One actively maintained package that you can use is pscl, which you can install with install.packages("pscl").
You can then fit a number of models such as a hurdle model that uses a negative binomial for the counts and a binomial model for modeling the probability of zeros. This would be written something like:
library(pscl)
fit <- hurdle(Admissions ~ Temperature + Humidity, dist = "negbin", data = data)
summary(fit)
Note that the output will have two sets of coefficients: one for the zero hurdle component and one for the count component. The output also provides an estimate of the theta parameter (over-dispersion) of the negative binomial.
Or you may want to look at the zero-inflation model:
fit1 <- zeroinfl(Admissions ~ Temperature + Humidity, data = data, dist = "negbin", link = "logit")
These models can be compared using AIC (also compare them with your Poisson model):
AIC(fit,fit1)
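To make the comparison concrete, here is a hedged end-to-end sketch on simulated data. The data frame `sim`, its coefficients, and the 30% structural-zero rate are all assumptions invented for the illustration; only the variable names Admissions, Temperature and Humidity come from the formulas above. The pscl fits are guarded so that the Poisson baseline still runs if the package is not installed.

```r
set.seed(42)
n <- 1000

# Hypothetical covariates and a zero-inflated negative binomial response
sim <- data.frame(
  Temperature = rnorm(n, mean = 20, sd = 5),
  Humidity    = runif(n, min = 30, max = 90)
)
mu <- exp(0.5 + 0.04 * sim$Temperature - 0.005 * sim$Humidity)
structural_zero <- rbinom(n, size = 1, prob = 0.3) == 1
sim$Admissions <- ifelse(structural_zero, 0L, rnbinom(n, mu = mu, size = 2))

# Baseline: ordinary Poisson GLM with a log link
fit_pois <- glm(Admissions ~ Temperature + Humidity,
                family = poisson, data = sim)

if (requireNamespace("pscl", quietly = TRUE)) {
  # Hurdle model: binomial zero part, negative binomial count part
  fit_hurdle <- pscl::hurdle(Admissions ~ Temperature + Humidity,
                             data = sim, dist = "negbin")
  # Zero-inflated negative binomial with a logit link for the zero part
  fit_zinb <- pscl::zeroinfl(Admissions ~ Temperature + Humidity,
                             data = sim, dist = "negbin", link = "logit")

  # Lower AIC indicates a better trade-off between fit and complexity
  print(AIC(fit_pois, fit_hurdle, fit_zinb))

  # Estimated negative binomial dispersion parameter
  print(fit_zinb$theta)
}
```

On data simulated this way the Poisson model should have a clearly higher AIC than the other two, since it accounts for neither the extra zeros nor the over-dispersed counts.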
Note that the predicted value in a GLM is a mean.
For any distribution on non-negative values, to predict a mean of 0, its distribution would have to be entirely a spike at 0.
However, with a log-link, you're never going to fit a mean of exactly zero (since that would require $\eta$ to go to $-\infty$).
So your problem isn't a problem with the Tweedie, but far more general; you'd have exactly the same issue with the Poisson (whether zero-inflated or an ordinary Poisson GLM), for example, or a binomial, a 0-1 inflated beta, and indeed any other distribution on the non-negative real line.
"I thought the usefulness of the Tweedie distribution comes from its ability to predict exact zeros and the continuous part."
Since predicting exact zeros isn't going to occur for any distribution over non-negative values with a log-link, your thinking on this must be mistaken.
One of its attractions is that it can model exact zeros in the data, not that the mean predictions will be 0. [Of course a fitted distribution with nonzero mean can still have a probability of being exactly zero, even though the mean must exceed 0. A suitable prediction interval could well include 0, for example.]
It matters not at all that the fitted distribution includes any substantial proportion of zeros - that doesn't make the fitted mean zero (except in the limit as you go to all zeros).
Note that if you change your link function to say an identity link, it doesn't really solve your problem -- the mean of a non-negative random variable that's not all-zeros will be positive.
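The distinction between a fitted mean and the probability of an exact zero can be checked numerically. The sketch below uses an ordinary Poisson GLM rather than a Tweedie (an assumption made to stay within base R; the argument above applies to both), with simulated data whose coefficients are arbitrary.

```r
set.seed(7)
n <- 500
x <- runif(n)

# Simulated counts with small means, so most observations are exactly zero
y <- rpois(n, lambda = exp(-2 + 1.5 * x))

fit <- glm(y ~ x, family = poisson(link = "log"))

# Fitted means: exp(eta) is strictly positive for any finite eta
mu_hat <- fitted(fit)

# Fitted probability of observing exactly zero at each x
p_zero <- dpois(0, lambda = mu_hat)

cat("proportion of observed zeros:", mean(y == 0), "\n")
cat("smallest fitted mean:        ", min(mu_hat), "\n")
cat("range of fitted P(Y = 0):    ", range(p_zero), "\n")
```

Every fitted mean is positive even though most of the data are zeros, while the fitted distribution still assigns substantial probability to an exact zero at each observation.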
You should use the Mann-Whitney U-test if the samples are not paired. The Wilcoxon signed-rank test is for paired data. I don't think that the number of zeros matters in this case.
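In R, both tests are available through wilcox.test; the paired argument selects between them. The sketch below uses two simulated, zero-heavy, unpaired samples of different sizes (the data and their Poisson means are assumptions for the illustration). Because zeros create ties, exact = FALSE requests the normal approximation explicitly.

```r
set.seed(3)

# Two independent samples with many zeros and unequal sizes
a <- rpois(30, lambda = 0.8)
b <- rpois(35, lambda = 1.6)

# Mann-Whitney U-test (Wilcoxon rank-sum test) for unpaired samples
res <- wilcox.test(a, b, paired = FALSE, exact = FALSE)

print(res$method)
print(res$p.value)
```

For paired data of equal length, calling wilcox.test with paired = TRUE would run the signed-rank test instead.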