Well, quite clearly the log-linear fit to the Gaussian is unsuitable; there's strong heteroskedasticity in the residuals. So let's take that out of consideration.
What's left is lognormal vs gamma.
Note that the histogram of $T$ is of no direct use, since the marginal distribution will be a mixture of variates (each conditioned on a different set of values for the predictors); even if one of the two models was correct, that plot may look nothing like the conditional distribution.
Either model appears just about equally suitable in this case. They both have variance proportional to the square of the mean, so the pattern of spread in residuals against fit is similar.
A low outlier will fit slightly better with a gamma than a lognormal (vice versa for a high outlier). At a given mean and variance, the lognormal is the more skewed and heavier-tailed of the two.
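The skewness comparison can be checked in closed form: matching mean and variance fixes the coefficient of variation, and the gamma's skewness is $2\,\mathrm{cv}$ while the lognormal's is $(\mathrm{cv}^2 + 3)\,\mathrm{cv}$, which is always larger. A small sketch (the target mean and variance are arbitrary illustrative values):

```r
## Compare skewness of a gamma and a lognormal matched to the same mean
## and variance, using the closed-form skewness formulas (no data needed).
m <- 10; v <- 25                 # illustrative target mean and variance
cv <- sqrt(v) / m                # coefficient of variation, fixed by m and v

## Gamma with shape k = 1/cv^2 has skewness 2/sqrt(k) = 2*cv
skew_gamma <- 2 * cv

## Lognormal with exp(sigma^2) = 1 + cv^2 has skewness (cv^2 + 3)*cv
skew_lnorm <- (cv^2 + 3) * cv

skew_lnorm > skew_gamma          # TRUE: the lognormal is always more skewed
```

Since $(\mathrm{cv}^2 + 3)\,\mathrm{cv} - 2\,\mathrm{cv} = \mathrm{cv}^3 + \mathrm{cv} > 0$, this holds for any positive mean and variance, not just the values above.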
One thing to remember is that the expectation of the lognormal is not $\exp(\mu)$; if you're interested in the mean you can't just exponentiate the log-scale fit. Indeed, if you are interested in the mean, the gamma avoids a number of issues with the lognormal (e.g., once you incorporate parameter uncertainty in $\sigma^2$ in the lognormal, prediction is based on the log-t distribution, which doesn't have a mean; prediction intervals still work fine, but that is a problem for predicting the mean).
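The back-transformation point is easy to verify by simulation: for a lognormal, $E[Y] = \exp(\mu + \sigma^2/2)$, so naively exponentiating the log-scale mean underestimates. A quick sketch (parameter values are arbitrary):

```r
## Simulation check that exp(mu) underestimates the lognormal mean,
## while exp(mu + sigma^2/2) recovers it.
set.seed(1)
mu <- 2; sigma <- 0.8
y <- rlnorm(1e6, meanlog = mu, sdlog = sigma)

c(naive     = exp(mu),               # too small
  corrected = exp(mu + sigma^2 / 2), # matches E[Y]
  empirical = mean(y))
```

The gap grows with $\sigma^2$, which is why the correction matters most for noisy log-scale fits.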
See also here and here for some related discussions.
(1) It is hard to tell what exactly goes wrong here without a reproducible example. One possibility could be that there are no non-zero observations for certain combinations of factor levels. Another possibility could be that the theta estimate in the NB version of the model degenerates either towards zero or towards infinity and hence leads to numeric problems. It could also be something else, though...
(2) I wouldn't start testing the model that generated the warning before I figured out what went wrong in (1). I wouldn't just ignore the warning.
(3) I would recommend constructing the plots by hand rather than estimating a poorly fitting GLM. You can get fitted() and residuals() from the hurdle model and then call plotting functions for scatter and Q-Q plots respectively.
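A minimal sketch of this, assuming the hurdle model was fit with pscl's hurdle() (the bioChemists data and the formula here are just illustrative stand-ins for the asker's model):

```r
## Hand-made diagnostic plots from a hurdle model's fitted() and
## residuals() methods (pscl); data and formula are illustrative only.
library(pscl)
data("bioChemists", package = "pscl")

fm <- hurdle(art ~ fem + ment, data = bioChemists, dist = "negbin")

## Scatter plot: residuals vs. fitted values
plot(fitted(fm), residuals(fm, type = "pearson"),
     xlab = "Fitted values", ylab = "Pearson residuals")
abline(h = 0, lty = 2)

## Q-Q plot of the same residuals
qqnorm(residuals(fm, type = "pearson"))
qqline(residuals(fm, type = "pearson"))
```
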
Best Answer
Here is what I usually like doing (for illustration I use the overdispersed and not very easily modelled quine data of pupils' days absent from school from MASS):

Test and graph the original count data by plotting observed frequencies and fitted frequencies (see chapter 2 in Friendly), which is supported in large parts by the vcd package in R, for example with goodfit and a rootogram:

or with Ord plots, which help in identifying which count data model is underlying (e.g., here the slope is positive and the intercept is positive, which speaks for a negative binomial distribution):
or with the "XXXXXXness" plots, where XXXXX is the distribution of choice, say a Poissonness plot (which speaks against the Poisson; try also type="nbinom"):

Inspect usual goodness-of-fit measures (such as likelihood ratio statistics vs. a null model or similar):
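The steps above can be sketched as follows on the quine data; the GLM formula is an illustrative choice of mine, not necessarily the one behind the plots described here:

```r
## Distributional plots (vcd) and a likelihood-ratio test vs. the null
## model for the quine days-absent counts.
library(MASS)   # quine data
library(vcd)    # goodfit, rootogram, Ord_plot, distplot
data("quine", package = "MASS")

fit <- goodfit(quine$Days)              # Poisson fit by default
summary(fit)                            # goodness-of-fit test
rootogram(fit)                          # observed vs. fitted frequencies

Ord_plot(quine$Days)                    # slope/intercept hint at the family
distplot(quine$Days, type = "poisson")  # Poissonness plot
distplot(quine$Days, type = "nbinom")   # negative-binomial counterpart

## Likelihood ratio test of an illustrative Poisson GLM vs. the null model
mod  <- glm(Days ~ Sex + Age + Eth + Lrn, data = quine, family = poisson)
p_lr <- pchisq(mod$null.deviance - deviance(mod),
               mod$df.null - mod$df.residual, lower.tail = FALSE)
p_lr
```
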
Check for over-/underdispersion by looking at residual deviance/df or at a formal test statistic (e.g., see this answer). Here we clearly have overdispersion:

Check for influential and leverage points, e.g., with influencePlot in the car package. Of course, here many points are highly influential because the Poisson is a bad model:

Check for zero inflation by fitting a count data model and its zero-inflated/hurdle counterpart and comparing them (usually with AIC). Here a zero-inflated model would fit better than the simple Poisson (again probably due to overdispersion):
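These three checks might look as follows on the quine data; again the model formula is my illustrative choice:

```r
## Dispersion, influence, and zero-inflation checks for an illustrative
## Poisson GLM on the quine data.
library(MASS); library(pscl); library(car)
data("quine", package = "MASS")

mod <- glm(Days ~ Sex + Age + Eth + Lrn, data = quine, family = poisson)

## Dispersion: residual deviance over residual df; >> 1 means overdispersion
disp <- deviance(mod) / df.residual(mod)
disp

## Influential and leverage points (car)
influencePlot(mod)

## Zero-inflated counterpart, compared by AIC
zim <- zeroinfl(Days ~ Sex + Age + Eth + Lrn, data = quine)
AIC(mod, zim)
```
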
Plot the residuals (raw, deviance, or scaled) on the y-axis vs. the (log) predicted values (or the linear predictor) on the x-axis. Here we see some very large residuals and a substantial deviation of the deviance residuals from normality (speaking against the Poisson; Edit: @FlorianHartig's answer suggests that normality of these residuals is not to be expected, so this is not a conclusive clue):
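A sketch of both plots for the same illustrative Poisson GLM:

```r
## Deviance residuals vs. the linear predictor, plus a normal Q-Q plot,
## for an illustrative Poisson GLM on the quine data.
library(MASS)
data("quine", package = "MASS")
mod <- glm(Days ~ Sex + Age + Eth + Lrn, data = quine, family = poisson)

res <- residuals(mod, type = "deviance")

plot(predict(mod, type = "link"), res,
     xlab = "Linear predictor (log scale)", ylab = "Deviance residuals")
abline(h = 0, lty = 2)

qqnorm(res)
qqline(res)
```
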
If interested, plot a half-normal probability plot of residuals by plotting ordered absolute residuals vs. expected normal values (Atkinson, 1981). A special feature would be to simulate a reference 'line' and envelope with simulated/bootstrapped confidence intervals (not shown though):
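A hand-rolled sketch without the simulated envelope; the plotting positions use one common half-normal approximation, $\Phi^{-1}\bigl((n+i)/(2n+1)\bigr)$, which is an assumption on my part rather than Atkinson's exact recipe:

```r
## Half-normal plot of absolute deviance residuals for an illustrative
## Poisson GLM on the quine data (no simulated envelope).
library(MASS)
data("quine", package = "MASS")
mod <- glm(Days ~ Sex + Age + Eth + Lrn, data = quine, family = poisson)

r <- sort(abs(residuals(mod, type = "deviance")))
n <- length(r)
q <- qnorm((n + seq_len(n)) / (2 * n + 1))  # approximate half-normal quantiles

plot(q, r, xlab = "Expected half-normal quantiles",
     ylab = "|Deviance residuals|")
```
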
Diagnostic plots for log-linear models for count data (see chapters 7.2 and 7.7 in Friendly's book). Plot predicted vs. observed values, perhaps with some interval estimate (I did this just for the age groups; here we see again that we are pretty far off with our estimates due to the overdispersion, apart, perhaps, from group F3. The pink points are the point prediction $\pm$ one standard error):
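One way to sketch such a plot by hand (averaging fitted values and their standard errors within each age group is my simplification; the model formula is again illustrative):

```r
## Observed vs. predicted mean days absent per age group, with +/- 1 SE
## bars on the predictions, for an illustrative Poisson GLM.
library(MASS)
data("quine", package = "MASS")
mod <- glm(Days ~ Sex + Age + Eth + Lrn, data = quine, family = poisson)

pr  <- predict(mod, type = "response", se.fit = TRUE)
obs <- tapply(quine$Days, quine$Age, mean)   # observed group means
fit <- tapply(pr$fit,    quine$Age, mean)    # averaged point predictions
se  <- tapply(pr$se.fit, quine$Age, mean)    # averaged standard errors

plot(seq_along(obs), obs, pch = 19, xaxt = "n",
     ylim = range(c(obs, fit - se, fit + se)),
     xlab = "Age group", ylab = "Days absent")
axis(1, at = seq_along(obs), labels = names(obs))
points(seq_along(fit), fit, col = "pink", pch = 19)
arrows(seq_along(fit), fit - se, seq_along(fit), fit + se,
       angle = 90, code = 3, length = 0.05, col = "pink")
```
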
This should give you much of the useful information about your analysis and most steps work for all standard count data distributions (e.g., Poisson, Negative Binomial, COM Poisson, Power Laws).