Model residual diagnostics of gamma GLMM with log-link

gamma distributiongeneralized linear modelglmmglmmtmbresiduals

I am trying to model fish length data (N > 115.000) which are highly right-skewed using linear mixed effects models. Actually all data make sense and the the extreme high values are valid measurements. My data look like: .

I thought of using a GLMM of the family "Gamma" (with log-link) as length is a censored response (0,∞). Specifically, I am using the function glmmTMB() to fit following model structure:

    mod <- glmmTMB(L ~ gear + (1|species) + (1|location),
            family=Gamma(link = "log"),
            data=fish_df)

The model converged without any errors/warnings and the parameter estimates are reasonable, I think. However, I wanted to check/diagnose the model, especially the residuals. I was following some posts in other threads such as "What are the assumptions of a Gamma GLM or GLMM for hypothesis testing?" and "Coping with underdispersion after DHARMa diagnosis on GLMM with Gamma link=log", as I am observing similar issues of rather extreme values in my response that lead to skewed residuals of my models, I guess. So it seems that there is some heteroscedasticity in my residuals. I mean, given the large sample size it is not too surprising to see also supposed 'outliers'.

    ypred = predict(mod)
    res = residuals(mod, type = 'pearson')
    plot(ypred,res)
    hist(res)

I further checked with the DHARMa package for some patterns which are difficult to interpret to me.

    sim_res <- DHARMa::simulateResiduals(modX5, 250)
    plot(sim_res,asFactor = F)
    DHARMa::testDispersion(mod)
    DHARMa::testDispersion(mod, type="PearsonChisq", 
         alternative="greater")

The dispersion tests do not detect any significant overdispersion in the residuals but rather low dispersion parameters (for PearsonChisq = 0.13703, for DHARMa nonparametric dispersion test = 0.28605). In fact the gamma has already a dispersion parameter, so there should be no 'overdispersed' gamma according to this post.

So I am wondering whether the I could trust the model outcomes and estimates for fixed effects? To have more robust/reliable information on parameter estimates I would use parametric bootstrapped CI e.g. of the fixed effects via bootMer (but this does not solve the model fit per se). So are there any other recommendation I could try to improve the model fit, or is the fit sufficiently okay?

Best Answer

As the developer of DHARMa, I think I can answer at least some of your questions:

First DHARMa plot, left: the qq plot basically tells you that the residual distribution doesn't follow a gamma distribution. To me, it looks as if your residuals have more mass mass in the centre of the distribution than what would be expected in a gamma, which suggests a kind of ucnderdispersion (see also point 3).
First DHARMa plot, right: as you seem to have only two factor levels, you shouldn't force plot(sim_res,asFactor = F), this plot will be hard to interpret. Use the default plot. Probably, there is not much to be seen here though.
Regarding the dispersion: while it is true that the gamma has a dispersion parameter, the value of this parameter is optimised using the MLE. If the distribution you fit is not a gamma, the MLE for the dispersion parameter does not necessarily coincide with a "correct" dispersion value according to the test statistics that are used in the dispersion tests in DHARMa. Thus, it is not surprising to see the values that you find here, and the origin is likely that your residual distribution is not gamma.
Regarding res ~ pred plot that you did by hand: here, in pred, you include the REs. Note that this can lead to misleading residual patterns, see https://github.com/florianhartig/DHARMa/issues/43

Overall, my assessment is that your residual distribution is not really following a gamma. The rest of the patterns is a consequence of this problems. As a remedy, you could try transformations, move to quantile regressions, or just ignore it. The deviation is notable, but given that the dispersion is roughly OK and you have tons of data, it probably won't make a big difference on the inference.

Related Solutions

Solved – Numerical stability of IWLS for Gamma models with log-link

I wouldn't say I'm expert on this, so I may well miss something, but here are some relevant points:

With optimization, whether one is doing nonlinear least squares, GLMs or something else, a good set of starting values can be important. With gamma+log-link, a robust linear model fitted to the logs may be sufficient in many cases. R's glm allows you to specify starting values.
While Fisher scoring (which is effectively IWLS on a transformed problem) is generally quite good, it doesn't converge nicely in all circumstances. There are a huge number of other optimization algorithms available, and glm will let you supply it either with a function or with a string naming a function to use to do the optimization.

The topic of optimization is incredibly broad (in the 'takes a book to answer' sense), it's not something that can be tackled in a few lines.

Solved – GLMM hurdle model for continuous data -Truncated negative binomial family in glmmTMB

I'm not sure why you say that glmmTMB can't handle zero-inflated Gamma responses: the glmmTMB news file says (for version 1.0.0, release 2020-02-03):

new ziGamma family (minor modification of stats::Gamma) allows zero-inflation (i.e., Gamma-hurdle models)

I'd say it's not crazy to use a truncated negative binomial, but I'd be worried as it doesn't make statistical sense (technically, the likelihood of any non-integer value is 0 ...) (If you really had count data, a zero-inflated NB rather than a hurdle would be a reasonable option ...)

Given the distribution/density functions for each distribution, parameterized in terms of the mean $\mu$ and a shape/dispersion parameter ($k$ for NB, $a$ for Gamma):

$$ \textrm{NB}: \qquad \frac{(k/(k+\mu))^k}{\Gamma(k)} \cdot \frac{\Gamma(k+x)}{x!} \cdot (\mu/(k+\mu))^x $$

$$ \textrm{Gamma}: \qquad \frac{1}{(\mu/a)^a \Gamma(a)} \cdot x^{a-1} \cdot e^{-(x/(\mu/a))} $$

I think you may? be able to show that NB converges approximately to Gamma for large $x$ (but someone better/more dedicated than I am will need to do the math ...)

An empirical demonstration (not "proof"!)

hist(rnbinom(100000,mu=100,size=2),freq=FALSE,ylim=c(0,0.008), breaks=100)
curve(dgamma(x,scale=100/2,shape=2),add=TRUE,col=2,lwd=2)

This definitely doesn't work for small mean (try it with mean=4 rather than 100 ...)

Best Answer

Related Solutions

Solved – Numerical stability of IWLS for Gamma models with log-link

Solved – GLMM hurdle model for continuous data -Truncated negative binomial family in glmmTMB

Related Question