GLM diagnostics and deviance residuals

deviance, generalized linear model, regression, residuals

From my understanding, the deviance residuals of a GLM, when plotted against the fitted values, should scatter around 0 with constant variance. Does this hold for any GLM family with any link function? I am mostly interested in a Gamma GLM with the identity link function right now.

Best Answer

Deviance residuals will not in general have 0 mean; they don't for Gamma models.

However the mean deviance residual tends to be reasonably close to 0.
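To see where the offset comes from, it can help to write the residual out. What follows is only a sketch: it assumes the usual unscaled deviance residual for a gamma response and evaluates it at the true mean $\mu_i$ rather than at the fitted value, so estimation error is ignored. The shorthand $\varepsilon_i = (y_i-\mu_i)/\mu_i$ is introduced here for convenience.

$$d_i = \operatorname{sign}(y_i-\mu_i)\sqrt{2\left[\frac{y_i-\mu_i}{\mu_i}-\log\frac{y_i}{\mu_i}\right]} = \operatorname{sign}(\varepsilon_i)\sqrt{2\left[\varepsilon_i-\log(1+\varepsilon_i)\right]}$$

Expanding the logarithm for small $\varepsilon_i$:

$$2\left[\varepsilon_i-\log(1+\varepsilon_i)\right] = \varepsilon_i^2\left(1-\tfrac{2}{3}\varepsilon_i+O(\varepsilon_i^2)\right) \quad\Rightarrow\quad d_i \approx \varepsilon_i-\tfrac{1}{3}\varepsilon_i^2$$

$$E[d_i] \approx -\tfrac{1}{3}E[\varepsilon_i^2] = -\tfrac{1}{3}\,\frac{\operatorname{Var}(y_i)}{\mu_i^2}$$

Since $E[\varepsilon_i]=0$, the linear term averages out but the quadratic term does not, so to leading order the mean is pulled below 0 by about a third of the squared coefficient of variation of the response.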

Here's an example of a residual plot from a simple identity link gamma fit (to simulated data for which the model was appropriate; in this case the shape parameter of the gamma was 3):

[Figure: deviance residuals vs fitted values (left); boxplots of the deviance residuals by binned fitted values (right)]

The plot on the left is a typical deviance residuals vs fitted type plot. The one on the right splits the fitted values into bins so we can use boxplots to help judge whether the spread is near constant; the 0 line is marked in red.

As you can see from the boxplots, judging by the IQR, the spread is pretty much constant (with some random variation at the right, where there are few values), but the medians are consistently below 0. We can also see that (in this case) the deviance residuals appear to be close to symmetric.

The mean deviance residual for this model is -0.1126 (marked in blue), which is very close to where the marked medians are sitting. With such a big sample, this mean is many standard errors from 0, but it is still "near" 0 in the sense that the standard deviation of the residuals is more than 5 times as large as 0.1126.

Based on simulations, it looks like (as long as $n$ is large and the shape parameter is not too small) the average deviance residual for a Gamma will be about $-\frac{1}{3\alpha}$, where $\alpha$ is the common shape parameter of the gamma-distributed response. The approximation holds fairly well from about $\alpha=2$ upward, but for $\alpha$ much below that it tends to overestimate.
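A rough way to check this kind of claim is a small simulation. The sketch below is not the code behind the figures above. It makes a simplifying assumption: the deviance residuals are computed at the true mean `mu` instead of at fitted values from a GLM, which should matter little when $n$ is large. The function names `gamma_deviance_residual` and `simulate_mean_residual`, the mean of 10, and the seed are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def gamma_deviance_residual(y, mu):
    """Unscaled deviance residual for a gamma response y with mean mu."""
    unit_deviance = 2.0 * ((y - mu) / mu - math.log(y / mu))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.copysign(math.sqrt(max(unit_deviance, 0.0)), y - mu)


def simulate_mean_residual(shape, mu=10.0, n=100_000, seed=1):
    """Return (mean, sd) of deviance residuals for simulated gamma data.

    Assumption: residuals are taken around the true mean mu, not around
    fitted values from an actual GLM fit.
    """
    rng = random.Random(seed)
    scale = mu / shape  # gammavariate(alpha, beta) has mean alpha * beta
    residuals = [
        gamma_deviance_residual(rng.gammavariate(shape, scale), mu)
        for _ in range(n)
    ]
    return statistics.fmean(residuals), statistics.stdev(residuals)


if __name__ == "__main__":
    # Compare the simulated mean with -1/(3*shape) for a few shape values.
    for shape in (2.0, 3.0, 5.0, 10.0):
        mean_resid, sd_resid = simulate_mean_residual(shape)
        print(f"shape={shape:5.1f}  mean={mean_resid:+.4f}  "
              f"-1/(3*shape)={-1 / (3 * shape):+.4f}  sd={sd_resid:.4f}")
```

Running it prints the simulated mean next to $-\frac{1}{3\alpha}$ for each shape, along with the residual standard deviation, so the size of the offset can be compared with the spread.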

In summary: across the fitted values, the deviance residuals should have a close to constant mean and close to constant variance, but that mean should be "near" 0 rather than exactly 0.