Generalized Linear Model – Analyzing Residuals from GLM Model with Log Link Function

generalized linear modelrresiduals

In the following example:

x <- rnorm(100)
y <- rpois(100, exp(1+x))+10
g1 <- glm(y ~ x, family = Gamma(link = "log"))

par(mfrow = c(2,2))
plot(y~x)
plot(y,resid(g1))
plot(y,resid(g1$lm))

I am trying to look at the pattern in the predictor variable, i.e. if the model is 'correct' there should be no patterns left in the residuals when plotted against the predictor. From the example shown, which residual am I suppose to use, resid(g1) or resid(g1$lm), or something else?

Same goes when using a gam, is that resid(g1$lm), resid(g1$gam)?
I find this very confusing.

Best Answer

If you want to check the functional form, you would want to plot the partial residuals vs the predictor

resids = residuals(gl, type="partial")
plot(x, resids)

If the functional form of the predictor is correct, then the relationship between x and the partial residuals would appear linear. The plot does not look linear in your case. When fitting a glm with a log link for poisson data, one would assume that $log( E(y) ) = \beta_0 + \beta_x$ so that $E(y) = e^{\beta_0 + \beta_x}$. In your simulation, you have an additional term added to the poisson model which alters the mean variance relationship of the poisson model which assumes that $E(Y)=Var(Y)$. In your case, $E(Y) = Var(Y) + 10$. To check the mean variance relationship, you could plot the pearson residuals versus the fitted values. These residuals should have constant variance and mean 1 if the mean variance relationship is correctly specified.

plot( fitted(g1), residuals(g1, type="pearson"))