Solved – Deviance in generalized linear models for different families

Tags: deviance, generalized linear model, regression

I'm a little confused by the deviance value returned by deviance(glm.model). I get quite different values for the same data fitted to different GLMs: family=Gamma(link=identity) (models 1 and 2) versus family=gaussian(link=identity) (models 3 and 4).

> print(r)
         Deviance      AIC
Model1   44.96093 2530.558
Model2   45.13543 2528.683
Model3 3028.56880 2487.124
Model4 3299.93121 2739.563

I read that there are several types of deviance calculated within glm(), which I tried to inspect using:

> deviance(mod1)
[1] 44.96093
> deviance(mod1,type="resp")
[1] 44.96093
> deviance(mod1,type="dev")
[1] 44.96093

which all return the same value, indicating to me that type= is not the correct way to select them.

Could anyone tell me the right way to get comparable deviance values out of glm() for different families?

Thanks
Chris

Best Answer

Here is the situation as I understand it: you can compare the deviance-based goodness-of-fit test across different GLMs if the dispersion parameter $\phi$ is known with certainty for the models you are comparing. By $\phi$ I mean the dispersion parameter of the exponential dispersion family.

So for the Poisson (count regression) and binomial (logistic regression) families we know that $\phi=1$, and we can thus legitimately compare fits on exactly the same data but with a different link function (say, a logit versus a probit in a logistic regression). But we cannot do the same for families where $\phi$ must be estimated, such as the Gamma, inverse Gaussian, Gaussian, etc.
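For instance, here is a minimal sketch with simulated data (not from the original post): two binomial fits to the same response under different links have directly comparable deviances, because $\phi = 1$ is known rather than estimated.

> set.seed(1)                       # purely illustrative data
> x <- rnorm(200)
> y <- rbinom(200, 1, plogis(0.5 + x))
> fit_logit  <- glm(y ~ x, family = binomial(link = "logit"))
> fit_probit <- glm(y ~ x, family = binomial(link = "probit"))
> deviance(fit_logit)   # these two numbers are on the same scale,
> deviance(fit_probit)  # because phi = 1 is known for both fits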

The reason is that the deviance (which comes from a likelihood ratio between a saturated model, with one parameter per observation, and your own model) is a function of the difference in fit between the two. But to get to the scaled deviance (which is asymptotically $\chi^2$ distributed and makes models comparable) we need to divide by $\phi$, which we do not always know.
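To make that concrete, in standard GLM notation (not taken from the original post), with $\ell$ the log-likelihood:

$$D^*(y, \hat{\mu}) \;=\; \frac{D(y, \hat{\mu})}{\phi} \;=\; 2\left[\,\ell(y;\, y) - \ell(\hat{\mu};\, y)\,\right]$$

where $D$ is what deviance() reports and $D^*$ is the scaled deviance. For the Gaussian family, $D = \sum_i (y_i - \hat{\mu}_i)^2$ (the RSS) and $\phi = \sigma^2$; for the Gamma family, $\phi$ is the reciprocal of the shape parameter. The unscaled deviances in your table therefore live on completely different $\phi$ scales, which is why they differ by orders of magnitude.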

I know this sounds odd when you know about F-tests for significance in nested GLMs using deviance (whether Gaussian or not): there the dispersion is estimated from the data, and that is allowed for nested models of the same family on the same data. I have to say I cannot pin down for sure why estimating the dispersion is acceptable in that case (my best guess is that across different families you cannot really quantify the variability of your estimate $\hat{\phi}$). So: it is fine to compare nested models of the same family, no matter what the link function; a minimal sketch follows.
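Here is what that nested comparison looks like in R, assuming a data frame dat with response y and predictors x1 and x2 (hypothetical names, not from the original post):

> # Nested Gamma GLMs on the same data: the F-test is valid even though
> # phi is estimated, because both models share the same family and data
> mod_small <- glm(y ~ x1,      family = Gamma(link = identity), data = dat)
> mod_big   <- glm(y ~ x1 + x2, family = Gamma(link = identity), data = dat)
> anova(mod_small, mod_big, test = "F")  # F-test on the change in deviance,
>                                        # with dispersion estimated from the data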

Edit: with regard to your comment about just using SSE (by the way, if you are going that route I recommend RMSE instead): you can always compute the RMSE for any model and say that one seems to fit better than another in an RMSE sense (just as you could with the deviances above, ignoring the theory). The problem is that, unlike in OLS, the variance of the prediction in a GLM is usually not constant but changes with the covariates (and the confidence intervals are not necessarily symmetric). So in theory you could get a model with a much higher RMSE than another and yet have the model that is closer to the process that generated the data.

So if you do compare RMSE on the same data set, avoid sweeping statements that completely dismiss one model versus the other. And if you want to use it for model selection, you should show that RMSE is consistently indicative of a better model by using cross-validation and resampling of the data (i.e., show that a good RMSE in sample means a good RMSE out of sample); a sketch follows. By the way, the deviance is the RSS in the case of OLS.
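A rough sketch of that cross-validation check, assuming a data frame dat with response y and a single predictor x (hypothetical names, purely illustrative):

> # k-fold cross-validated RMSE, comparing two families on the same data
> k     <- 10
> folds <- sample(rep(1:k, length.out = nrow(dat)))
> cv_rmse <- function(fam) {
>   err <- sapply(1:k, function(i) {
>     fit  <- glm(y ~ x, family = fam, data = dat[folds != i, ])
>     pred <- predict(fit, newdata = dat[folds == i, ], type = "response")
>     (dat$y[folds == i] - pred)^2   # squared out-of-sample errors
>   })
>   sqrt(mean(unlist(err)))
> }
> cv_rmse(Gamma(link = identity))     # out-of-sample RMSE, Gamma fit
> cv_rmse(gaussian(link = identity))  # out-of-sample RMSE, Gaussian fit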