GLM Analysis – Calculating Deviance for Gamma Generalized Linear Model

deviance, gamma distribution, generalized linear model

I was wondering why the gamma deviance formula is given as: $$2 \sum_i \left[ -\log\left(\frac{y_i}{\mu_i}\right) + \frac{y_i-\mu_i}{\mu_i} \right]$$

Shouldn't the 2nd term become zero after the summation is conducted?

Best Answer

The general derivation of the deviance for a GLM family is given in Section 5.4 of Dunn and Smyth (2018) (the book that you mentioned in a previous post). You can insert the form of the gamma density to get the result, but the density has to be parametrized in the right way.

A common way to write the gamma density is as $$f(y;\alpha,\beta)=\frac{y^{\alpha-1}e^{-y/\beta}}{\beta^\alpha\Gamma(\alpha)}$$ with $E(y)=\alpha\beta=\mu$ and $\operatorname{var}(y)=\alpha\beta^2=V(\mu)\phi$, where $V(\mu)=\mu^2$ and $\phi=1/\alpha$. Converting to the log-scale gives $$\log f(y;\alpha,\beta)= -y/\beta-\alpha\log\beta+(\alpha-1)\log y-\log\Gamma(\alpha)$$

Reparametrizing to $\mu$ and $\phi$, that is, substituting $\alpha=1/\phi$ and $\beta=\mu\phi$, gives $$\log f(y;\mu,\phi)= t(y,\mu)/\phi+a(y,\phi)$$ with $$t(y,\mu)=-y/\mu-\log\mu$$ and $$a(y,\phi)=-(\log\phi)/\phi+(1/\phi-1)\log y -\log\Gamma(1/\phi).$$ Only $t(y,\mu)$ depends on $\mu$, so $a(y,\phi)$ plays no role in the deviance.

The unit deviance is defined as $$d(y,\mu)=2\left\{t(y,y)-t(y,\mu)\right\}$$ with in this case $$t(y,y)-t(y,\mu)=-1-\log y+ y/\mu+\log\mu=(y-\mu)/\mu-\log(y/\mu).$$
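As a quick numerical sanity check of the derivation above, here is a minimal Python sketch. The function names `gamma_logpdf` and `unit_deviance` and the test values are my own choices for illustration, not anything from the book. It compares the closed-form unit deviance with $2\phi\{\log f(y;y,\phi)-\log f(y;\mu,\phi)\}$ computed directly from the log-density.

```python
import math

def gamma_logpdf(y, mu, phi):
    """Gamma log-density with mean mu and dispersion phi.

    Uses shape alpha = 1/phi and scale beta = mu*phi, as in the text.
    """
    alpha = 1.0 / phi
    beta = mu * phi
    return (-y / beta - alpha * math.log(beta)
            + (alpha - 1.0) * math.log(y) - math.lgamma(alpha))

def unit_deviance(y, mu):
    """Closed-form gamma unit deviance d(y, mu)."""
    return 2.0 * ((y - mu) / mu - math.log(y / mu))

# Arbitrary illustrative values (assumptions, not from the source).
y, mu, phi = 3.0, 2.0, 0.5

# The a(y, phi) term cancels in the difference, leaving 2*{t(y,y) - t(y,mu)}.
scaled = 2.0 * phi * (gamma_logpdf(y, y, phi) - gamma_logpdf(y, mu, phi))

print(unit_deviance(y, mu), scaled)
```

The two printed numbers should agree up to floating-point error for any positive `y`, `mu` and `phi`, because the $a(y,\phi)$ term is the same in both log-densities.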

Finally, the total deviance is $$D=\sum_{i=1}^n w_i d(y_i,\mu_i)$$ where the $w_i$ are the prior weights. If the prior weights are all 1, then this agrees with the deviance formula in your question.
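To make the role of the prior weights concrete, here is a small Python sketch of the total deviance. The helper name `total_deviance` and the numbers are made up for illustration. With unit weights it reproduces the formula from the question.

```python
import math

def unit_deviance(y, mu):
    """Gamma unit deviance d(y, mu)."""
    return 2.0 * ((y - mu) / mu - math.log(y / mu))

def total_deviance(y, mu, w=None):
    """Total deviance D = sum_i w_i * d(y_i, mu_i); weights default to 1."""
    if w is None:
        w = [1.0] * len(y)
    return sum(wi * unit_deviance(yi, mi) for wi, yi, mi in zip(w, y, mu))

# Made-up responses and means, purely for illustration.
y = [1.5, 2.0, 4.2]
mu = [1.8, 2.1, 3.9]

# The formula as written in the question (all prior weights equal to 1).
question_formula = 2.0 * sum(-math.log(yi / mi) + (yi - mi) / mi
                             for yi, mi in zip(y, mu))

D_unit = total_deviance(y, mu)
D_weighted = total_deviance(y, mu, w=[2.0, 1.0, 0.5])

print(D_unit, question_formula, D_weighted)
```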

It is true that the $(y-\mu)/\mu$ terms often do sum to zero when evaluated at the fitted values, $\mu_i=\hat\mu_i$, but not always. The GLM maximum likelihood equations solve $$\sum_{i=1}^n w_i x_{ij} \frac{y_i-\mu_i}{g'(\mu_i) V(\mu_i)}=0$$ for each coefficient $j$, where the $x_{ij}$ are covariate values, $g'$ is the derivative of the link function and $V(\mu)$ is the variance function. For the gamma distribution, $V(\mu)=\mu^2$. If a log-link is used, then $g'(\mu) = 1/\mu$, so $g'(\mu)V(\mu)=\mu$. If, in addition, the covariates include an intercept term (a column with $x_{ij}=1$ for all $i$), then the likelihood equation for the intercept is $$\sum_{i=1}^n w_i \frac{y_i-\mu_i}{\mu_i}=0$$ In this case, the $(y-\mu)/\mu$ terms will not contribute to the total deviance. With a different link, or without an intercept, there is no such guarantee. Even when the sum is zero, the terms can't be ignored entirely because they are part of the unit deviances and will still contribute to the deviance residuals.
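The intercept argument can be checked numerically. Below is a rough Python sketch, not a production fitter: a hand-rolled Fisher scoring loop for a gamma GLM with log link and unit prior weights, run on made-up data with and without an intercept. The names `solve` and `fit_gamma_log` are hypothetical helpers written for this example. For the gamma family with log link the working weights $1/\{g'(\mu)^2V(\mu)\}$ are constant, so each step is an ordinary least squares fit to the working response.

```python
import math

def solve(A, b):
    """Solve a small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_gamma_log(X, y, n_iter=100):
    """Fisher scoring for a gamma GLM with log link; returns fitted means."""
    p = len(X[0])
    eta = [math.log(yi) for yi in y]  # start from mu = y
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response z = eta + (y - mu) * g'(mu), with g'(mu) = 1/mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        XtX = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
        Xtz = [sum(r[j] * zi for r, zi in zip(X, z)) for j in range(p)]
        beta = solve(XtX, Xtz)
        eta = [sum(b * xij for b, xij in zip(beta, r)) for r in X]
    return [math.exp(e) for e in eta]

def relative_residual_sum(y, mu):
    """Sum of the (y - mu)/mu terms that appear in the deviance."""
    return sum((yi - m) / m for yi, m in zip(y, mu))

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [5.1, 6.3, 6.9, 9.2, 10.1, 13.0]

mu_intercept = fit_gamma_log([[1.0, xi] for xi in x], y)   # log mu = b0 + b1*x
mu_no_intercept = fit_gamma_log([[xi] for xi in x], y)     # log mu = b1*x

sum_with_intercept = relative_residual_sum(y, mu_intercept)
sum_without_intercept = relative_residual_sum(y, mu_no_intercept)

print(sum_with_intercept, sum_without_intercept)
```

With the intercept, the first number should be zero up to rounding error. Without it, only $\sum_i x_i (y_i-\mu_i)/\mu_i$ is forced to zero, so the second number is generally not. In both fits the individual $(y_i-\mu_i)/\mu_i$ values are nonzero and still enter each unit deviance.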

Reference

Dunn, PK, and Smyth, GK (2018). Generalized linear models with examples in R. Springer, New York, NY. https://www.amazon.com/Generalized-Linear-Examples-Springer-Statistics/dp/1441901175