How to interpret parameters in GLM with family=Gamma

Tags: gamma-distribution, generalized-linear-model, interpretation, r

I have a question regarding parameter interpretation for a GLM with a gamma distributed dependent variable. This is what R returns for my GLM with a log-link:

Call:
glm(formula = income ~ height + age + educat + married + sex + language + highschool, 
    family = Gamma(link = log), data = fakesoep)

Deviance Residuals: 
       Min        1Q    Median        3Q       Max  
  -1.47399  -0.31490  -0.05961   0.18374   1.94176  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  6.2202325  0.2182771  28.497  < 2e-16 ***
height       0.0082530  0.0011930   6.918 5.58e-12 ***
age          0.0001786  0.0009345   0.191    0.848    
educat       0.0119425  0.0009816  12.166  < 2e-16 ***
married     -0.0178813  0.0173453  -1.031    0.303    
sex         -0.3179608  0.0216168 -14.709  < 2e-16 ***
language     0.0050755  0.0279452   0.182    0.856    
highschool   0.3466434  0.0167621  20.680  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.1747557)

Null deviance: 757.46  on 2999  degrees of freedom
Residual deviance: 502.50  on 2992  degrees of freedom
AIC: 49184

How do I interpret the parameters? If I calculate exp(coef()) of my model, I get roughly 500 for the intercept. I don't believe that is the expected income with all other variables held constant, is it? After all, the average income, mean(income), is around 2000. I furthermore have no clue how to interpret the direction and magnitude of the covariates' coefficients.

Best Answer

A gamma GLM with a log link has the same conditional mean function as exponential regression:

$$E[y \vert x,z] = \exp \left( \alpha + \beta \cdot x +\gamma \cdot z \right)=\hat y$$

This means that $E[y \vert x=0,z=0]=\exp(\alpha)$. That's not a very meaningful value on its own (unless you centered your variables to be mean zero beforehand).
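To make that concrete, here is a minimal sketch in R, reusing the formula and data name from the call above and storing the model as fit (a name introduced here; the dummies are assumed to be coded 0/1, as the coefficient names suggest). The predict() call just reproduces exp(intercept) as the fitted mean income at all covariates equal to zero, which is why it need not resemble the sample mean of income:

fit <- glm(income ~ height + age + educat + married + sex + language + highschool,
           family = Gamma(link = log), data = fakesoep)

# exp(intercept): predicted mean income for someone with every covariate at 0
exp(coef(fit)["(Intercept)"])

# the same number, obtained as a prediction at all-zero covariates
predict(fit,
        newdata = data.frame(height = 0, age = 0, educat = 0, married = 0,
                             sex = 0, language = 0, highschool = 0),
        type = "response")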

There are at least three ways to interpret your model. One is to take the derivative of the expected value of $y$ given $x$ with respect to $x$:

$$\frac{\partial E[y \vert x,z]}{\partial x} = \exp \left( \alpha + \beta \cdot x +\gamma \cdot z\right)\cdot \beta=\hat y \cdot \beta$$

This quantity depends on $x$ and $z$, so you can either evaluate it at the mean, median, mode, or other representative values of $x$ and $z$, or average $\hat y \cdot \beta$ over your sample (the marginal effect at representative values and the average marginal effect, respectively). Both are called marginal effects. These derivatives only make sense for continuous variables (like height) and give you the additive effect of a small change in $x$ on $y$.
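As a sketch of both versions (reusing the model fit from above and assuming all covariates in fakesoep are numeric):

# average marginal effect of height: sample mean of yhat * beta
beta_height <- coef(fit)["height"]
mean(fitted(fit)) * beta_height

# marginal effect evaluated at the sample means of the covariates
xbar <- data.frame(lapply(fakesoep[, c("height", "age", "educat", "married",
                                       "sex", "language", "highschool")], mean))
predict(fit, newdata = xbar, type = "response") * beta_height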

If $x$ was binary (like sex), you might consider calculating finite differences instead: $$E[y \vert z,x=1]-E[y \vert z,x=0]=\exp \left( \alpha + \beta +\gamma \cdot z\right) - \exp \left( \alpha +\gamma \cdot z\right)= \exp \left( \alpha +\gamma \cdot z\right) \cdot\left( \exp(\beta)-1 \right)$$

This makes more sense, since it's hard to imagine an infinitesimal change in sex. Of course, you can also do this with a continuous variable. These are additive effects of a one-unit change in $x$, rather than an infinitesimal one.
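A sketch of this finite-difference calculation for sex, again with the model fit from above and sex assumed to be coded 0/1:

# predicted income for each person with sex set to 1 versus set to 0,
# holding their other covariates at the observed values
d1 <- transform(fakesoep, sex = 1)
d0 <- transform(fakesoep, sex = 0)
mean(predict(fit, newdata = d1, type = "response") -
     predict(fit, newdata = d0, type = "response"))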

The third method is to exponentiate the coefficients. Note that:

$$\begin{aligned} E[y \vert z,x+1] &= \exp \left( \alpha + \beta \cdot (x+1) +\gamma \cdot z \right) \\ &=\exp \left( \alpha + \beta \cdot x+\beta +\gamma \cdot z \right)\\ &=\exp \left( \alpha + \beta \cdot x +\gamma \cdot z \right)\cdot \exp(\beta) \\ &= E[y \vert z,x]\cdot \exp(\beta) \end{aligned}$$

This means that you can interpret the exponentiated coefficients multiplicatively rather than additively. They give you the multiplier on the expected value when $x$ changes by 1.
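Applied to the output above (again with the model stored as fit), the multipliers are:

exp(coef(fit))

# exp(0.3466434)  =  1.414: holding the other covariates fixed, highschool = 1
#                    is associated with a ~41% higher expected income
# exp(-0.3179608) =  0.728: sex = 1 is associated with a ~27% lower expected income
# exp(0.0082530)  =  1.008: each additional unit of height multiplies the
#                    expected income by about 1.008 (a ~0.8% increase)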