How to interpret parameters of GLM output with Gamma log link

gamma distribution, generalized linear model

I am having a tough time interpreting the output of my GLM with a Gamma family and log link function. My dependent variable is "Total out-of-pocket cost" and my independent variables are "Private health insurance (yes/no)", "year of diagnosis", and the interaction between private health insurance and year. I am trying to find the trend in out-of-pocket costs over a period of 4 years, depending on insurance status. Also, please tell me what positive and negative coefficients mean.

My code and output are shown below.

Thank you very much for the help.

```
Call:
glm(formula = total_oop ~ private_insur2 + year + private_insur2 * 
    year, family = Gamma(link = "log"), data = dfq5.1)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.2932  -1.2051  -0.5681   0.2311   4.8237  

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)  
(Intercept)            -278.75702  128.19627  -2.174   0.0298 *
private_insur2Yes       166.72653  150.45167   1.108   0.2680  
year                      0.14184    0.06370   2.227   0.0261 *
private_insur2Yes:year   -0.08207    0.07475  -1.098   0.2725  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 1.911381)

    Null deviance: 2631.2  on 1399  degrees of freedom
Residual deviance: 2098.6  on 1396  degrees of freedom
AIC: 24676

Number of Fisher Scoring iterations: 7
```

Best Answer

The formula for the predicted mean value in your regression is

$$ \textrm{total_oop} = \exp \left(\beta_0 + \beta_1 \cdot \textrm{PI} + \beta_2 \cdot\textrm{year} + \beta_3 \cdot \textrm{PI} \cdot\textrm{year} \right) $$

where PI is a dummy variable equal to 0 if someone doesn't have private insurance and 1 if they do.
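
Substituting PI = 0 and PI = 1 into the formula above gives one curve per group, which may make the role of each coefficient easier to see:

$$ \begin{aligned} \textrm{PI} = 0: &\quad \exp \left(\beta_0 + \beta_2 \cdot \textrm{year} \right) \\ \textrm{PI} = 1: &\quad \exp \left((\beta_0 + \beta_1) + (\beta_2 + \beta_3) \cdot \textrm{year} \right) \end{aligned} $$

So $\beta_0$ and $\beta_1$ set the level of each curve at year 0, while $\beta_2$ and $\beta_3$ set how fast each curve rises.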

  • The intercept ($\beta_0$) is the log of the expected OOP cost for someone without private insurance in year 0 (!!). The very negative value (-278) is pretty much nonsense: it means that if you extrapolated back to year 0, you'd expect a non-privately-insured person to be paying about $\exp(-278) \approx 10^{-121}$ dollars (or whatever your unit of cost is).
  • The private insurance differential ($\beta_1$) is a huge positive number, but it also applies in year 0, so it's also somewhat nonsensical. The value of 166 means you'd expect someone with private insurance to be paying about $\exp(166) \approx 10^{72}$ times as much out of pocket as someone without, in year 0. (Put another way, the expected cost for someone privately insured in year 0 is about $\exp(-278+166) \approx 10^{-49}$ dollars.)
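
The year-0 magnitudes quoted above can be checked by converting the linear predictor from natural-log units to powers of ten. A small sketch in R, using the estimates from the output in the question (the variable names are only illustrative):

```r
# Estimates copied from the glm() output in the question
b0 <- -278.75702   # (Intercept)
b1 <-  166.72653   # private_insur2Yes

# exp(x) = 10^(x / log(10)), so dividing by log(10) gives the power of ten
pow10_uninsured <- b0 / log(10)          # about -121.06
pow10_ratio     <- b1 / log(10)          # about   72.41
pow10_insured   <- (b0 + b1) / log(10)   # about  -48.65

round(c(uninsured = pow10_uninsured,
        ratio     = pow10_ratio,
        insured   = pow10_insured), 2)
```

These are extrapolations roughly two thousand years outside the data, which is why they say nothing useful about actual costs.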

These coefficients will be much easier to interpret if you center your year variable by subtracting its minimum value or its mean (e.g. let your year variable run from 0 to 9 instead of 2010 to 2019).
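
A minimal sketch of that re-centering in R. The data here are simulated, since the original `dfq5.1` is not available: the data frame `dat`, the years 2010 to 2013, the column `year_c`, and the "true" coefficients are all assumptions made for illustration, and only the variable names mirror the question.

```r
set.seed(1)
n <- 1400

# Simulated stand-in data (made-up values, not the asker's data)
dat <- data.frame(
  private_insur2 = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  year           = sample(2010:2013, n, replace = TRUE)
)
insured <- as.numeric(dat$private_insur2 == "Yes")
mu <- exp(6 + 0.3 * insured + 0.14 * (dat$year - 2010) -
          0.08 * insured * (dat$year - 2010))
dat$total_oop <- rgamma(n, shape = 0.5, rate = 0.5 / mu)  # mean = mu

# Fit on the raw calendar year: intercept refers to year 0
fit_raw <- glm(total_oop ~ private_insur2 * year,
               family = Gamma(link = "log"), data = dat)

# Shift year so that 0 is the first year in the data
dat$year_c <- dat$year - min(dat$year)
fit_ctr <- glm(total_oop ~ private_insur2 * year_c,
               family = Gamma(link = "log"), data = dat)

round(coef(fit_raw), 3)
round(coef(fit_ctr), 3)
```

Only the intercept and the insurance main effect should change between the two fits. The year slope, the interaction, and the fitted values stay the same, because shifting `year` just re-parameterises the same model. After the shift, `exp()` of the intercept is the expected cost for a non-privately-insured person in the first observed year.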

The other two parameters are a little easier since they don't depend on the zero-point of the year variable.

  • $\beta_2$ is the change in the log of expected costs per year for a non-privately-insured person: 0.142 means a multiplicative increase of $\exp(0.142) \approx 1.153$ per year, i.e. about 15% (small values of $\beta$ can be read approximately as proportional differences).
  • $\beta_3$ is the difference in slope (on the log scale) between privately insured and non-privately-insured people. It is negative, so privately insured people's OOP costs increase more slowly than those of non-privately-insured people: they increase at a multiplicative rate of $\exp(0.142-0.082) \approx 1.06$ per year.
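
The per-year growth rates follow directly from exponentiating the slope estimates. A short sketch in R, using the values from the output in the question (the variable names are only illustrative):

```r
# Estimates copied from the glm() output in the question
b2 <-  0.14184   # year
b3 <- -0.08207   # private_insur2Yes:year

rate_uninsured <- exp(b2)        # about 1.152: roughly +15% per year
rate_insured   <- exp(b2 + b3)   # about 1.062: roughly +6% per year
slope_ratio    <- exp(b3)        # about 0.921: ratio of the two growth rates

round(c(uninsured = rate_uninsured,
        insured   = rate_insured,
        ratio     = slope_ratio), 3)
```

In general, with a log link a positive coefficient multiplies the expected cost by a factor greater than 1, and a negative coefficient multiplies it by a factor less than 1.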