I am having a tough time interpreting the output of my GLM with a Gamma family and log link function. My dependent variable is "Total out-of-pocket cost", and my independent variables are "Private health insurance (yes/no)", "year of diagnosis", and the interaction between private health insurance and year. I am trying to find the trend in out-of-pocket costs over a period of 4 years depending on insurance status. Also, please tell me what positive and negative coefficients mean.
My code and the output are shown below.
Thank you very much for the help.
```
Call:
glm(formula = total_oop ~ private_insur2 + year + private_insur2 *
    year, family = Gamma(link = "log"), data = dfq5.1)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.2932  -1.2051  -0.5681   0.2311   4.8237

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)            -278.75702  128.19627  -2.174   0.0298 *
private_insur2Yes       166.72653  150.45167   1.108   0.2680
year                      0.14184    0.06370   2.227   0.0261 *
private_insur2Yes:year   -0.08207    0.07475  -1.098   0.2725
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 1.911381)

    Null deviance: 2631.2  on 1399  degrees of freedom
Residual deviance: 2098.6  on 1396  degrees of freedom
AIC: 24676

Number of Fisher Scoring iterations: 7
```
Best Answer
The formula for the predicted mean value in your regression is
$$ \textrm{total_oop} = \exp \left(\beta_0 + \beta_1 \cdot \textrm{PI} + \beta_2 \cdot\textrm{year} + \beta_3 \cdot \textrm{PI} \cdot\textrm{year} \right) $$
where PI is a dummy variable equal to 0 if someone doesn't have private insurance and 1 if they do.
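Splitting this formula by insurance status may make the role of each coefficient clearer. This is only the same equation with PI set to 0 and then to 1, not an additional model:

$$ \textrm{PI} = 0: \quad \textrm{total_oop} = \exp \left(\beta_0 + \beta_2 \cdot \textrm{year} \right) $$
$$ \textrm{PI} = 1: \quad \textrm{total_oop} = \exp \left( (\beta_0 + \beta_1) + (\beta_2 + \beta_3) \cdot \textrm{year} \right) $$

Because of the log link, each additional year multiplies the predicted mean by $\exp(\beta_2)$ for people without private insurance and by $\exp(\beta_2 + \beta_3)$ for people with it. A positive coefficient therefore corresponds to a multiplier above 1 (the mean goes up), and a negative coefficient to a multiplier below 1 (the mean goes down).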
These coefficients will be much easier to interpret if you center your year variable, by subtracting the minimum value or the mean (e.g. let your year variable run from 0 to 9 instead of 2010 to 2019).
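As a rough sketch of what shifting the year variable does, the example below fits the same model twice on simulated data. The data frame `sim`, the column `year0`, the 2010 to 2013 range, and the coefficient values used to generate the data are all assumptions made for illustration; they are not taken from `dfq5.1`.

```r
# Simulated stand-in for dfq5.1 (hypothetical values, for illustration only).
set.seed(42)
n <- 1400
sim <- data.frame(
  private_insur2 = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  year = sample(2010:2013, n, replace = TRUE)  # assumed 4-year range
)
pi_yes <- as.numeric(sim$private_insur2 == "Yes")
mu <- exp(6 + 0.5 * pi_yes + 0.14 * (sim$year - 2010) -
            0.08 * pi_yes * (sim$year - 2010))
sim$total_oop <- rgamma(n, shape = 1, rate = 1 / mu)  # Gamma draws with mean mu

# Original parameterisation: year on its raw calendar scale.
fit_raw <- glm(total_oop ~ private_insur2 * year,
               family = Gamma(link = "log"), data = sim)

# Shifted parameterisation: year counted from the first year in the data.
sim$year0 <- sim$year - min(sim$year)
fit_shift <- glm(total_oop ~ private_insur2 * year0,
                 family = Gamma(link = "log"), data = sim)

# Side-by-side coefficients (row names come from the raw fit).
round(cbind(raw = coef(fit_raw), shifted = coef(fit_shift)), 4)
```

In a comparison like this, the intercept and the `private_insur2Yes` coefficient should change, since they now describe the first year rather than year 0, while the `year` and interaction coefficients should stay the same up to numerical tolerance.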
The other two parameters, the year coefficient and the interaction coefficient, are a little easier since they don't depend on the zero-point of the year variable.
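To put numbers on this, the year and interaction estimates from the output in the question can be exponentiated directly. This is a minimal sketch that uses only the printed point estimates; the variable names are made up for the example, and it ignores the uncertainty in the estimates.

```r
# Estimates copied from the summary output in the question.
b_year  <- 0.14184   # year slope for the reference group (private_insur2 = No)
b_inter <- -0.08207  # private_insur2Yes:year interaction

mult_no  <- exp(b_year)            # yearly multiplier, no private insurance
mult_yes <- exp(b_year + b_inter)  # yearly multiplier, private insurance
ratio    <- exp(b_inter)           # mult_yes / mult_no

round(c(no_insurance = mult_no, insurance = mult_yes, ratio = ratio), 3)
# no_insurance 1.152, insurance 1.062, ratio 0.921
```

Read this way, the estimated mean out-of-pocket cost rises by about 15% per year without private insurance and by about 6% per year with it. The interaction has a p-value of 0.2725 in the output, so the difference between the two trends is not statistically significant at the usual 0.05 level.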