Solved – Interpretation of Interaction coefficients in GLM using categorical predictors (in R)

Tags: categorical data, generalized linear model, interaction

I ran a Gamma GLM using 3 categorical predictors:

  1. Year – with 4 classes
  2. Organ – with 3 classes
  3. Site – with 3 classes

My response variable is Biomass.

My model is:

GLM <- glm(biom ~ year + organ + site + year:organ + year:site + organ:site,
       data = data, family = Gamma(link = "log"))

The summary(GLM) gives me this (3 coefficients are not defined because of singularities):

                  Estimate   Std.Error t value Pr(>|t|)  
(Intercept)        3.34408    0.39101   8.552 7.89e-14 ***
year2             -0.55195    0.29480  -1.872  0.06382 .  
year3              0.65445    0.29480   2.220  0.02847 *  
year4             -0.20425    0.29616  -0.690  0.49186    
organ2             1.62266    0.39846   4.072 8.80e-05 ***
organ3             2.64728    0.33840   7.823 3.40e-12 ***
site2              1.01485    0.53400   1.900  0.05999 . 
site3              0.41056    0.52632   0.780  0.43703    
year2:organ2       0.03728    0.29480   0.126  0.89959    
year3:organ2      -0.03519    0.29480  -0.119  0.90520    
year4:organ2      -2.03455    0.30021  -6.777 6.34e-10 ***
year2:organ3          NA         NA      NA       NA    
year3:organ3          NA         NA      NA       NA    
year4:organ3          NA         NA      NA       NA    
year2:site2        0.78444    0.36105   2.173  0.03195 *  
year3:site2       -0.01524    0.36105  -0.042  0.96641    
year4:site2        0.28738    0.37216   0.772  0.44166    
year2:site3        1.04849    0.36105   2.904  0.00445 ** 
year3:site3        0.08768    0.36105   0.243  0.80858    
year4:site3        0.71053    0.36105   1.968  0.05159 .  
organ2:site2      -1.41692    0.48655  -2.912  0.00435 ** 
organ3:site2      -1.59445    0.48655  -3.277  0.00140 ** 
organ2:site3      -0.86975    0.47763  -1.821  0.07133 .  
organ3:site3      -1.30913    0.47763  -2.741  0.00715 ** 

  1. The first coefficient (3.34408) is the intercept, so it stands for the biomass for year 1, site 1 and organ 1.
  2. The second (-0.55195) is the difference between the mean biomass of year 2 and year 1.
  3. The third (0.65445) is the difference between the mean biomass of year 3 and year 1.
  4. The fourth (-0.20425) is the difference between the mean biomass of year 4 and year 1.
  5. The fifth (1.62266) is the difference between the mean biomass of organ 2 and organ 1.

…and so on until the 8th coefficient.
After the main-effect coefficients, the interactions start.

  1. How should they be interpreted? For example, year2:organ2 is the difference between what?

  2. In addition, year 3 significantly differs from year 1 while year 4 doesn't. What happens between year 3 and year 4? Do they significantly differ?

Best Answer

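Your interpretation needs two adjustments. First, the model uses a log link, so every coefficient is a difference in log mean biomass (equivalently, a log ratio of means), not a difference in mean biomass itself. Second, because the model contains interactions, each main-effect coefficient applies only at the reference levels of the other predictors:

  • The second one (-0.55195) is the difference in log mean biomass between year 2 and year 1 for observations at the reference levels of the other predictors, namely organ 1 and site 1.

  • The third one (0.65445) is the difference in log mean biomass between year 3 and year 1, again for organ 1 and site 1.

  • etc.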
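
Since the model in the question was fitted with `Gamma(link = "log")`, exponentiating a coefficient turns it into a ratio of mean biomass. Here is a minimal sketch using the year estimates copied from the summary table; the names `year_coefs` and `ratios` are only illustrative.

```r
# Year main-effect estimates copied from the summary table in the question
# (log scale; the reference cell is year 1, organ 1, site 1).
year_coefs <- c(year2 = -0.55195, year3 = 0.65445, year4 = -0.20425)

# exp() turns a difference in log means into a ratio of means:
# mean biomass in year k divided by mean biomass in year 1,
# for organ 1 at site 1 only.
ratios <- exp(year_coefs)
print(round(ratios, 3))
```

A ratio below 1 means lower mean biomass than in year 1, and a ratio above 1 means higher, always for organ 1 at site 1.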

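An interaction coefficient is the amount by which the effect of one predictor changes when another predictor moves away from its reference level. So the ninth one (year2:organ2, 0.03728) is how much the year 2 versus year 1 difference in organ 2 departs from the same difference in organ 1. Comparing observations with year 2 and organ 2 against observations with year 1 and organ 1 (at site 1), the difference in log mean biomass is the sum of the year2 and organ2 main effects plus this interaction term.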
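
To see how an interaction term enters the linear predictor, the sketch below rebuilds the log mean biomass of four cells at site 1 from the estimates in the summary table. The variable names are hypothetical; only the numbers come from the question.

```r
# Estimates copied from the summary table in the question (log scale).
b <- c(intercept = 3.34408, year2 = -0.55195,
       organ2 = 1.62266, year2_organ2 = 0.03728)

# Log mean biomass for four cells, all at site 1 (the reference site).
log_mean <- c(
  y1_o1 = b[["intercept"]],
  y2_o1 = b[["intercept"]] + b[["year2"]],
  y1_o2 = b[["intercept"]] + b[["organ2"]],
  y2_o2 = b[["intercept"]] + b[["year2"]] + b[["organ2"]] + b[["year2_organ2"]]
)

# Year 2 versus year 1 effect within each organ (log scale).
effect_in_organ1 <- log_mean[["y2_o1"]] - log_mean[["y1_o1"]]
effect_in_organ2 <- log_mean[["y2_o2"]] - log_mean[["y1_o2"]]

# The gap between the two effects is exactly the interaction coefficient.
interaction_term <- effect_in_organ2 - effect_in_organ1
print(c(effect_in_organ1 = effect_in_organ1,
        effect_in_organ2 = effect_in_organ2,
        interaction_term = interaction_term))
```

The year effect in organ 1 is the year2 main effect alone, while the year effect in organ 2 is the main effect plus year2:organ2.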

As for your second question: this table cannot answer it, because every coefficient is a contrast with the first category of each factor, and none compares year 3 with year 4 directly. You would need a pairwise comparison of the year levels to answer that.
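
One possible way to get the year 3 versus year 4 comparison with base R is sketched below. The original data are not available, so the data frame `d` is simulated and every number in it is an assumption; only the model formula follows the question. The contrast is built from `coef()` and `vcov()`, then checked by refitting with year 3 as the reference level.

```r
set.seed(1)

# Hypothetical stand-in for the original data: a balanced design with
# 4 years x 3 organs x 3 sites and 5 replicates per cell.
d <- expand.grid(year = factor(1:4), organ = factor(1:3),
                 site = factor(1:3), rep = 1:5)

# Simulated Gamma-distributed biomass; the "true" effects are arbitrary.
mu <- exp(3 + 0.3 * as.integer(d$year) + 0.5 * as.integer(d$organ))
d$biom <- rgamma(nrow(d), shape = 5, rate = 5 / mu)

fit <- glm(biom ~ year + organ + site + year:organ + year:site + organ:site,
           data = d, family = Gamma(link = "log"))

# Contrast year3 - year4 on the log scale, for organ 1 at site 1.
b <- coef(fit)
V <- vcov(fit)
est <- b[["year3"]] - b[["year4"]]
se <- sqrt(V["year3", "year3"] + V["year4", "year4"] -
           2 * V["year3", "year4"])
t_value <- est / se
p_value <- 2 * pt(abs(t_value), df = df.residual(fit), lower.tail = FALSE)
print(c(estimate = est, se = se, t = t_value, p = p_value))

# Cross-check: refit with year 3 as the reference level. The coefficient
# named "year4" is then the year 4 minus year 3 difference directly.
d$year <- relevel(d$year, ref = "3")
fit3 <- update(fit, data = d)
print(coef(summary(fit3))["year4", ])
```

Because the model includes year:organ and year:site, this contrast only compares the two years for organ 1 at site 1. Comparisons for other organs or sites, or adjusted for multiple testing, need further contrasts; packages such as emmeans or multcomp are commonly used for that.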