Solved – Results of Type-3 Wald Chi-Square Different for GLMM with Different Contrast Coding

categorical-encodingcontrastsgeneralized linear modellme4-nlmer

I have just completed a multilevel, longitudinal logistic regression testing, at four different time points, whether participants in an experimental group are more likely to have committed any drug-related crime in the previous 4 weeks than participants receiving placebo.

The syntax for the glmer looks like this

sumDrug <- summary(drugMod <- glmer(drugCrime ~ group*week + (1|id), data = oti, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 10))

Where drugCrime is the binary outcome variable "have committed one or more drug-related crimes in previous 4 weeks?" (Y/N response), group is a factor indicating group allocation, and week is a factor indicating time of measurement (at weeks 0, 4, 8, and 12).

If I treatment code the two factors (see here) the output of the regression (with Odds Ratios and CIs) looks like this:

           var Estimate Std..Error z.value Pr...z..   OR   lo   hi
1  (Intercept)   -1.234      0.562  -2.197    0.028 0.29 0.04 2.28
2       group2   -0.108      0.779  -0.139    0.890 0.90 0.11 7.02
3        week2   -1.450      0.632  -2.295    0.022 0.23 0.03 1.84
4        week3   -0.965      0.652  -1.480    0.139 0.38 0.05 2.98
5        week4   -0.696      0.652  -1.067    0.286 0.50 0.06 3.90
6 group2:week2   -1.366      1.011  -1.351    0.177 0.26 0.03 2.00
7 group2:week3   -0.822      1.001  -0.821    0.411 0.44 0.06 3.44
8 group2:week4   -2.580      1.136  -2.271    0.023 0.08 0.01 0.59

And the output of the Anova(drugMod) in the car package returns

Analysis of Deviance Table (Type III Wald chisquare tests)

Response: drugCrime
             Chisq Df Pr(>Chisq)  
(Intercept) 4.8248  1    0.02805 *
group       0.0193  1    0.88955  
week        5.5199  3    0.13745  
group:week  5.4146  3    0.14383 

If, on the other hand, I simple code the two factors (also see here) the output of the regression looks like this:

    var Estimate Std..Error z.value Pr...z..   OR   lo   hi 
1  (Intercept)   -2.661      0.499  -5.328    0.000 0.07 0.01
2       group1   -1.300      0.742  -1.752    0.080 0.27 0.03
3        week2   -2.133      0.548  -3.890    0.000 0.12 0.02
4        week3   -1.376      0.532  -2.584    0.010 0.25 0.03
5        week4   -1.986      0.600  -3.311    0.001 0.14 0.02
6 group1:week2   -1.366      1.011  -1.351    0.177 0.26 0.03
7 group1:week3   -0.822      1.001  -0.821    0.411 0.44 0.06
8 group1:week4   -2.580      1.136  -2.271    0.023 0.08 0.01 

this time the output of the Anova(drugMod) is

Analysis of Deviance Table (Type III Wald chisquare tests)

Response: drugCrime
              Chisq Df Pr(>Chisq)    
(Intercept) 28.3890  1  9.923e-08 ***
group        3.0684  1  0.0798276 .  
week        17.8728  3  0.0004672 ***
group:week   5.4149  3  0.1438151 

I understand why the regression outputs are different with the two different contrast coding schemes, but why am I getting two different Anova() outputs for these two different coding schemes, and which is the 'correct' one to report? In a reply to this post @Ben Bolker quotes the Anova() help, saying "Be very careful in formulating the model for type-III tests, or the hypotheses tested will not make sense." But I am not sure what 'makes sense' means in this context.

Best Answer

You have to be a little careful with car:Anova(..., type=3) since it only performs Type III tests if the sum-to-zero constrast coding (contr.sum) is used and if there are no "missing cells" or other estimability issues (to borrow a term from linear model theory).

The contrasts that car:Anova(..., type=3) produce is what I would call marginal contrasts and also what you get from nlme::anova.lme(..., type="marginal"). In "nice" cases these contrasts correspond to Type III contrasts if contr.sum is used.

My advice is therefore to:

  1. Look at your data to check if there are any missing cells (use table(group, week)). Estimability issues often show up as NA in summary coefficient tables if you use lm but glmer may not display them. From the looks of your model summary you are probably ok.

  2. Set the contrast-coding to contr.sum, fit the model, obtain the Type III tests:

    options(contrasts = c("contr.sum", "contr.poly")) drugMod <- glmer(drugCrime ~ group*week + (1|id), ....) car::Anova(drugMod, type=3)

  3. It is possible to obtain the Type III tests using contrasts of Least-Square Means (or in this case their GLM generalisation). Russell Lenth has added extensive guides to his emmeans package and this vignette describes how to obtain the type III tests in this way. I would compare these with the results from car::Anova. Note that this approach will also not yield Type III if there are missing cells or estimability issues, but emmeans warns you if it detects such problems (and I think this is also implemented for glmer fits).

This very recommendable vignette has a very insightful discussion of Type III contrasts. It termed the approach in car:Anova(..., type=3) Not-Safe-Type-Three and it mentions that the instruction to use contr.sum is in the car-book (see footnote on page 15). It also has a description of which hypotheses are actually being tested if you use contr.treatment or contr.SAS and why these are probably not hypotheses that you are actually interested in (this addresses the "will not make sense" quote you mention).

Edit in order to answer: "what is 'missing cells' and how to detect them?"

As an example of a real data set with missing cells, consider the soup data from the ordinal package. Note the 0-entry in the following tabulation:

data("soup", package="ordinal")
soup$RESPONSE <- as.numeric(as.character(soup$SURENESS))
with(soup, table(DAY, PRODID)) 
   PRODID
DAY   1   2   3   4   5   6
  1 369  94 184  93  88  93
  2 370 275   0  92  97  92

If we fit a model with the interaction between PRODID and DAY, one coefficient is not estimable which shows up as NA:

fm <- lm(RESPONSE ~ DAY * PRODID, data=soup)
summary(fm)

Call:
lm(formula = RESPONSE ~ DAY * PRODID, data = soup)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.2043 -1.4486  0.8043  1.3818  2.5514 

Coefficients: (1 not defined because of singularities)
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.94580    0.09421  41.883  < 2e-16 ***
DAY2         -0.49715    0.13314  -3.734 0.000194 ***
PRODID2       0.67122    0.20908   3.210 0.001349 ** 
PRODID3       1.11398    0.16332   6.821 1.23e-11 ***
PRODID4       0.59183    0.20998   2.819 0.004876 ** 
PRODID5       0.92920    0.21469   4.328 1.58e-05 ***
PRODID6       1.25850    0.20998   5.993 2.47e-09 ***
DAY2:PRODID2  0.49831    0.25392   1.962 0.049861 *  
DAY2:PRODID3       NA         NA      NA       NA    
DAY2:PRODID4  0.51386    0.29756   1.727 0.084347 .  
DAY2:PRODID5  0.62215    0.29784   2.089 0.036854 *  
DAY2:PRODID6  0.48850    0.29756   1.642 0.100823    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.81 on 1836 degrees of freedom
Multiple R-squared:  0.1022,    Adjusted R-squared:  0.09735 
F-statistic: 20.91 on 10 and 1836 DF,  p-value: < 2.2e-16

In general any source of non-estimability is a problem but R goes a long way to eliminate such problems in the way design matrices are constructed. I can think of two relevant sources of non-estimability: missing cells and perfectly correlated covariates and most people already automatically avoid the latter.

If there were no missing cells the Type III test of DAY would test the difference between days at an average PRODID (i.e., a simple, flat, unweighted average). This is exactly the Yates contrast that Therneau discusses, but in the presence of missing cells this straight-forward interpretation no longer holds. You might want to stay clear of Type III tests in such cases due to lack of interpretability...

Related Question