Solved – How should Type II SS be calculated in a mixed model

anovamixed modelrsums-of-squares

I have a data set (and corresponding mixed model) which gets very different p-values for one of the two-way interactions when tested using Type I (sequential, taking care that it's last), and Type II (using lmerTest and car packages).

The sequential and car packages "agree" and the lmerTest package is different.

What are these methods doing, why are they different, and which is "right"?

(Note that there are many questions about Type I/II/III here, but none that I see that answer this specific question about Type II. Also, while I'm using R, and am interested in the particular implementations used in R, the bigger question is statistical, not about the coding.)

library(lme4)
library(lmerTest)
library(car)

d <- data.frame(
  Y = c(6, 4, 5, 7, 6, 8, 9, 0, 10, 9, 10, 8, 7, 7, 6, 5, 6, 7, 8, 
        7, 9, 10, 10, 6, 5, 8, 10, 4, 6, 7, 7, 10, 6, 10, 10, 7, 6, 3, 
        6, 7, 5, 7, 8, 9, 11, 10, 7, 8, 5, 6, 6, 7, 8, 5, 4, 8, 8, 8, 
        7, 9, 5, 4, 4, 6, 5, 7, 8, 3, 9, 8, 8, 7, 5, 7, 4, 5, 5, 4, 6, 
        8, 8, 6, 7, 8, 4, 2, 4, 4, 3, 6, 7, 8, 9, 8, 7, 6, 4, 2, 3, 3, 
        3, 0, 1, 4, 6, 4, 2, 0, 9, 9, 10, 8, 10, 9, 7, 9, 6, 5, 9, 7, 
        6, 2, 2, 3, 5, 7, 8, 9, 9, 8, 7, 8, 1, 0, 3, 2, 2, 5, 3, 5, 8, 
        6, 6, 6, 8, 6, 2, 3, 4, 5, 7, 6, 7, 8, 3, 3),
  A = rep(c(2, 5, 4, 4, 2, 3, 3, 5, 9, 3, 7, 2, 11), each=12),
  B = factor(rep(1:2, each=6)),
  C = factor(1:6),
  ID = factor(rep(1:13, each=12))
)
options(contrasts=c("contr.sum","contr.poly")) # for Type III car::Anova

Here's my model, I'm noticing the differences for the B:C interaction.

m1 <- lmer(Y ~ A * B * C + (1|ID), data=d)

Type I Anova (from lmerTest), with B:C interaction last except for A:B:C, gives p=0.04 for B:C. (However, order doesn't seem to matter.)

anova(m1, type=1, ddf="Kenward-Roger")
#> Analysis of Variance Table of type I  with  Kenward-Roger 
#> approximation for degrees of freedom
#>        Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)    
#> A       8.794   8.794     1    11   2.908   0.11621    
#> B     136.641 136.641     1   121  45.178 6.269e-10 ***
#> C      14.128   2.826     5   121   0.934   0.46141    
#> A:B     0.381   0.381     1   121   0.126   0.72330    
#> A:C    45.874   9.175     5   121   3.034   0.01291 *  
#> B:C    34.821   6.964     5   121   2.303   0.04882 *  
#> A:B:C  21.860   4.372     5   121   1.446   0.21285    

Type II Anova from the car package gives the same results.

Anova(m1, type=2, test="F")
#> Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)
#> 
#> Response: Y
#>             F Df Df.res    Pr(>F)    
#> A      2.9075  1     11   0.11621    
#> B     45.1784  1    121 6.269e-10 ***
#> C      0.9343  5    121   0.46141    
#> A:B    0.1259  1    121   0.72330    
#> A:C    3.0335  5    121   0.01291 *  
#> B:C    2.3026  5    121   0.04882 *  
#> A:B:C  1.4456  5    121   0.21285   

Type II Anova from lmerTest, gives p=0.88 for B:C, and is identical with Type III in both lmerTest and car.

anova(m1, type=2, ddf="Kenward-Roger")
#> Analysis of Variance Table of type II  with  Kenward-Roger 
#> approximation for degrees of freedom
#>       Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)    
#> A      8.794   8.794     1    11  2.9075 0.1162113    
#> B     41.500  41.500     1   121 13.7214 0.0003209 ***
#> C     50.879  10.176     5   121  3.3645 0.0070072 ** 
#> A:B    0.381   0.381     1   121  0.1259 0.7233027    
#> A:C   45.874   9.175     5   121  3.0335 0.0129149 *  
#> B:C    5.174   1.035     5   121  0.3422 0.8864025    
#> A:B:C 21.860   4.372     5   121  1.4456 0.2128521   

Anova(m1, type=3, test="F")
#> Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
#> 
#> Response: Y
#>                   F Df Df.res    Pr(>F)    
#> (Intercept) 88.2434  1     11 1.376e-06 ***
#> A            2.9075  1     11 0.1162113    
#> B           13.7214  1    121 0.0003209 ***
#> C            3.3645  5    121 0.0070072 ** 
#> A:B          0.1259  1    121 0.7233027    
#> A:C          3.0335  5    121 0.0129149 *  
#> B:C          0.3422  5    121 0.8864025    
#> A:B:C        1.4456  5    121 0.2128521    

anova(m1, type=3, ddf="Kenward-Roger")
#> Analysis of Variance Table of type III  with  Kenward-Roger 
#> approximation for degrees of freedom
#>       Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)    
#> A      8.794   8.794     1    11  2.9075 0.1162113    
#> B     41.500  41.500     1   121 13.7214 0.0003209 ***
#> C     50.879  10.176     5   121  3.3645 0.0070072 ** 
#> A:B    0.381   0.381     1   121  0.1259 0.7233027    
#> A:C   45.874   9.175     5   121  3.0335 0.0129149 *  
#> B:C    5.174   1.035     5   121  0.3422 0.8864025    
#> A:B:C 21.860   4.372     5   121  1.4456 0.2128521    

Best Answer

The difference between the Type II tests from car::Anova and the anova method for lmerTest is due to how continuous explanatory variables are handled. The first passus in the Details section of help(Anova) describes how Anova handles them:

The designations "type-II" and "type-III" are borrowed from SAS, but the definitions used here do not correspond precisely to those employed by SAS. Type-II tests are calculated according to the principle of marginality, testing each term after all others, except ignoring the term's higher-order relatives; so-called type-III tests violate marginality, testing each term in the model after all of the others. This definition of Type-II tests corresponds to the tests produced by SAS for analysis-of-variance models, where all of the predictors are factors, but not more generally (i.e., when there are quantitative predictors). Be very careful in formulating the model for type-III tests, or the hypotheses tested will not make sense.

so while lmerTest implements the SAS definition of Type II as described in this paper, car::Anova does something different. The SAS definition is described here which mean that a test for a term is marginal to all terms that it is not contained in. The definition of containment is given in the SAS document as:

Given two effects F1 and F2, F1 is said to be contained in F2 provided that the following two conditions are met:

  • Both effects involve the same continuous variables (if any).
  • F2 has more CLASS [factors in R] variables than F1 does, and if F1 has CLASS variables, they all appear in F2

So in your model B:C is not contained in A:B:C because A is continuous and the type II hypothesis of B:C is therefore marginal to A:B:C. car::Anova does not distinguish between factors and covariates in this context so if A was a factor car::Anova and lmerTest-anova agrees:

d$A2 <- cut(d$A, quantile(d$A), include.lowest = TRUE)
m2 <- lmer(Y ~ A2 * B * C + (1|ID), data=d)
anova(m2, type=2, ddf="Ken") 
Type II Analysis of Variance Table with Kenward-Roger's method
        Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
A2      17.210   5.737     3     9  1.7035   0.23533    
B      136.641 136.641     1    99 40.5756 5.987e-09 ***
C       14.128   2.826     5    99  0.8391   0.52512    
A2:B    11.137   3.712     3    99  1.1024   0.35193    
A2:C    38.816   2.588    15    99  0.7684   0.70879    
B:C     34.821   6.964     5    99  2.0680   0.07575 .  
A2:B:C  50.735   3.382    15    99  1.0044   0.45709    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Anova(m2, type=2, test="F")
Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)

Response: Y
             F Df Df.res    Pr(>F)    
A2      1.7035  3      9   0.23533    
B      40.5756  1     99 5.987e-09 ***
C       0.8391  5     99   0.52512    
A2:B    1.1024  3     99   0.35193    
A2:C    0.7684 15     99   0.70879    
B:C     2.0680  5     99   0.07575 .  
A2:B:C  1.0044 15     99   0.45709    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The lmerTest::show_tests() function can be used to see exactly which linear function of the model coefficients makes up the tested hypothesis. For example, we have that

show_tests(anova(m1, type=2))$`B:C`
      (Intercept) A B1 C1 C2 C3 C4 C5 A:B1 A:C1 A:C2 A:C3 A:C4 A:C5 B1:C1 B1:C2 B1:C3
B1:C1           0 0  0  0  0  0  0  0    0    0    0    0    0    0     1     0     0
B1:C2           0 0  0  0  0  0  0  0    0    0    0    0    0    0     0     1     0
B1:C3           0 0  0  0  0  0  0  0    0    0    0    0    0    0     0     0     1
B1:C4           0 0  0  0  0  0  0  0    0    0    0    0    0    0     0     0     0
B1:C5           0 0  0  0  0  0  0  0    0    0    0    0    0    0     0     0     0
      B1:C4 B1:C5 A:B1:C1 A:B1:C2 A:B1:C3 A:B1:C4 A:B1:C5
B1:C1     0     0       0       0       0       0       0
B1:C2     0     0       0       0       0       0       0
B1:C3     0     0       0       0       0       0       0
B1:C4     1     0       0       0       0       0       0
B1:C5     0     1       0       0       0       0       0

so this contrast is marginal to all other terms in the model. Because the car::Anova Type II test for B:C is equivalent to the Type I test we can see that this contrast can be expressed as:

show_tests(anova(m1, type=1), fractions = TRUE)$`B:C`
      (Intercept) A     B1    C1    C2    C3    C4    C5    A:B1  A:C1  A:C2  A:C3 
B1:C1     0           0     0     0     0     0     0     0     0     0     0     0
B1:C2     0           0     0     0     0     0     0     0     0     0     0     0
B1:C3     0           0     0     0     0     0     0     0     0     0     0     0
B1:C4     0           0     0     0     0     0     0     0     0     0     0     0
B1:C5     0           0     0     0     0     0     0     0     0     0     0     0
      A:C4  A:C5  B1:C1 B1:C2 B1:C3 B1:C4 B1:C5 A:B1:C1 A:B1:C2 A:B1:C3 A:B1:C4 A:B1:C5
B1:C1     0     0     1   1/2   1/2   1/2   1/2 60/13   30/13   30/13   30/13   30/13  
B1:C2     0     0     0     1   1/3   1/3   1/3     0   60/13   20/13   20/13   20/13  
B1:C3     0     0     0     0     1   1/4   1/4     0       0   60/13   15/13   15/13  
B1:C4     0     0     0     0     0     1   1/5     0       0       0   60/13   12/13  
B1:C5     0     0     0     0     0     0     1     0       0       0       0   60/13  

which is not marginal to the 3-way interaction term.