It comes to my attention that there is a discrepancy between lmerTest and Proc Mixed for the numerator degrees of freedom (df) for type II and III ANOVA. This only occurs for mixed model without fixed intercept.
The following is the dataset:
input9 <-structure(list(block = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L,
4L), .Label = c("1", "2", "3", "4"), class = "factor"), a = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L,
3L, 1L, 1L, 2L, 2L, 3L, 3L), .Label = c("1", "2", "3"), class = "factor"),
b = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("1",
"2"), class = "factor"), y = c(56, 41, 50, 36, 39, 35, 30,
25, 36, 28, 33, 30, 32, 24, 31, 27, 15, 19, 30, 25, 35, 30,
17, 18)), label = "MIXED9 ", row.names = c(NA,
-24L), class = "data.frame")
Here is the SAS code:
proc mixed data = work.input9 method = ml;
class a b block;
model y = a b a*b / noint solution DDFM=SATTERTHWAITE HTYPE=1,2,3;
random block a*block;
lsmeans a b a*b / pdiff cl;
run;
Here is the SAS output in R:
> anova9_I_exp_t
effect numdf dendf fvalue probf
1 a 3 5.59999999999989 28.21481701385756 0.000866319611832041
2 b 1 11.99999999999997 25.85163204747710 0.000268656242407431
3 a*b 2 11.99999999999999 5.35905044510368 0.021719775545852811
> anova9_II_exp_t
effect numdf dendf fvalue probf
1 a 2 7.99999999999956 5.42609899619229 0.032427388825803982
2 b 1 11.99999999999997 25.85163204747710 0.000268656242407431
3 a*b 2 11.99999999999998 5.35905044510368 0.021719775545852839
> anova9_III_exp_t
effect numdf dendf fvalue probf
1 a 2 7.99999999999957 5.42609899619229 0.032427388825803989
2 b 1 11.99999999999997 25.85163204747711 0.000268656242407431
3 a*b 2 11.99999999999998 5.35905044510368 0.021719775545852839
Here is the R code and R output:
> results9 <- lmer(y ~ -1 + a + b + a*b + (1|block) + (1|interaction(a, block, lex.order = T)), data = input9, REML=F, control=lmerControl(optimizer="bobyqa"))
> anova9_I <- anova(results9,type="I")
> anova9_II <- anova(results9,type="II")
> anova9_III <- anova(results9,type="III")
> anova9_I
Type I Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
a 594.2746442207 198.0915480736 3 5.600000533736 28.21482 0.00086632 ***
b 181.5000000000 181.5000000000 1 11.999999840688 25.85163 0.00026866 ***
a:b 75.2500000000 37.6250000000 2 11.999999840262 5.35905 0.02171978 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova9_II
Type II Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
a 496.7108658511 165.5702886171 3 5.795106995423 23.58271 0.00118483 **
b 181.5000000000 181.5000000000 1 11.999999840688 25.85163 0.00026866 ***
a:b 75.2500000000 37.6250000000 2 11.999999840262 5.35905 0.02171978 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova9_III
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
a 676.9713224343 225.6571074781 3 5.789659050233 32.14107 0.00051883 ***
b 181.5000000000 181.5000000000 1 11.999999840688 25.85163 0.00026866 ***
a:b 75.2500000000 37.6250000000 2 11.999999840262 5.35905 0.02171978 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
So the values for Type I ANOVA match, but the values for Type II and III ANOVA does not. Specifically, only the values for the first factor/effect (a) is different.
I take a closer look at the code in SAS and R(lmerTest). It turns out the hypothesis contrast matrix (L) is set up differently between SAS and R(lmerTest). For R(lmerTest), all three types of ANOVA consider a1,a2 and a3 (there are three levels in a). For SAS, Type I uses a1,a2 and a3; Type II and Type II uses a1 and a2. This causes the numerator df to be different.
lmerTest and PROC MIXED constructs L matrix differently when the fixed intercept is removed from the model. Which one is correct?
Potentially useful documents:
The paper in this link (Appendix B.2,B.3 and B.4) describes how the hypothesis contrast matrix is formulated in lmerTest. This link mentions how SAS performs Type I,II and III ANOVA.
Best Answer
For what it is worth, emmeans agrees with SAS on this. The Type III tests are:
Using a new addition not yet present in the CRAN version, we can see the associated estimable functions:
These are exactly what you would expect.
The type II tests (via
jt2 = joint_tests(results9, mode = "satterth", submodel = "type2")
) are identical to the type III ones here, due to the balance.One interesting thing to me is that the Keward-Roger method yields fairly different (and more conservative) results. This is due to the adjustment that K-R applies to the covariance matrix.
One explanation I can think of is that SAS and
joint_tests()
both use joint tests of estimable linear functions of the regression coefficients; hence they both explicitly address the question of estimating contrasts related to each effect. However, the appendices of the lmerTest article in JSS (linked in the OP) refer to manipulations of the columns of the design matrix. The way that R constructs the design matrix for this model, there are 3 columns associated witha
and therefore the intercept effect becomes part of thea
effect in its calculations. In other words, there is an implicit assumption that design-matrix columns associated with predictors are all associated with contrasts; and that isn't the case in the model without intercept.