Mixed Models – Equivalent Mixed Models Yielding Different Results in SAS

mixed modelsas

Below I show two equivalent ways to write a mixed model in R and SAS. The two R models as well as the two SAS models yield the same estimates of the random and fixed effects and the same standard errors of the estimates of the fixed effects. But the two SAS models do not give the same confidence intervals of the fixed effects. My question is: which are the correct confidence intervals ?

Here are simulated data:

library(mvtnorm)
I <- 3 
J <- 6 
K <- 5 
n <- I*J*K
set.seed(666)
tube <- rep(1:J, each=I)
position <- rep(LETTERS[1:I], times=J)
Mu_i <- 3*(1:I)
Mu_ij <- c(t(rmvnorm(J, mean=Mu_i, sigma=diag(I)+2)) )  
tube <- rep(tube, each=K)
position <- rep(position, each=K)
Mu_ij <- rep(Mu_ij, each=K)
dat <- data.frame(tube, position)
sigmaw <- 2 
dat$y <- rnorm(n, Mu_ij, sigmaw)
dat$tube <- factor(dat$tube)
> str(dat)
'data.frame':   90 obs. of  3 variables:
 $ tube    : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ position: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 2 2 2 2 2 ...
 $ y       : num  2.76 5.5 2.54 1.56 6.46 ...
> head(dat)
  tube position        y
1    1        A 2.759443
2    1        A 5.496689
3    1        A 2.540150
4    1        A 1.558261
5    1        A 6.462050
6    1        B 4.239749

The corresponding nlme model is the following:

> # firstly set position C as the "intercept" for concordance with SAS
> dat$position <- relevel(dat$position, "C")
> # load nlme
> library(nlme)
> # fit the model
> ( fit1 <- lme(y ~ position, data=dat, random= list(tube = pdCompSymm(~ 0+position ))) )
Linear mixed-effects model fit by REML
  Data: dat 
  Log-restricted-likelihood: -199.0196
  Fixed: y ~ position 
(Intercept)   positionA   positionB 
   8.526544   -4.800535   -3.322507 

Random effects:
 Formula: ~0 + position | tube
 Structure: Compound Symmetry
          StdDev   Corr       
positionC 1.892433            
positionA 1.892433 0.082      
positionB 1.892433 0.082 0.082
Residual  1.932750            

Number of Observations: 90
Number of Groups: 6 

This model is equivalent to a 2-way ANOVA with mixed effets (in the sense that the marginal models are the same), which is more easy to fit with lme4:

> library(lme4)
> lmer(y ~ position + (1|tube) + (1|tube:position), data=dat)
Linear mixed model fit by REML 
Formula: y ~ position + (1 | tube) + (1 | tube:position) 
   Data: dat 
 AIC BIC logLik deviance REMLdev
 410 425   -199    402.5     398
Random effects:
 Groups        Name        Variance Std.Dev.
 tube:position (Intercept) 3.28587  1.81270 
 tube          (Intercept) 0.29543  0.54354 
 Residual                  3.73552  1.93275 
Number of obs: 90, groups: tube:position, 18; tube, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   8.5265     0.8493  10.039
positionA    -4.8005     1.1595  -4.140
positionB    -3.3225     1.1595  -2.866

Correlation of Fixed Effects:
          (Intr) postnA
positionA -0.683       
positionB -0.683  0.500

Below I check that the results are indeed the same:

> # same standard erros
> sqrt(diag(fit1$varFix))
(Intercept)   positionA   positionB 
  0.8493533   1.1594505   1.1594505 
> # the total variance in the second model is given in the first model:
> sqrt(1.81270^2+ 0.54354^2)
[1] 1.892437

Well. Now here are the two equivalent SAS models:

/* First model */
PROC MIXED DATA=dat ;
CLASS POSITION TUBE ;
MODEL y = POSITION ;
RANDOM POSITION / subject=TUBE type=CS ;
RUN; QUIT;

/* Second model */
PROC MIXED DATA=dat ;
CLASS POSITION TUBE ;
MODEL y = POSITION ;
RANDOM TUBE TUBE*POSITION ;
RUN; QUIT;

Results are identical to the R results. But SAS assigns different degrees of freedom for the fixed effects and consequently gives different confidence intervals, as shown below.

The first model gives degrees of freedom 5, 10, 10:

Effect  position    Estimate    StandardError   DF  Alpha   Lower   Upper
Intercept       8.5265  0.8494  5   0.05    6.3432  10.7099
position    A   -4.8005 1.1595  10  0.05    -7.384  -2.2171
position    B   -3.3225 1.1595  10  0.05    -5.9059 -0.7391
position    C   0   .   .   .   .   .

whereas the second model gives degrees of freedom 15, 15, 15:

Effect  position    Estimate    StandardError   DF  Alpha   Lower   Upper
Intercept       8.5265  0.8494  15  0.05    6.7162  10.3369
position    A   -4.8005 1.1595  15  0.05    -7.2718 -2.3292
position    B   -3.3225 1.1595  15  0.05    -5.7938 -0.8512
position    C   0   .   .   .   .   .

Best Answer

By default, SAS uses a very simple method to compute the degrees of freedom; it looks to see if the fixed effect appears in a term in the random effects; if so, it uses that random effect as the degrees of freedom.

In this case, the model with random effects of TUBE and TUBE*POSITION do the right thing as this method can tell that POSITION should use TUBE*POSITION. But the model with POSITION / subject=TUBE cannot. Instead, one should tell it to compute the degrees of freedom using another method, usually the Satterthwaite (satterth) for models with only random effects and the Kenward-Roger method (kr) for models with repeated effects structures.

PROC MIXED DATA=dat ;
CLASS POSITION TUBE ;
MODEL y = POSITION /s ddfm=satterth;
RANDOM POSITION / subject=TUBE type=CS;
RUN;
Related Question