Mixed Models – Equivalent Mixed Models Yielding Different Results in SAS

mixed modelsas

Below I show two equivalent ways to write a mixed model in R and SAS. The two R models as well as the two SAS models yield the same estimates of the random and fixed effects and the same standard errors of the estimates of the fixed effects. But the two SAS models do not give the same confidence intervals of the fixed effects. My question is: which are the correct confidence intervals ?

Here are simulated data:

library(mvtnorm)
I <- 3 
J <- 6 
K <- 5 
n <- I*J*K
set.seed(666)
tube <- rep(1:J, each=I)
position <- rep(LETTERS[1:I], times=J)
Mu_i <- 3*(1:I)
Mu_ij <- c(t(rmvnorm(J, mean=Mu_i, sigma=diag(I)+2)) )  
tube <- rep(tube, each=K)
position <- rep(position, each=K)
Mu_ij <- rep(Mu_ij, each=K)
dat <- data.frame(tube, position)
sigmaw <- 2 
dat$y <- rnorm(n, Mu_ij, sigmaw)
dat$tube <- factor(dat$tube)
> str(dat)
'data.frame':   90 obs. of  3 variables:
 $ tube    : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ position: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 2 2 2 2 2 ...
 $ y       : num  2.76 5.5 2.54 1.56 6.46 ...
> head(dat)
  tube position        y
1    1        A 2.759443
2    1        A 5.496689
3    1        A 2.540150
4    1        A 1.558261
5    1        A 6.462050
6    1        B 4.239749

The corresponding nlme model is the following:

> # firstly set position C as the "intercept" for concordance with SAS
> dat$position <- relevel(dat$position, "C")
> # load nlme
> library(nlme)
> # fit the model
> ( fit1 <- lme(y ~ position, data=dat, random= list(tube = pdCompSymm(~ 0+position ))) )
Linear mixed-effects model fit by REML
  Data: dat 
  Log-restricted-likelihood: -199.0196
  Fixed: y ~ position 
(Intercept)   positionA   positionB 
   8.526544   -4.800535   -3.322507 

Random effects:
 Formula: ~0 + position | tube
 Structure: Compound Symmetry
          StdDev   Corr       
positionC 1.892433            
positionA 1.892433 0.082      
positionB 1.892433 0.082 0.082
Residual  1.932750            

Number of Observations: 90
Number of Groups: 6

This model is equivalent to a 2-way ANOVA with mixed effets (in the sense that the marginal models are the same), which is more easy to fit with lme4:

> library(lme4)
> lmer(y ~ position + (1|tube) + (1|tube:position), data=dat)
Linear mixed model fit by REML 
Formula: y ~ position + (1 | tube) + (1 | tube:position) 
   Data: dat 
 AIC BIC logLik deviance REMLdev
 410 425   -199    402.5     398
Random effects:
 Groups        Name        Variance Std.Dev.
 tube:position (Intercept) 3.28587  1.81270 
 tube          (Intercept) 0.29543  0.54354 
 Residual                  3.73552  1.93275 
Number of obs: 90, groups: tube:position, 18; tube, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   8.5265     0.8493  10.039
positionA    -4.8005     1.1595  -4.140
positionB    -3.3225     1.1595  -2.866

Correlation of Fixed Effects:
          (Intr) postnA
positionA -0.683       
positionB -0.683  0.500

Below I check that the results are indeed the same:

> # same standard erros
> sqrt(diag(fit1$varFix))
(Intercept)   positionA   positionB 
  0.8493533   1.1594505   1.1594505 
> # the total variance in the second model is given in the first model:
> sqrt(1.81270^2+ 0.54354^2)
[1] 1.892437

Well. Now here are the two equivalent SAS models:

/* First model */
PROC MIXED DATA=dat ;
CLASS POSITION TUBE ;
MODEL y = POSITION ;
RANDOM POSITION / subject=TUBE type=CS ;
RUN; QUIT;

/* Second model */
PROC MIXED DATA=dat ;
CLASS POSITION TUBE ;
MODEL y = POSITION ;
RANDOM TUBE TUBE*POSITION ;
RUN; QUIT;

Results are identical to the R results. But SAS assigns different degrees of freedom for the fixed effects and consequently gives different confidence intervals, as shown below.

The first model gives degrees of freedom 5, 10, 10:

Effect  position    Estimate    StandardError   DF  Alpha   Lower   Upper
Intercept       8.5265  0.8494  5   0.05    6.3432  10.7099
position    A   -4.8005 1.1595  10  0.05    -7.384  -2.2171
position    B   -3.3225 1.1595  10  0.05    -5.9059 -0.7391
position    C   0   .   .   .   .   .

whereas the second model gives degrees of freedom 15, 15, 15:

Effect  position    Estimate    StandardError   DF  Alpha   Lower   Upper
Intercept       8.5265  0.8494  15  0.05    6.7162  10.3369
position    A   -4.8005 1.1595  15  0.05    -7.2718 -2.3292
position    B   -3.3225 1.1595  15  0.05    -5.7938 -0.8512
position    C   0   .   .   .   .   .

Best Answer

By default, SAS uses a very simple method to compute the degrees of freedom; it looks to see if the fixed effect appears in a term in the random effects; if so, it uses that random effect as the degrees of freedom.

In this case, the model with random effects of TUBE and TUBE*POSITION do the right thing as this method can tell that POSITION should use TUBE*POSITION. But the model with POSITION / subject=TUBE cannot. Instead, one should tell it to compute the degrees of freedom using another method, usually the Satterthwaite (satterth) for models with only random effects and the Kenward-Roger method (kr) for models with repeated effects structures.

PROC MIXED DATA=dat ;
CLASS POSITION TUBE ;
MODEL y = POSITION /s ddfm=satterth;
RANDOM POSITION / subject=TUBE type=CS;
RUN;

Related Solutions

Solved – Comparing mixed-effects and fixed-effects models (testing significance of random effects)

Technically, you can get it working by just switching the order of the parameters:

> anova(fit.me, fit.fe)

Will work just fine. If you pass an object generated by lmer first, the anova.merMod will be called instead of anova.lm (which does not know how to handle lmer objects). See:

?anova.merMod

Although, choosing a mixed model or a fixed model is a modeling choice which needs to take into account the experimental design, not a model selection problem. See @BenBolker's https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects for more details:

Consider not testing the significance of random effects.

Predictive Models – Usefulness of Mixed Models

It depends on the nature of the data, but in general I would expect the mixed model to outperform the fixed-effects only models.

Let's take an example: modelling the relationship between sunshine and the height of wheat stalks. We have a number of measurements of individual stalks, but many of the stalks are measured at the same sites (which are similar in soil, water and other things that may affect height). Here are some possible models:

1) height ~ sunshine

2) height ~ sunshine + site

3) height ~ sunshine + (1|site)

We want to use these models to predict the height of new wheat stalks given some estimate of the sunshine they will experience. I'm going to ignore the parameter penalty you would pay for having many sites in a fixed-effects only model, and just consider the relative predictive power of the models.

The most relevant question here is whether these new data points you are trying to predict are from one of the sites you have measured; you say this is rare in the real world, but it does happen.

A) New data are from a site you have measured

If so, models #2 and #3 will outperform #1. They both use more relevant information (mean site effect) to make predictions.

B) New data are from an unmeasured site

I would still expect model #3 to outperform #1 and #2, for the following reasons.

(i) Model #3 vs #1:

Model #1 will produce estimates that are biased in favour of overrepresented sites. If you have similar numbers of points from each site and a reasonably representative sample of sites, you should get similar results from both.

(ii) Model #3 vs. #2:

Why would model #3 be better that model #2 in this case? Because random effects take advantage of shrinkage - the site effects will be 'shrunk' towards zero. In other words, you will tend to find less extreme values for site effects when it is specified as a random effect than when it is specified as a fixed effect. This is useful and improves your predictive ability when the population means can reasonably be thought of as being drawn from a normal distribution (see Stein's Paradox in Statistics). If the population means are not expected to follow a normal distribution, this might be a problem, but it's usually a very reasonable assumption and the method is robust to small deviations.

[Side note: by default, when fitting model #2, most software would use one of the sites as a reference and estimate coefficients for the other sites that represent their the deviation from the reference. So it may appear as though there is no way to calculate an overall 'population effect'. But you can calculate this by averaging across predictions for all of the individual sites, or more simply by changing the coding of the model so that coefficients are calculated for every site.]