Solved – Can a variable be included in a mixed model as a fixed effect and as a random effect at the same time

lme4-nlmemixed modelr

I am building a mixed model, where the same 4 groups are tested repeatedly 5 weeks on the measure M. I want to test the effect of group on the measure M. Can I build a model as following (let's say with R):

   lmer(M ~ Time + Group + (1|Group), data = mydata)

where group is both a fixed and random effect at the same time? Imposing two
effects of a group seems contradictory conceptually. But what if I want to test the group effect?

Best Answer

First to make sure Group is defined as a factor. Your current model is $$M_{ij}=\alpha_i+\beta_1\mathrm{Time}_{ij}+u_i+e_{ij},$$ where $i$ denotes Group and $j$ the time points. If you run your model and test with ranef(), you will find that it is hard to distinguish $\alpha_i$ and $u_i$ in the estimation, thus $u_i$ almost equal 0.

Two possible alternative models are:

  1. Random effects model, $$M_{ij}=\beta_0+\beta_1\mathrm{Time}_{ij}+u_i+e_{ij},$$ where $\beta_0$ is the average intercept, $u_i$ is the random individual deviance from the average intercept for each group and is assumed to follow a distribution.
  2. Fixed effects model (in the context of econometrics), $$M_{ij}=\alpha_i+\beta_1\mathrm{Time}_{ij}+e_{ij},$$ where $\alpha_i$ is the fixed individual intercept.

Updates:

I use the sleepstudy data set as an illustration. If the grouping variable (a factor) is included as a covariate (Model fm2 below), both the random effects and its variance tend to be zero. The intuitive explanation is that, $\alpha_i$ and $u_i$ basically model the same quantity (the group specific intercept), although one is assumed fixed and one is random. The majority of the variability is first absorbed by the fixed intercepts ($\alpha_i$), so the random intercepts $u_i$ tend to be all zero. The code and results are listed below.

> fm1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy, REML = F)
> re1 = as.matrix(ranef(fm1)$Subject)
> summary(as.vector(re1))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-77.570  -7.460   5.701   0.000  15.850  71.920 
> VarCorr(fm1)
 Groups   Name        Std.Dev.
 Subject  (Intercept) 36.012  
 Residual             30.895  
> sd(re1)
[1] 35.76336


> fm2 <- lmer(Reaction ~ Days + Subject + (1 | Subject), sleepstudy, REML = F)
> re2 = as.matrix(ranef(fm2)$Subject)
> summary(as.vector(re2))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      0       0       0       0       0       0 
> VarCorr(fm2)
 Groups   Name        Std.Dev.
 Subject  (Intercept)  0.00   
 Residual             29.31   
> sd(re2)
[1] 0

Updates 2:

Previously I used ML in lmer because I found REML variance estimate for the random intercept seems off the true value in this extreme case. I know that it may not make sense to include both fixed and random group-specific intercepts in a single model, but it can be an interesting example. Note that the only different between this model and the random intercept model is a larger design matrix for the fixed effects.

The lmer example below uses REML, and the model is the same as Model fm2 above. The estimated random effects are all close to zero, and their standard deviation is very close to zero. I know the standard deviation of the estimated random effects is not a perfect estimate of the variance of the random effects, but the two should correspond somehow. But the REML variance estimate for the random effects is 33, which is very off zero.

> fm3 <- lmer(Reaction ~ Days + Subject + (1 | Subject), sleepstudy, REML = T, verbose = T)
> re3 = as.matrix(ranef(fm3)$Subject)
> summary(as.vector(re3))
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-3.998e-12 -9.220e-13 -6.549e-13 -7.717e-13 -4.567e-13  6.893e-13 
> VarCorr(fm3)
 Groups   Name        Std.Dev.
 Subject  (Intercept) 33.050  
 Residual             30.991  
> sd(re3)
[1] 9.391908e-13

I also tested in Stata and it becomes more interesting. The mixed command uses an EM algorithm but it cannot converge and thus gives a very large estimate. In my understanding, REML and ML should not differ so much in this case. There may be some numerical issues. Given that the estimates rely on iterations, I will think more about this when I have more time.

. mixed reaction days i.subject || subject:, reml

Performing EM optimization: 

Performing gradient-based optimization: 

could not calculate numerical derivatives -- discontinuous region with missing values encountered
could not calculate numerical derivatives -- discontinuous region with missing values encountered

Computing standard errors:
standard-error calculation failed

Mixed-effects REML regression                   Number of obs     =        180
Group variable: subject                         Number of groups  =         18

                                                Obs per group:
                                                              min =         10
                                                              avg =       10.0
                                                              max =         10

                                                Wald chi2(18)     =     169.64
Log restricted-likelihood = -805.65036          Prob > chi2       =     0.0000

...

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
subject: Identity            |
                  var(_cons) |   105231.5          .             .           .
-----------------------------+------------------------------------------------
               var(Residual) |   960.4566          .             .           .
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 2.3e-13       Prob >= chibar2 = 1.0000

Warning: convergence not achieved; estimates are based on iterated EM
Related Question