Solved – Coding of categorical random effects in R: int vs factor

Tags: categorical-data, mixed-model, r

I have a problem with the coding of a two-level categorical predictor variable in R, and with subsequently using it as a random slope in glmer().

I can keep the variable as numeric, using 0/1 treatment coding:

> unique (b$multi)
[1] 0 1

Running glmer() on the dataset coded this way yields:

> l1 = glmer(OK ~ multi + (0 + multi|item) + (1|subject)+ (1|item), family="binomial", data=b)
> summary(l1)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: OK ~ multi + (0 + multi | item) + (1 | subject) + (1 | item)
   Data: b

     AIC      BIC   logLik deviance df.resid 
  4806.5   4838.9  -2398.3   4796.5     4792 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-7.8294 -0.5560 -0.1548  0.5623 14.3342 

Random effects:
 Groups  Name        Variance Std.Dev.
 subject (Intercept) 1.84379  1.3579  
 item    (Intercept) 2.40306  1.5502  
 item.1  multi       0.04145  0.2036  
Number of obs: 4797, groups:  subject, 123; item, 39
[...]

Above, there is only one random slope associated with multi. However, something very different happens when I convert the variable to a factor:

> b$multi = as.factor(b$multi)
> levels (b$multi)
[1] "0" "1"

When I fit a model using multi as a random slope variable:

l2 = glmer(OK ~ multi + (0+multi|item) + (1|subject)+ (1|item), family="binomial", data=b)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues

… the model fails to converge and I get a very different random effects structure:

> summary(l2)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: OK ~ multi + (0 + multi | item) + (1 | subject) + (1 | item)
   Data: b

     AIC      BIC   logLik deviance df.resid 
  4807.8   4853.1  -2396.9   4793.8     4790 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-8.3636 -0.5608 -0.1540  0.5627 15.2515 

Random effects:
 Groups  Name        Variance Std.Dev. Corr
 subject (Intercept) 1.8375   1.3555       
 item    (Intercept) 0.9659   0.9828       
 item.1  multi0      1.5973   1.2638       
         multi1      1.0224   1.0111   1.00
Number of obs: 4797, groups:  subject, 123; item, 39
[...]

The number of parameters in the model clearly changes (reflected in the AIC and the residual degrees of freedom), and I get two random slopes.

My question is: which way of coding the categorical variable is better? Intuition tells me it is the first, but I have seen both recommended in various tutorials and classes on fitting GLMMs in R, which is why this baffles me. Both versions of the predictor behave identically in ordinary regression with lm().
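(For reference, that last point is easy to verify; here is a minimal stand-alone sketch with a made-up toy data frame, since the real b is not shown:)

set.seed(1)
d <- data.frame(y = rnorm(20), x_num = rep(0:1, 10))
d$x_fac <- factor(d$x_num)

# Treatment coding of a two-level factor produces the same 0/1 design
# column as the numeric variable, so the two fits are identical:
coef(lm(y ~ x_num, data = d))
coef(lm(y ~ x_fac, data = d))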

Best Answer

The difference arises because you're separating the intercept and the slope in the random-effects part of the formula. That's an odd thing to do; the usual way to fit this model would be

OK ~ multi + (multi | item) + (1 | subject)

with multi being a factor.
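In code, a minimal sketch of that specification (assuming the question's data frame b, with columns OK, multi, item, and subject):

library(lme4)

b$multi <- as.factor(b$multi)   # make sure multi is a factor
l3 <- glmer(OK ~ multi + (multi | item) + (1 | subject),
            family = binomial, data = b)
summary(l3)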

What happens is that in the first model you get what you expect: the (0 + multi | item) term contributes one variance parameter and the (1 | item) term contributes another. In the second model, however, the (0 + multi | item) term produces an indicator column for each level of the factor, giving two variance parameters (one per condition) plus their correlation, which is why df.resid drops from 4792 to 4790. If you take the (1 | item) term out of that model, you should get a result equivalent to the model I give above (the two are just reparameterizations of the same bivariate item effect), and very close to your first model, which in addition forces the item intercept and slope to be uncorrelated.
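You can see where the extra parameters come from by building the design columns that each coding generates for the (0 + multi | item) term; a small stand-alone illustration (x here is a stand-in for multi):

x_num <- rep(0:1, 3)
x_fac <- factor(x_num)

model.matrix(~ 0 + x_num)   # one column: a single random slope
model.matrix(~ 0 + x_fac)   # two columns: one indicator per level, hence
                            # two variances plus a correlation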

Note also the correlation of exactly one in your second model; this is a clue that you've overparameterized it and that one of those parameters is not necessary.
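To check this, a sketch under the same assumptions about b: drop (1 | item) from the factor-coded model and compare log-likelihoods with the earlier fits.

l4 <- glmer(OK ~ multi + (0 + multi | item) + (1 | subject),
            family = binomial, data = b)
logLik(l4)   # should essentially match logLik(l3) from the sketch above
logLik(l1)   # and be close to the first model, which forces the
             # intercept-slope correlation to zero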
