I am running a GLMM with a binomial response variable and a categorical predictor. The random effect is given by the nested design used for data collection. The data look like this:
m.gen1$treatment
[1] sucrose control protein control no_injection .....
Levels: no_injection control sucrose protein
m.gen1$emergence
[1] 1 0 0 1 0 1 1 1 1 1 1 0 0....
m.gen1$nest
[1] 1 1 1 2 2 3 3 3 3 4 4 4 .....
Levels: 1 2 3 4 5 6 8 10 11 13 15 16 17 18 20 22 24
The first model I ran looks like this:
m.glmm.em <- glmer(emergence ~ treatment + (1|nest), family = binomial, data = m.gen1)
I get two warnings that look like this:
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0240654 (tol = 0.001, component 4)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
The model summary shows that one of the treatments has an unusually large standard error, which you can see here:
Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)         2.565      1.038   2.472   0.0134 *
treatmentcontrol   -1.718      1.246  -1.378   0.1681
treatmentsucrose   16.863   2048.000   0.008   0.9934
treatmentprotein   -1.718      1.246  -1.378   0.1681
I tried the different optimizers available through glmerControl, as well as functions from other packages, and I get similar output. I have also run the model using glm, ignoring the random effect, and the problem persists. While exploring the data I realized that the treatment with the huge standard error has only successes in the response variable. Just to check whether that could be causing the problem, I added a fake data point with a "failure" for that treatment, and the model runs smoothly and gives reasonable standard errors, as you can see here (a sketch of the calls I used for these checks appears after this summary):
Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)        3.4090     1.6712   2.040   0.0414 *
treatmentcontrol  -1.8405     1.4290  -1.288   0.1978
treatmentsucrose  -0.2582     1.6263  -0.159   0.8738
treatmentprotein  -2.6530     1.5904  -1.668   0.0953 .
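For reference, here is the kind of call I used for these checks (a sketch: the optimizer switch and the cross-tabulation are standard lme4 / base R usage, but "bobyqa" is just one example optimizer):

library(lme4)

# retry the fit with an alternative optimizer via glmerControl
m.glmm.em2 <- glmer(emergence ~ treatment + (1|nest),
                    family = binomial, data = m.gen1,
                    control = glmerControl(optimizer = "bobyqa"))

# cross-tabulate the response by treatment: a zero cell in the
# failure column signals complete separation for that treatment
with(m.gen1, table(treatment, emergence))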
I was wondering if my intuition is right that the lack of failures for that treatment prevents good estimation, and how I can work around this issue.
Thanks in advance!
Best Answer
Your intuition is exactly right. This phenomenon is called complete separation. You can find quite a lot (now that you know its name) Googling around ... It is fairly thoroughly discussed here in a general context, and here in the context of GLMMs. The standard solution to this problem is to add a small term that pushes the parameters back toward zero -- in frequentist contexts this is called a penalized or bias-corrected method. The standard algorithm is due to Firth (1993, "Bias reduction of maximum likelihood estimates" Biometrika 80, 27-38), and is implemented in the logistf package on CRAN. In Bayesian contexts this is framed as adding a weak prior to the fixed-effect parameters.
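For the non-mixed version of your model (you note the problem persists even with glm), a Firth fit would look something like this (a sketch: logistf fits fixed effects only, so the nest random effect is dropped):

library(logistf)  # Firth's penalized logistic regression
m.firth <- logistf(emergence ~ treatment, data = m.gen1)
summary(m.firth)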
To my knowledge Firth's algorithm hasn't been extended to GLMMs, but you can use the Bayesian trick via the blme package, which puts a thin Bayesian layer over the top of the lme4 package.
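Here's an example in the spirit of the one in the above-linked GLMM discussion, written for the model in this question (a sketch: bglmer and its fixef.prior argument are real blme syntax, but the variable names come from the question rather than from the linked discussion):

library(blme)  # bglmer(): glmer() plus optional priors
m.blme <- bglmer(emergence ~ treatment + (1|nest),
                 family = binomial, data = m.gen1,
                 fixef.prior = normal(cov = diag(9, 4)))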
The first two lines of the bglmer call are exactly the same as we would use in the standard glmer model; the last specifies that the prior for the fixed effects is a multivariate normal distribution with a diagonal variance-covariance matrix. The matrix is 4x4 (because we have 4 fixed-effect parameters in this example), and the prior variance of each parameter is 9, corresponding to a standard deviation of 3. That is a pretty weak prior: +/- 2 SD spans (-6, 6), a very large range on the logit scale.
The very large standard errors of the parameters in your example are an instance of a closely related phenomenon (it occurs whenever we get extreme parameter values in a logistic model) called the Hauck-Donner effect.
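One practical consequence of the Hauck-Donner effect is that the Wald z-tests in the summary table become unreliable, so a likelihood-ratio comparison of nested models is a safer check on the treatment effect. A sketch, assuming the m.blme fit above (update() and anova() are the standard lme4-style calls; I'd expect them to work the same way on bglmer fits):

m.blme0 <- update(m.blme, . ~ . - treatment)  # refit without treatment
anova(m.blme0, m.blme)                        # likelihood-ratio test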
Two more potentially useful references (I haven't dug into them yet myself) can be found via a more recent Google Scholar search for "bglmer 'complete separation'".