Solved – Binomial GLMM: Model validation & ceiling effect

Tags: binomial-distribution, glmm, lme4-nlme, mixed-model, validation

My data has a binary response acc (correct/incorrect), one continuous predictor score, three categorical predictors (race, sex, emotion), and a random factor subj. All predictors are within-subject.

By selecting the random effects first and then the fixed effects, I ended up with this model:
M <- glmer(acc ~ race + sex + emotion + sex:emotion + race:emotion + score + (1 + sex | subj), family = binomial, data = subset)
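For reference, the "random effects first" selection step could be sketched as a likelihood-ratio comparison of nested random-effects structures. This is a hypothetical reconstruction (the model names `m0`/`m1` are mine), assuming the same data frame:

```r
library(lme4)

# Hypothetical sketch of the random-effects selection step: compare a
# random intercept only vs. intercept + by-subject sex slope.
m0 <- glmer(acc ~ race + sex + emotion + sex:emotion + race:emotion + score
            + (1 | subj), family = binomial, data = subset)
m1 <- glmer(acc ~ race + sex + emotion + sex:emotion + race:emotion + score
            + (1 + sex | subj), family = binomial, data = subset)
anova(m0, m1)  # likelihood-ratio test on the extra variance/covariance terms
```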

I need help interpreting the validation plots, determining whether they show a "ceiling effect" in acc, and fixing any problems they reveal.


To validate the model, I get the residuals and fitted values:

 # note: these names mask base fitted() and resid(); it works, but renaming is safer
 fitted <- predict(M, type = "response")
 resid <- resid(M, type = "pearson")
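Raw residual plots are hard to read for a binary response, so it may also help to look at binned residuals (Gelman & Hill), which average the residuals within bins of the fitted values. A minimal sketch, assuming the `arm` package is installed and using the `fitted` and `resid` vectors computed above:

```r
# Binned residual plot: roughly 95% of the bin means should fall
# within the plotted confidence bands if the model is adequate.
library(arm)
binnedplot(fitted, resid)
```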

And plot the residuals against the categorical predictors:

 plot(subset$race, resid)
 plot(subset$sex, resid)
 plot(subset$emotion, resid)

All three plots show a slight pattern of more negative and more dispersed residuals in the "easy" conditions, though the effect looks mild to me (I may be wrong).

I then plot the residuals against the continuous predictor:

 plot(subset$score, resid)

[plot: Pearson residuals vs. score]

This plot of residuals against the continuous predictor is worrying: it shows a clear pattern of more negative and more dispersed residuals as score increases (i.e., as the task becomes easier).

 plot(fitted,resid) 

[plot: Pearson residuals vs. fitted values]

This plot is also worrying, showing a clear pattern of more negative and more dispersed residuals as the predicted probability of a correct answer increases (along either the y = 0 or the y = 1 band, I'm not sure which).

Apparently these patterns may simply be an artifact of the logit link function.
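This can be checked directly: even a correctly specified logistic model produces exactly this two-band, fanning residual pattern, because at each fitted probability p the Pearson residual can only take two values, (1 - p)/sqrt(p(1 - p)) for y = 1 and -p/sqrt(p(1 - p)) for y = 0. A small simulation (hypothetical data, not the questioner's):

```r
# Simulate data that follows the logistic model exactly, then look
# at the residual-vs-fitted plot: the same "worrying" pattern appears.
set.seed(1)
x <- runif(2000, -2, 4)
p <- plogis(0.5 + 1.2 * x)      # true model: logit link
y <- rbinom(2000, 1, p)
m <- glm(y ~ x, family = binomial)
plot(fitted(m), resid(m, type = "pearson"),
     xlab = "fitted probability", ylab = "Pearson residual")
# Upper band: y = 1, residual (1 - p)/sqrt(p(1 - p)) shrinks toward 0 as p -> 1.
# Lower band: y = 0, residual -p/sqrt(p(1 - p)) grows more negative as p -> 1.
```

So increasingly negative, dispersed residuals at high fitted values are expected for binary data and are not, by themselves, evidence of misfit.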

I also tried plotting a regression line through the residuals, as shown here: link.

[plot: residuals vs. fitted values, with regression line]

Supposedly, this line should be straight (roughly horizontal around zero).

Are these patterns strong enough to abandon the model? I would think not, since my plots look very much like those in the linked examples, except that, I gather, there is a general tendency to predict y = 1.

I know there is a ceiling effect in my data: some easy conditions have almost exclusively correct responses (y = 1). This is why I may be overly skeptical of my model. Are these residual patterns a symptom of the ceiling effect?

Best Answer

This looks fairly reasonable to me; I don't think inference based on this model is likely to be far off. However, to take a more constructive view, any deviation in your residuals also represents a chance to improve the model (i.e., there is further information that could be modeled).

  • Does the full model show the same deviations? That is, even though the variables you've discarded were non-significant, they might help address the (slight) pattern in the residuals.
  • You might be able to improve the model fit by modifying the link function (or, equivalently, transforming the predictor variables/linear predictor). In How to assess the fit of a binomial GLMM fitted with lme4 (> 1.0)?, where a similar pattern of residuals is discussed, I show how to construct a power-logit family of link functions that can be used for testing goodness of link and/or improving the model. (Existing goodness-of-link tests such as Pregibon's test use linearization and score tests to evaluate goodness of fit efficiently by comparing the existing fit to a family of link functions; the procedure at the linked question does the same thing in a much more brute-force way.) You might also find similar families of alternative link functions in the glmx package.
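Short of building the power-logit family, a cheap first check along these lines is to refit with a different built-in link; the complementary log-log link, for instance, is asymmetric and sometimes fits better when one outcome dominates, as with a ceiling of correct responses. A hypothetical sketch reusing the model from the question (`M_cloglog` is my name):

```r
library(lme4)

# Refit the questioner's model with an asymmetric link and compare fits.
M_cloglog <- glmer(acc ~ race + sex + emotion + sex:emotion + race:emotion
                   + score + (1 + sex | subj),
                   family = binomial(link = "cloglog"), data = subset)
AIC(M, M_cloglog)  # lower AIC suggests the better-fitting link
```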