Solved – Modeling reaction time with glmer

Tags: glmm, lme4-nlme, mixed-model, r

According to Lo and Andrews, 2015 (https://doi.org/10.3389/fpsyg.2015.01171), raw reaction time (RT) should be analyzed with a GLMM, instead of analyzing transformed values with an LMM or even ANOVA. They and others point out that a Gamma or inverse Gaussian distribution with an identity link function should be used.

My data is from RTs before and after sleep (~2200h and ~0730h). Participants (22) were asked to press a button every time a cross was displayed on the screen. This was measured multiple times for each participant, at each timepoint, and in two conditions (placebo, and an intervention in which participants received a stimulus that affects their sleep pattern).

head(data) (RTs trimmed to >100 ms and <2000 ms):

    subjectNumber expDay age   bmi weight height treatment waiting reaction timep
                2     N1  24 22.53     73    180   Control    6026      588 Before sleep
                2     N1  24 22.53     73    180   Control    4470      326 Before sleep
                2     N1  24 22.53     73    180   Control    2334      336 Before sleep
                2     N1  24 22.53     73    180   Control    6005      289 Before sleep
                2     N1  24 22.53     73    180   Control    4636      318 Before sleep
                2     N1  24 22.53     73    180   Control    3515      315 Before sleep

I'm interested in knowing if my intervention improved the reaction time.
It's possible that people perform better in the morning regardless of any intervention (they are less tired), so the model should take this into account.

An interaction term (intervention*timepoint) is needed and will indicate whether the intervention had any effect after sleep, given that before sleep there should be no difference.

So far my model looks like this:

glmer(reaction ~ treatment * timep + (1|subjectNumber), data=., family = inverse.gaussian(link = "identity"))

summary():

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: inverse.gaussian  ( identity )
Formula: reaction ~ treatment * timep + (1 | subjectNumber)
   Data: .

     AIC      BIC   logLik deviance df.resid 
 33283.9  33320.1 -16636.0  33271.9     3064 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1321 -0.4575 -0.1539  0.2122 17.8238 

Random effects:
 Groups        Name        Variance  Std.Dev.
 subjectNumber (Intercept) 4.071e+02 20.17610
 Residual                  1.404e-04  0.01185
Number of obs: 3070, groups:  subjectNumber, 20

Fixed effects:
                                      Estimate Std. Error t value Pr(>|z|)    
(Intercept)                            341.394     12.128  28.150  < 2e-16 ***
treatmentIntervention                    4.614      2.709   1.703  0.08860 .  
timepAfter sleep                         7.745      2.763   2.803  0.00507 ** 
treatmentIntervention:timepAfter sleep  -3.636      3.895  -0.933  0.35058    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) trtmnI tmpAfs
trtmntIntrv -0.089              
tmpAftrslep -0.083  0.467       
trtmntIn:As  0.048 -0.685 -0.701

Residuals vs fitted

The AIC is ridiculously high compared to other RT models I've seen (in the range of 100–300), so given my limited experience with these models I'm unsure how good the fit is.

Is it correct to say, according to the above, that the treatment has no effect on RT, but that sleep in general improves RT regardless of treatment? I'm afraid this sleep effect may confound things in a way that makes this design invalid in the first place.

Best Answer

The high AIC value is not a problem in itself: the AIC is a measure of the relative quality of a model; it says something about how good the fit of a model is, but only with respect to another model fit on the same dataset. You cannot compare it to the AICs of other RT models if those were not fit on the same dataset, because the AIC also depends on, for example, the number of observations.

The residual plot looks fine to me: the fitted values are discretized into specific values because of the categorical nature of your predictor variables (time and treatment). More specifically, because your predictors are 2 categorical variables with 2 levels each, the model can only predict a different value for each combination of the two variables, hence it will output 4 different values for each subject. It would be different if you also had a continuous variable as a predictor, e.g. the number of hours slept.
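You can see this discretization directly by predicting over the four design cells. A minimal sketch, assuming your fitted model is stored in a variable `m` (a name not used in the question) and that `treatment` and `timep` are factors with the levels shown in your data:

```r
library(lme4)

# All four combinations of the two 2-level predictors
newdat <- expand.grid(
  treatment = c("Control", "Intervention"),
  timep     = c("Before sleep", "After sleep")
)

# re.form = NA ignores the random effects, giving the population-level
# prediction: exactly one fitted value per treatment x timep cell
newdat$pred <- predict(m, newdata = newdat, re.form = NA)
newdat
```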

The only issue with the residual plot may be that the distribution of residuals seems slightly skewed toward positive values. This is because reaction time distributions are almost always skewed, and it could in principle be fixed with a parametric transformation of the dependent variable, such as the Box-Cox power transform. However, since you prefer to avoid transformations, you may also want to try fitting the data with a Gamma family, to see whether the distribution of residuals improves. You can choose the preferred model (Gamma vs. inverse Gaussian) based on the AIC, as the two models are fit on the same dataset.
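A sketch of that comparison, assuming `data` is the trimmed data frame from the question:

```r
library(lme4)

# Original model: inverse Gaussian with identity link
m_invgauss <- glmer(reaction ~ treatment * timep + (1 | subjectNumber),
                    data = data,
                    family = inverse.gaussian(link = "identity"))

# Alternative: Gamma with identity link
m_gamma <- glmer(reaction ~ treatment * timep + (1 | subjectNumber),
                 data = data,
                 family = Gamma(link = "identity"))

# Both are fit to the same dataset, so their AICs are directly comparable;
# the model with the lower AIC is preferred
AIC(m_invgauss, m_gamma)
```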

Contrary to the previous answer, I think it can make sense to include the interaction term treatment:timep in the model (or at least to test it, and remove it afterward if it does not improve the fit). Without the interaction, the model can only predict a purely additive combination of the effects of sleep and treatment: in other words, the model assumes that the effect of treatment is the same regardless of whether the improvement in reaction times due to sleep was large or small. This assumption, however, may not be true: for example, the effect of treatment may be larger for those subjects who show a smaller benefit from sleep alone (this would make sense if you consider that there may be a ceiling effect in the reaction time benefit, i.e. reaction times cannot improve by more than a certain amount). The interaction term treatment:timep would allow you to model such effects.
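Testing the interaction can be sketched as a comparison of nested models, again assuming `data` is your trimmed data frame:

```r
library(lme4)

# Additive model: no interaction between treatment and timepoint
m_add <- glmer(reaction ~ treatment + timep + (1 | subjectNumber),
               data = data,
               family = inverse.gaussian(link = "identity"))

# Full model: includes the treatment:timep interaction
m_int <- glmer(reaction ~ treatment * timep + (1 | subjectNumber),
               data = data,
               family = inverse.gaussian(link = "identity"))

# anova() on nested glmer fits performs a likelihood-ratio test;
# a non-significant result suggests the interaction can be dropped
anova(m_add, m_int)
```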

One last comment: your model only includes random intercepts for each subject. This means that while the marginal mean of the RTs is allowed to vary across participants, the effects of sleep or treatment are not. If you suspect that these effects may vary considerably across subjects, it may be worth trying to fit random slopes for them. For example, if you want to model individual differences in the effect of sleep as a random effect, the random-effects part of the formula would be (1+timep|subjectNumber). Of course, this will increase the complexity of the model and the number of parameters estimated, and may result in issues during estimation (e.g. convergence warnings). You could start from the 'maximal' model, with all possible random effects included (the formula would be (treatment * timep | subjectNumber)), and then proceed by simplifying it if the estimation does not converge. For some advice on how to choose the right level of complexity given the data, I suggest you check this article by Bates & colleagues.
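The random-slope variants described above can be sketched as follows (same assumptions about `data`; model names are my own):

```r
library(lme4)

# By-subject random intercept plus a random slope for the sleep effect
m_slope <- glmer(reaction ~ treatment * timep +
                   (1 + timep | subjectNumber),
                 data = data,
                 family = inverse.gaussian(link = "identity"))

# 'Maximal' random-effects structure; simplify this formula step by step
# (e.g. drop the interaction, then one slope at a time) if it fails to converge
m_max <- glmer(reaction ~ treatment * timep +
                 (1 + treatment * timep | subjectNumber),
               data = data,
               family = inverse.gaussian(link = "identity"))
```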