Mixed Models – Dealing with Singular Fit in Mixed Models

glmm, lme4-nlme, overfitting, r, singular

Let's say we have a model

library(lme4)

mod <- glmer(Y ~ X * Condition + (X * Condition | subject),
             data = dat, family = binomial)  # dat = the data frame (not shown)

# Y = binary outcome, modelled on the logit scale
# X = continuous variable
# Condition = values A and B, dummy coded; the design is repeated,
#             so all participants go through both Conditions
# subject = grouping factor for the by-subject random effects

summary(mod)
Random effects:
 Groups  Name             Variance Std.Dev. Corr             
 subject (Intercept)      0.85052  0.9222                    
         X                0.08427  0.2903   -1.00            
         ConditionB       0.54367  0.7373   -0.37  0.37      
         X:ConditionB     0.14812  0.3849    0.26 -0.26 -0.56
Number of obs: 39401, groups:  subject, 219

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       2.49686    0.06909   36.14  < 2e-16 ***
X                -1.03854    0.03812  -27.24  < 2e-16 ***
ConditionB       -0.19707    0.06382   -3.09  0.00202 ** 
X:ConditionB      0.22809    0.05356    4.26 2.06e-05 ***

Here we observe a singular fit, because the correlation between the intercept and X random effects is -1. Now, according to this helpful link, one way to deal with this model is to remove the higher-order random effects (e.g., X:ConditionB) and see whether that resolves the singularity. The other is to use a Bayesian approach, e.g., the blme package, to avoid the singularity.
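For concreteness, the two options would look roughly like this (a sketch only; dat stands in for the actual data frame):

# Option 1: drop the highest-order random slope
mod_reduced <- glmer(Y ~ X * Condition + (X + Condition | subject),
                     data = dat, family = binomial)

# Option 2: keep the maximal structure, but regularise the random-effects
# covariance matrix away from the boundary via blme's default covariance prior
library(blme)
mod_bayes <- bglmer(Y ~ X * Condition + (X * Condition | subject),
                    data = dat, family = binomial)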

What is the preferred method, and why?

I am asking this because using the first or the second approach leads to different results – in the first case, I will remove the X:ConditionB random effect and won't be able to estimate the correlation between the X and X:ConditionB random effects. On the other hand, using blme allows me to keep X:ConditionB and to estimate that correlation. I see no reason why I should even use the non-Bayesian estimation and remove random effects when singular fits occur, when I can estimate everything with the Bayesian approach.

Can someone explain the benefits and problems of using either method to deal with singular fits?

Thank you.

Best Answer

When you obtain a singular fit, this often indicates that the model is overfitted – that is, the random effects structure is too complex to be supported by the data – which naturally leads to the advice to remove the most complex part of the random effects structure (usually random slopes). The benefit of this approach is that it leads to a more parsimonious model that is not overfitted.
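As a quick check, lme4 reports this condition directly; a minimal sketch, assuming your fitted model object is called mod:

isSingular(mod)  # TRUE when the fit is on (or very near) the boundary
VarCorr(mod)     # inspect the variances and the degenerate -1 correlation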

However, before doing anything, do you have a good reason for wanting X, Condition, and their interaction all to vary by subject in the first place? Does the theory of how the data are generated suggest this?

If you want to fit the model with the maximal random effects structure, and lme4 obtains a singular fit, then fitting the same model in a Bayesian framework might very well inform you why lme4 had problems: you can inspect trace plots and check how well the various parameter estimates converge. The advantage of taking the Bayesian approach is that you may uncover a problem with the original model (i.e., the reason why the maximal random effects structure isn't supported by the data), or it might uncover why lme4 is unable to fit the model. I have encountered situations where a Bayesian model does not converge well unless informative priors are used – which may or may not be OK.

In short, both approaches have merit.

However, I would always start from a parsimonious initial model, informed by expert domain knowledge, to determine the most appropriate random effects structure. Specifying grouping variables (random intercepts) is usually straightforward, but random slopes don't have to be included by default. Only include them if they make sound theoretical sense AND they are supported by the data.
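In practice, that can mean building the structure up step by step and comparing fits; a sketch, again with a hypothetical data frame dat:

# random intercepts only
mod0 <- glmer(Y ~ X * Condition + (1 | subject),
              data = dat, family = binomial)

# add a by-subject random slope for X
mod1 <- glmer(Y ~ X * Condition + (X | subject),
              data = dat, family = binomial)

anova(mod0, mod1)  # likelihood-ratio comparison of the two structures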

Edit: It is mentioned in the comments that there are sound theoretical reasons to fit the maximal random effects structure. So, a relatively easy way to proceed with an equivalent Bayesian model is to swap the call to glmer for stan_glmer from the rstanarm package – it is designed to be plug and play. It has default priors, so you can quickly get a model fitted. The package also has many tools for assessing convergence. If you find that all the parameters have converged to plausible values, then you are all good. However, there can be a number of issues – for example, a variance estimate collapsing to zero, or an estimate that continues to drift. The mc-stan.org site has a wealth of information and a user forum.
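A minimal sketch of that swap (same formula as before; dat is again hypothetical):

library(rstanarm)

mod_stan <- stan_glmer(Y ~ X * Condition + (X * Condition | subject),
                       data = dat, family = binomial)

summary(mod_stan)        # reports Rhat and effective sample sizes
plot(mod_stan, "trace")  # trace plots for assessing convergence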
