Solved – Singular fit with simplest random structure in glmer (lme4)

glmm · lme4-nlme · mixed model

I am trying to fit mixed models (logistic regression) to a data frame with the glmer function from lme4, but I always receive this message: "boundary (singular) fit: see ?isSingular"

Even if I create a model with just an intercept and the simplest random part (a random intercept for one factor), the variance for this random factor is 0.

 Family: binomial  ( logit )
Formula: PointGagneparleServeur ~ 1 + (1 | Tour)
   Data: DataModel_Logit_allRF_AusOpen
      AIC       BIC    logLik  deviance  df.resid 
 480.7822  488.5765 -238.3911  476.7822       362 
Random effects:
 Groups Name        Std.Dev.
 Tour   (Intercept) 0       
Number of obs: 364, groups:  Tour, 6
Fixed Effects:
(Intercept)  
     0.5639  
convergence code 0; 1 optimizer warnings; 0 lme4 warnings 

Yet I have observations for every level of the factor:

table(DataModel_Logit_allRF_AusOpen$PointGagneparleServeur,DataModel_Logit_allRF_AusOpen$Tour)

    1erTour 2emeTour 3emeTour 8eme Quart Demi
  0      26       24       12   35    20   15
  1      40       36       37   59    32   28

and the dependent variable PointGagneparleServeur is actually numeric.

(FYI, I recently "upgraded" my OS to Catalina 10.15 and have been experiencing several bugs with other (non-programming) software since, so I am mentioning it just in case it could play a role…)

Does anyone have an idea why I have this issue?

Best Answer

Indeed, your data do not support the hypothesis that there is significant variation in the outcome between the levels of the grouping factor.

library(lme4)
library(tidyverse)

dat <- data.frame(f = rep(letters[1:6], c(26 + 40, 24 + 36, 12 + 37, 35 + 59, 20 + 32, 15 + 28)),
                  y = rep(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1), c(26, 40, 24, 36, 12, 37, 35, 59, 20, 32, 15, 28)))
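As a quick sanity check (not part of the original answer), cross-tabulating the reconstructed data should reproduce the counts from the table in the question:

```r
# Rebuild the data exactly as above, then cross-tabulate outcome by group.
dat <- data.frame(f = rep(letters[1:6], c(26 + 40, 24 + 36, 12 + 37, 35 + 59, 20 + 32, 15 + 28)),
                  y = rep(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
                          c(26, 40, 24, 36, 12, 37, 35, 59, 20, 32, 15, 28)))
table(dat$y, dat$f)
#      a  b  c  d  e  f
#   0 26 24 12 35 20 15
#   1 40 36 37 59 32 28
```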

fit <- glmer(y ~ 1 + (1 | f), family = binomial, data = dat)

summary(fit)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: y ~ 1 + (1 | f)
   Data: dat

     AIC      BIC   logLik deviance df.resid 
   480.8    488.6   -238.4    476.8      362 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.3257 -1.3257  0.7543  0.7543  0.7543 

Random effects:
 Groups Name        Variance Std.Dev.
 f      (Intercept) 0        0       
Number of obs: 364, groups:  f, 6

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   0.5639     0.1090   5.173 2.31e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
convergence code: 0
boundary (singular) fit: see ?isSingular
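The singular fit can also be confirmed programmatically: lme4's isSingular() tests whether any variance-component estimate sits on the boundary of its parameter space.

```r
# isSingular() returns TRUE when a variance component is estimated at (or
# numerically indistinguishable from) zero, as with the random intercept here.
isSingular(fit)   # TRUE for this model
VarCorr(fit)      # shows the zero standard-deviation estimate for f
```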

This is because you don't have enough data to conclude that the groups significantly differ with respect to the outcome measure:

> dat %>% group_by(f) %>% summarize(m = mean(y))
# A tibble: 6 x 2
  f         m
  <fct> <dbl>
1 a     0.606
2 b     0.6  
3 c     0.755
4 d     0.628
5 e     0.615
6 f     0.651

Notice all the means are fairly close together. Given the total number of observations, the number of observations in each group, and, most critically, the small number of groups (only 6), it is hard to establish that any between-group variation exists. I recommend looking into rstanarm or brms to fit the analogous model from a fully Bayesian point of view. (However, this is my opinion, and there is by no means consensus in the statistical community on how to deal with this problem; see ?isSingular.)

library(rstanarm)
fit2 <- stan_glmer(y ~ 1 + (1 | f), family = binomial, data = dat)
fit2
stan_glmer
 family:       binomial [logit]
 formula:      y ~ 1 + (1 | f)
 observations: 364
------
            Median MAD_SD
(Intercept) 0.6    0.1   

Error terms:
 Groups Name        Std.Dev.
 f      (Intercept) 0.24    
Num. levels: f 6 

------
* For help interpreting the printed output see ?print.stanreg
* For info on the priors used see ?prior_summary.stanreg
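Another option listed in ?isSingular, as a middle ground between plain glmer and a fully Bayesian fit (my suggestion, not part of the original answer), is the blme package: its bglmer() keeps the lme4 machinery but adds a mildly informative prior on the covariance parameters, which typically pulls the variance estimate off the boundary.

```r
# Sketch: same model as above, refit with blme's default covariance prior.
# Assumes the `dat` data frame constructed earlier in the answer.
library(blme)
fit3 <- bglmer(y ~ 1 + (1 | f), family = binomial, data = dat)
summary(fit3)  # the random-intercept variance is now estimated away from 0
```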