I am trying to run mixed models (logistic regression) on a dataframe with the glmer function from lme4 but I always receive this message: "boundary (singular) fit: see ?isSingular"
Even if I create a model with just an intercept and the simplest random part (a random intercept for one factor), the estimated variance for this random factor is 0.
Family: binomial ( logit )
Formula: PointGagneparleServeur ~ 1 + (1 | Tour)
Data: DataModel_Logit_allRF_AusOpen
      AIC       BIC    logLik  deviance  df.resid
 480.7822  488.5765 -238.3911  476.7822       362
Random effects:
 Groups Name        Std.Dev.
 Tour   (Intercept) 0
Number of obs: 364, groups: Tour, 6
Fixed Effects:
(Intercept)
0.5639
convergence code 0; 1 optimizer warnings; 0 lme4 warnings
Yet I do have observations for every level of the factor:
table(DataModel_Logit_allRF_AusOpen$PointGagneparleServeur,DataModel_Logit_allRF_AusOpen$Tour)
    1erTour 2emeTour 3emeTour 8eme Quart Demi
  0      26       24       12   35    20   15
  1      40       36       37   59    32   28
The dependent variable PointGagneparleServeur is numeric, coded 0/1.
(FYI, I recently "upgraded" my OS to macOS Catalina 10.15 and have experienced several bugs with other (non-programming) software since. I am mentioning it just in case it could play a role…)
Does anyone have an idea why I have this issue?
Best Answer
The singular fit means that the variance of the Tour random intercept is estimated as zero: your data do not support the hypothesis that there is significant variation in the outcome between the levels of the grouping factor.
This is because you don't have enough data to conclude that the groups differ significantly with respect to the outcome measure.
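The per-round proportions can be recomputed from the table in the question. The following is a minimal base R sketch, assuming the counts are typed in by hand; the object names `counts`, `prop_won` and `pooled` are illustrative and not from the original post.

```r
# Counts copied from the table in the question
# (rows: 0 = point lost by the server, 1 = point won by the server).
rounds <- c("1erTour", "2emeTour", "3emeTour", "8eme", "Quart", "Demi")
counts <- rbind(
  "0" = c(26, 24, 12, 35, 20, 15),
  "1" = c(40, 36, 37, 59, 32, 28)
)
colnames(counts) <- rounds

# Proportion of points won by the server in each round.
prop_won <- counts["1", ] / colSums(counts)
print(round(prop_won, 3))

# Pooled proportion over all 364 points, and its logit.
# With the random-intercept variance estimated as 0, the model
# reduces to this single pooled intercept.
pooled <- sum(counts["1", ]) / sum(counts)
print(round(qlogis(pooled), 4))
```

The last value agrees with the fixed intercept of 0.5639 in the question's output, which is what a random-intercept variance of zero implies.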
Notice that all the group means are fairly close together: from the table in the question, the proportion of points won by the server ranges from 0.60 (2emeTour) to 0.76 (3emeTour). Given the total number of observations, the number of observations in each group and, most critically, the number of groups (only 6), it is hard to say that between-group variation exists. I recommend looking into rstanarm or brms to fit the analogous model from a fully Bayesian point of view. (However, this is my opinion, and there is by no means consensus in the statistical community on how to deal with this problem; see ?isSingular.)
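As a rough illustration of the Bayesian route, the sketch below rebuilds a one-row-per-point data frame from the counts in the question and fits the same formula with `rstanarm::stan_glmer`. It is only a sketch under assumptions: the data frame `d` is a reconstruction from the table, not the original DataModel_Logit_allRF_AusOpen; the priors are left at the package defaults; and the seed is arbitrary. The fit is skipped if rstanarm is not installed.

```r
# Rebuild one row per point from the counts in the question's table.
rounds <- c("1erTour", "2emeTour", "3emeTour", "8eme", "Quart", "Demi")
lost <- c(26, 24, 12, 35, 20, 15)  # outcome 0
won  <- c(40, 36, 37, 59, 32, 28)  # outcome 1
d <- data.frame(
  Tour = factor(rep(rep(rounds, times = 2), times = c(lost, won)),
                levels = rounds),
  PointGagneparleServeur = rep(c(0, 1), times = c(sum(lost), sum(won)))
)

# Same model as in the question, fitted by MCMC with default priors.
if (requireNamespace("rstanarm", quietly = TRUE)) {
  fit_bayes <- rstanarm::stan_glmer(
    PointGagneparleServeur ~ 1 + (1 | Tour),
    data = d,
    family = binomial(link = "logit"),
    seed = 1,     # arbitrary, for reproducibility only
    refresh = 0   # suppress sampler progress output
  )
  print(fit_bayes)
}
```

In a Bayesian fit the prior keeps the group-level standard deviation away from exactly zero, so the posterior for it is worth inspecting rather than a single point estimate. With brms, `brm()` accepts the same formula with `family = bernoulli()`.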