R Model Selection – How to Perform Model Selection in Glmer

generalized linear model, mixed model, model selection

I'm having trouble performing model selection for a glmer model in R. I'm using the package lme4 with the following model:

    glo_mo <- glmer(aban ~ year + hab + wlv + gra + cov + (1|lodge), 
              data = aban, family='binomial',
              na.action = na.omit)

    str(aban)
    Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':    67 obs. of  9 variables:
     $ lodge  : chr  "2" "52" "34" "39" ...
     $ year   : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
     $ hab    : chr  "for" "for" "for" "for" ...
     $ wlv    : num  7 1 NA NA 4 NA NA -4 44 NA ...
     $ dlv    : num  5 NA NA NA 7 NA NA 2 4 NA ...
     $ gra    : num  3 0 0 0 3 NA 0 8 5 4 ...
     $ cov    : num  3.92 16.46 1.78 1.25 2.48 ...
     $ for_str: num  4.4 4.06 3.65 5.54 4.14 5.69 8.61 5.84 6.23 4.36 ...
     $ aban   : Factor w/ 2 levels "0","1": 1 2 1 2 1 2 2 2 1 2 ...

When I run the model:

    summary(glo_mo)
    Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
     Family: binomial  ( logit )
    Formula: aban ~ year + hab + wlv + gra + cov + (1 | lodge)
       Data: aban

         AIC      BIC   logLik deviance df.resid 
        76.4     89.7    -31.2     62.4       42 

    Scaled residuals: 
        Min      1Q  Median      3Q     Max 
    -1.7283 -1.1100  0.5375  0.7449  1.4179 

    Random effects:
     Groups Name        Variance Std.Dev.
     lodge  (Intercept) 0.09585  0.3096  
    Number of obs: 49, groups:  lodge, 32

    Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept) -0.360995   0.824027  -0.438    0.661
    year2        0.605911   0.650404   0.932    0.352
    habstep     -0.340842   0.926110  -0.368    0.713
    wlv          0.005414   0.009677   0.559    0.576
    gra          0.032089   0.086737   0.370    0.711
    cov          0.023428   0.022942   1.021    0.307

    Correlation of Fixed Effects:
            (Intr) year2  habstp wlv    gra   
    year2   -0.239                            
    habstep -0.470  0.033                     
    wlv     -0.127 -0.051 -0.155              
    gra     -0.666 -0.130  0.411  0.313       
    cov     -0.130 -0.074 -0.647  0.185 -0.170

Then I tried to standardize the model and use the function dredge to select the best models automatically, but the dredge step did not work. I get the following error:

    stad <- standardize(glo_mo, standardize.y=F)
    options(na.action = "na.fail")
    mset <- dredge(stad)

    Error in dredge(glo_mo) : 'global.model' uses 'na.action' = "na.omit"

That stops me from continuing with model selection. Based on the steps above, and with the aim of selecting the best models:

  1. What is wrong in my script?

  2. Is AIC the only criterion for selecting the best models? Do I have to run each of the model combinations to select the best one, or can I use a function such as dredge or step to do that?

  3. What other options are there for selecting the best models in glmer with lme4 (or other recommended packages)?

Best Answer

You have too few observations to include that many predictors in your initial model. Also, note that for binary data the effective sample size is determined by the minimum of the frequencies of the zeros and the ones. Hence, you have very little information in your data to obtain any meaningfully stable results.
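To make the effective-sample-size point concrete, here is a minimal base-R sketch. The posted summary reports 49 observations actually used in the fit, so the smaller outcome class can contain at most 24 of them, however the zeros and ones split. The data frame `toy` below is simulated and purely hypothetical (the real outcome counts are not shown in the question); only the arithmetic is the point.

```r
# Hypothetical stand-in data: 67 rows, with 18 missing values in one
# predictor so that 49 complete rows remain, as in the posted summary.
set.seed(42)
n <- 67
toy <- data.frame(
  aban = factor(rbinom(n, 1, 0.6), levels = c(0, 1)),
  wlv  = replace(rnorm(n), sample(n, 18), NA)
)

# Rows the model would actually use once incomplete rows are dropped
used <- toy[complete.cases(toy), ]

# Effective sample size for a binary outcome: the rarer class count
counts <- table(used$aban)
eff_n  <- min(counts)

# The posted model has 5 fixed-effect slopes (year2, habstep, wlv, gra, cov)
n_slopes <- 5

cat("Rows used:", nrow(used), "\n")
cat("Effective sample size:", eff_n, "\n")
cat("Effective observations per slope:", eff_n / n_slopes, "\n")
```

Running the same three lines (`complete.cases`, `table`, `min`) on the real data, restricted to the variables in the model formula, gives the number to compare against the count of coefficients being estimated.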

Finally, as noted in the comments by EdM, model selection, especially with such a small sample size, can be very dangerous. It would be best to just report the results from your full model.
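On the error message itself: dredge (from the MuMIn package) refuses a global model fitted with `na.action = na.omit`, because sub-models could otherwise end up fitted to different subsets of rows. Setting `options(na.action = "na.fail")` afterwards does not help, since the `glmer` call passes `na.action = na.omit` explicitly. A minimal sketch of one way around this, assuming you restrict the data to complete cases first and refit with `na.action = na.fail`. The data frame `aban_df` is simulated and hypothetical (it only mimics the posted column names), and given the sample-size concerns above any resulting ranking should be treated as exploratory at most.

```r
library(lme4)   # glmer()
library(MuMIn)  # dredge()

# Hypothetical stand-in for the asker's data (simulated values only)
set.seed(1)
n <- 67
aban_df <- data.frame(
  lodge = as.character(sample(1:32, n, replace = TRUE)),
  year  = factor(sample(1:2, n, replace = TRUE)),
  hab   = sample(c("for", "step"), n, replace = TRUE),
  wlv   = replace(rnorm(n, 5, 10), sample(n, 15), NA),
  gra   = replace(rpois(n, 3), sample(n, 5), NA),
  cov   = runif(n, 0, 20),
  aban  = factor(rbinom(n, 1, 0.5))
)

# Keep only the variables in the model and drop incomplete rows up front,
# so every sub-model is fitted to exactly the same observations.
vars <- c("aban", "year", "hab", "wlv", "gra", "cov", "lodge")
cc   <- aban_df[complete.cases(aban_df[, vars]), vars]

# Refit the global model with na.action = na.fail, which dredge() accepts.
# Simulated data may trigger singular-fit or convergence warnings.
glo_cc <- glmer(aban ~ year + hab + wlv + gra + cov + (1 | lodge),
                data = cc, family = binomial, na.action = na.fail)

# Fit all sub-models of the fixed effects; ranked by AICc by default
mset <- dredge(glo_cc)
print(head(mset))
```

The same idea applies if the model is standardized first: the standardized model is refitted from the original call, so the original call is where `na.action` has to change.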
