Solved – lmer model simplification

lme4-nlme, model, r

I am trying to simplify a model of how different factors affect distance. I have snails kept in several habitats, and I want to see whether the habitat affects how closely another snail follows a given snail. I start with this model:

  library(lme4)
  model1 <- lmer(sqrt(dist + 6) ~ (1 | snail) + food + stress + food:stress +
                   weight + OriginalL + FollowedL)
  summary(model1)

and the summary is this:

  Linear mixed model fit by REML ['lmerMod']
  Formula: sqrt(dist + 6) ~ (1 | snail) + food + stress + food:stress +  
weight + OriginalL + FollowedL

REML criterion at convergence: 561.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.2941 -0.7698 -0.3347  0.7515  1.9564 

Random effects:
 Groups   Name        Variance Std.Dev.
 snail    (Intercept) 0.000    0.000   
 Residual             2.334    1.528   
Number of obs: 148, groups:  snail, 37

Fixed effects:
                               Estimate Std. Error t value
(Intercept)                    4.960927   0.662947   7.483
foodSweetPotato               -0.219039   0.357768  -0.612
stressshelter                 -0.246649   0.355999  -0.693
weight                         0.002520   0.063259   0.040
OriginalL                      0.015549   0.013072   1.189
FollowedL                     -0.008044   0.005972  -1.347
foodSweetPotato:stressshelter -0.300143   0.503215  -0.596

Correlation of Fixed Effects:
            (Intr) fdSwtP strsss weight OrgnlL FllwdL
foodSwetPtt -0.309                                   
stressshltr -0.315  0.502                            
weight      -0.615  0.008  0.009                     
OriginalL   -0.617 -0.021  0.032  0.123              
FollowedL   -0.470  0.118  0.059  0.087 -0.004       
fdSwtPtt:st  0.230 -0.707 -0.708 -0.008 -0.024 -0.055

Should I remove the least significant term first, or remove the interaction first?

And after this, is it a simple anova between my first model and the most simplified model?

Best Answer

A very short answer:

- These questions aren't really specific to mixed models; they apply generally to simplification/model selection for any form of linear model or related framework.
- In general, it doesn't make sense to worry about inference on, or selection of, main effects when the model contains interactions involving those main effects; this is called the principle of marginality (sorry, that Wikipedia page is a mess, but it gives you a little more information ...). So the narrow-sense answer to your question is to always consider removing interactions first and, as a corollary, never to consider removing a main effect while an interaction that involves it is retained in the model.
- Stepwise model selection, while still very popular, has some major problems; you should consider whether you really want to drop terms from your model or not ... see e.g.
  - Flom, Peter L., and David L. Cassell. “Stopping Stepwise: Why Stepwise and Similar Selection Methods Are Bad, and What You Should Use.” In NorthEast SAS Users Group Inc 20th Annual Conference: 11-14th November 2007; Baltimore, Maryland, 2007. http://denversug.org/presentations/2010coday/stopsteppresntn.pdf.
  - Harrell, Frank. Regression Modeling Strategies (Springer), or see the Stata FAQ for an abbreviated version.
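As a rough sketch of the marginality point above: drop1() in lme4 only offers terms whose removal respects marginality, so while food:stress is in the model it should not list food or stress as candidates for removal. The data here are simulated stand-ins; the snails data frame, the reference levels "Control" and "control", and all effect sizes are assumptions for illustration, not the asker's data.

```r
library(lme4)

# Hypothetical data: 37 snails with 4 observations each (148 rows),
# mimicking the structure reported in the question's summary output.
set.seed(42)
n_snails <- 37
snails <- data.frame(
  snail     = factor(rep(seq_len(n_snails), each = 4)),
  food      = factor(rep(c("Control", "SweetPotato"), times = 74)),
  stress    = factor(rep(c("control", "control", "shelter", "shelter"),
                         times = n_snails)),
  weight    = rep(rnorm(n_snails, mean = 5, sd = 1), each = 4),
  OriginalL = rnorm(148, mean = 30, sd = 10),
  FollowedL = rnorm(148, mean = 60, sd = 20)
)
# Made-up response: chosen so that sqrt(dist + 6) is always defined.
snail_effect <- rep(rnorm(n_snails, sd = 0.5), each = 4)
snails$dist <- (5 - 0.3 * (snails$stress == "shelter") +
                  snail_effect + rnorm(148))^2 - 6

# Fit with ML (REML = FALSE) so that fits with different fixed effects
# are comparable by likelihood.
full_ml <- lmer(sqrt(dist + 6) ~ food * stress + weight + OriginalL +
                  FollowedL + (1 | snail),
                data = snails, REML = FALSE)

# Single-term deletions that respect marginality: the interaction and the
# covariates are candidates, the main effects food and stress are not.
marginal_drops <- drop1(full_ml, test = "Chisq")
print(marginal_drops)
```

If the interaction were dropped, rerunning drop1() on the reduced model would then list food and stress as candidates.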

I'm not sure what you mean by "is it a simple anova between my first model and most simplified model"? If you want to do inference on the terms in the model, you can use a likelihood ratio test (implemented via anova() in R), or an F test, or ...
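A minimal sketch of the likelihood ratio test mentioned above, comparing the model with and without the food:stress interaction. The data are again simulated placeholders (the snails data frame, its factor levels, and its effect sizes are assumptions, not the asker's data), so only the mechanics carry over.

```r
library(lme4)

# Hypothetical data with the same variable names as the question.
set.seed(1)
n_snails <- 37
snails <- data.frame(
  snail     = factor(rep(seq_len(n_snails), each = 4)),
  food      = factor(rep(c("Control", "SweetPotato"), times = 74)),
  stress    = factor(rep(c("control", "control", "shelter", "shelter"),
                         times = n_snails)),
  weight    = rep(rnorm(n_snails, mean = 5, sd = 1), each = 4),
  OriginalL = rnorm(148, mean = 30, sd = 10),
  FollowedL = rnorm(148, mean = 60, sd = 20)
)
snails$dist <- (5 + rep(rnorm(n_snails, sd = 0.5), each = 4) +
                  rnorm(148))^2 - 6

# Full model, fitted by REML as in the question.
full <- lmer(sqrt(dist + 6) ~ food + stress + food:stress + weight +
               OriginalL + FollowedL + (1 | snail),
             data = snails)

# Same model without the interaction (both main effects are kept).
reduced <- lmer(sqrt(dist + 6) ~ food + stress + weight +
                  OriginalL + FollowedL + (1 | snail),
                data = snails)

# anova() on lmer fits refits both models with ML by default before
# computing the likelihood ratio test, because REML criteria are not
# comparable across models with different fixed effects.
lrt <- anova(reduced, full)
print(lrt)
```

Comparing the first model directly with the most simplified one in the same way would give a single joint test of all the dropped terms at once, rather than a test of each term.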
