Solved – Varying group coefficients in lme4

lme4-nlme, multilevel-analysis

All,

I am estimating a multilevel logistic regression with group predictors, but am unclear about some of the advice given by Gelman and Hill (2007) in their book. Therein, they recommend allowing every coefficient to possibly vary, given a large enough N. Does that include group predictors as well? They weren't clear, treating "varying slope" as just another complexity you can incorporate into a mixed effects model in lme4 along with group predictors (see: p. 549 in their book).

For example, I have roughly 50,000 observations with a binary response (plenty large N). Predictors exist at two levels, such that my model looks like:

M1 <- lmer(Y ~ X1 + X2 + X3 + X4 + G1 + G2 + G3 + G4 + (1 | group), family=binomial(link="logit"))

X1:X4 are individual-level predictors and G1:G4 are group-level predictors, thus: a multilevel model. Does their recommendation of treating all coefficients as potentially variable mean including even the group predictors within the random effect, such that:

M2 <- lmer(Y ~ X1 + X2 + X3 + X4 + G1 + G2 + G3 + G4 + (1 + X1 + X2 + X3 + X4 + G1 + G2 + G3 + G4 | group), family=binomial(link="logit"))

I ran M2 and it gave sensible estimates. AIC/BIC suggest a much better fit than M1. I'm just unsure whether it's appropriate since, unlike individual-level predictors, the group-level predictors do not vary within a given group. They will obviously vary across groups, though.

Further, if this is not an incorrect way to approach it, how suspicious should I be if one of the group predictors of interest is statistically insignificant as a stand-alone fixed effect (varying intercept model like M1), but is significant as a fixed effect in a varying slope model like M2?

Thanks for any input and feedback on this topic. I really appreciate it.

Best Answer

First of all, AIC/BIC do not make sense in mixed models. Can you explain what the $n$ that goes into your BIC is? The number of groups? The number of observations? Something in between? And what about level 1 and level 2 variables, which obviously carry different amounts of information? Unless you can answer that, I wouldn't pay any attention to these criteria.
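For reference, the usual definitions are sketched below, where $\hat{L}$ is the maximized likelihood and $k$ is the number of estimated parameters (both symbols are introduced here for illustration, not taken from the question):

$$\mathrm{AIC} = -2\ln\hat{L} + 2k, \qquad \mathrm{BIC} = -2\ln\hat{L} + k\ln n$$

With roughly 50,000 observations, taking $n$ to be the number of observations gives a per-parameter penalty of $\ln 50000 \approx 10.8$, while taking $n$ to be the number of groups gives a smaller one. The BIC comparison therefore depends on a choice that the formula itself does not settle.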

Second, I am surprised your model with random effects for the group-level variables was identified at all. Let us think about an extreme case: a binary group-level variable G in the model lmer(Y ~ X + G + (1 + X + G | group)). What does the random effect on G describe? That a group gets an additional random shift when G == 1, i.e., group-level heteroskedasticity. That appears to be something rather odd to estimate.
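To make the heteroskedasticity point concrete, that extreme case can be written out as a sketch. The notation ($\beta$ for fixed effects, $u$ for group-level random effects, $\tau$ for their variances and covariances) is an assumption introduced here, with the random effects taken to be jointly normal with mean zero:

$$\operatorname{logit}\Pr(Y_{ij}=1) = \beta_0 + \beta_1 X_{ij} + \beta_2 G_j + u_{0j} + u_{1j} X_{ij} + u_{2j} G_j$$

Because $G_j$ is constant within group $j$, the term $u_{2j} G_j$ is just a second group-level shift on top of $u_{0j}$. For binary $G_j$:

$$\operatorname{Var}(u_{0j} + u_{2j} G_j) = \begin{cases} \tau_0^2 & \text{if } G_j = 0 \\ \tau_0^2 + 2\tau_{02} + \tau_2^2 & \text{if } G_j = 1 \end{cases}$$

So the intercept variance simply differs between the two kinds of groups, and the data only inform the sum $2\tau_{02} + \tau_2^2$, not $\tau_{02}$ and $\tau_2^2$ separately.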

So all in all, I would run this as

M2 <- lmer(Y ~ X1 + X2 + X3 + X4 + G1 + G2 + G3 + G4 + (1 + X1 + X2 + X3 + X4 | group), family=binomial(link="logit"))

i.e., with only the individual-level covariates having random effects assigned to them.