Solved – Model specification for glmer (lme4) with varying slope

lme4-nlme, mixed model

I am estimating a mixed model using lme4. I need to have varying intercept terms (because I am post-stratifying my results to census categories).

However, I also want to add a random slope term for the effect of income (allowing it to vary by state). This is the model that I estimate:

dependent.var ~ (1 | state) + (1 | race) + (1 | female) + (1 | age) + (1 | edu) + (1 | income) + (1 | region) + (1 + income | state) + percent.dem.vote + state.avg.income

Question: Is it appropriate to include both (1|income) and (1+income|state)?

I understand that the model is calculating a random intercept for income, and that it is also calculating random intercepts and random income slopes for each state. I am not interested in evaluating the coefficients on the variables, but rather in whether the model can produce a valid prediction of the dependent variable for different combinations of the demographics in the model.

Is there bias in estimation for a multilevel model if both a random intercept and a random slope/intercept term are included in the model?

Best Answer

To answer this question, I think it's important to note why the random effects are needed in the first place. If you have repeated measurements on the same individuals, or if the observations are clustered in some other sense (students sampled from a few different schools, trees from the same areas, etc.), your observations are not independent of each other. Since OLS regression assumes that your observations are independent, it ignores this dependency, which typically leads to incorrect standard errors and can lead to biased estimates.

So in order to handle this, we either need to specify a correct correlation matrix for the error term (as opposed to assuming that it is iid normal), or we need to model the correlation through use of random intercepts and slopes.

Typically, a mixed effects model would be specified in the following way:

$Y_{it} = \beta_0 + b_{0i} + \beta_1 x_{1it} + b_{1i} x_{1it} + \beta_2 x_{2it} + b_{2i} x_{2it} + \varepsilon_{it}$

where $Y_{it}$ is the outcome for individual $i$ at measurement occasion $t$, and $x_{1it}$ and $x_{2it}$ are individual $i$'s covariates at occasion $t$. $\beta_0, \beta_1$ and $\beta_2$ are the (fixed) intercept and slopes, and the random effects are $b_{0i}, b_{1i}$ and $b_{2i}$, which are assumed to be multivariate normal and independent of the error term $\varepsilon_{it}$ as well as of the covariates.

Now, if the only thing that makes the observations correlated over time is that some individuals have a higher starting point than others (a higher intercept), then a random intercept model is sufficient. However, if some individuals respond to the covariates differently, we also need the random slopes. So the choice of which random effects to include in your model depends on your assumptions about the data. That said, there is an argument for specifying a maximal random effects structure either way (Barr et al., 2013), since the cost of having too many random effects is often lower than the cost of missing one.
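To make the difference concrete, here is a short worked sketch of the dependence each structure implies. The symbols $\sigma_{b_0}^2$, $\sigma_{b_1}^2$, $\sigma_{b_0 b_1}$ and $\sigma^2$ are notation introduced here (they are not in the model above): they stand for the variance of $b_{0i}$, the variance of $b_{1i}$, their covariance, and the variance of $\varepsilon_{it}$, with the errors assumed independent across occasions. With a random intercept only, two observations on the same individual at occasions $t \neq s$ have, conditional on the covariates,

$$ \mathrm{Cov}(Y_{it}, Y_{is}) = \sigma_{b_0}^2, \qquad \mathrm{Corr}(Y_{it}, Y_{is}) = \frac{\sigma_{b_0}^2}{\sigma_{b_0}^2 + \sigma^2} $$

so the within-individual correlation is the same for every pair of occasions. Adding a random slope $b_{1i}$ for $x_{1it}$ (and, for simplicity, none for $x_{2it}$) makes the variance depend on the covariate:

$$ \mathrm{Var}(Y_{it}) = \sigma_{b_0}^2 + 2 \sigma_{b_0 b_1} x_{1it} + \sigma_{b_1}^2 x_{1it}^2 + \sigma^2 $$

In other words, the random slope is what lets the spread between individuals grow or shrink with the covariate.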

As for your specific example, I think you are modelling the random effects in a strange way. For instance, assuming that income is a continuous variable in your data, having a random intercept for income doesn't make much sense, just as having a dummy variable for every level of income wouldn't make sense either. If what causes the dependency in your data is that the observations come from the same state and region, then a better way to specify the random effects in lmer (from the lme4 package) in R is something like this for the random intercept model:

lmer(dependent.var ~ female + race + age + edu + income + percent.dem.vote + state.avg.income + (1 | region) + (1 | state))

If you want random slopes too, just add them in a similar fashion. Here's an example with random slopes for edu and income, at the region level, added to the model:

lmer(dependent.var ~ female + race + age + edu + income + percent.dem.vote + state.avg.income + (1 + edu + income| region) + (1 | state))
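As a rough, self-contained illustration of the two specifications, the sketch below fits a random-intercept model and a random-intercept-and-slope model to simulated data. Everything in it is an assumption made for the example: the variable names `y`, `x` and `g`, the sample sizes, and the variances are hypothetical and have nothing to do with the survey data in the question.

```r
library(lme4)

set.seed(1)

# Hypothetical simulated data: 30 groups with 20 observations each
n_groups <- 30
n_per    <- 20
gid <- rep(seq_len(n_groups), each = n_per)   # integer group index
x   <- rnorm(n_groups * n_per)                # one continuous covariate

b0 <- rnorm(n_groups, sd = 1.0)   # group-specific intercept deviations
b1 <- rnorm(n_groups, sd = 0.5)   # group-specific slope deviations
y  <- 2 + b0[gid] + (1 + b1[gid]) * x + rnorm(n_groups * n_per, sd = 1)

d <- data.frame(y = y, x = x, g = factor(gid))

# Random intercept only
fit_int <- lmer(y ~ x + (1 | g), data = d)

# Random intercept and random slope for x, allowed to correlate
fit_slope <- lmer(y ~ x + (1 + x | g), data = d)

# Estimated variance components and the group-specific coefficients
print(VarCorr(fit_slope))
print(head(coef(fit_slope)$g))

# Likelihood-ratio comparison of the two structures (models are refit with ML)
print(anova(fit_int, fit_slope))
```

Note that `(1 + x | g)` already contains a group-level intercept, so adding a separate `(1 | g)` term for the same grouping factor would be redundant.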

References

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.

Related Solutions

Solved – Varying group coefficients in lme4

First of all, AIC/BIC do not make sense in mixed models. I mean, can you explain what the $n$ that goes into your BIC is (number of groups? number of observations? something in between? how about level 1 and level 2 variables that obviously carry different amounts of information)? So I wouldn't pay any attention to these.

Second, I am surprised your model with the random effects for group level variables was identified at all. Let us think about an extreme case: a binary group level variable in a model lmer(Y ~ X + G + (1 + X + G | group)). What is it that it describes? That a group has an additional random shift when G==1, i.e., group-level heteroskedasticity. So that appears to be something rather odd to estimate.

So all in all, I would run this as

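M2 <- glmer(Y ~ X1 + X2 + X3 + X4 + G1 + G2 + G3 + G4 + (1 + X1 + X2 + X3 + X4 | group), family = binomial(link = "logit"))

i.e., with only the individual level covariates having random effects assigned to them. (In current versions of lme4, models with a family argument are fitted with glmer rather than lmer.)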

Multilevel Analysis – Fitting Slopes-as-Outcomes Linear Mixed Model Using lme4

This may be a late answer, but I will try to answer it anyway. On pages 117-130 of Raudenbush and Bryk (2002), as you rightly mentioned, the slopes-as-outcomes model is represented as

$$ MATHACH_{ij} = β_{0j} + β_{1j} SES + r_{ij} $$

$$ β_{0j} = γ_{00} + γ_{01}(SCHTYPE) + γ_{02}(MEANSES) + u_{0j} $$

$$ β_{1j} = γ_{10} + γ_{11}(SCHTYPE) + γ_{12}(MEANSES) + u_{1j} $$
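As a worked step, substituting the two level-2 equations into the level-1 equation gives a single combined equation (same symbols as above, nothing new assumed):

$$ MATHACH_{ij} = γ_{00} + γ_{01}(SCHTYPE) + γ_{02}(MEANSES) + γ_{10} SES + γ_{11}(SCHTYPE) SES + γ_{12}(MEANSES) SES + u_{0j} + u_{1j} SES + r_{ij} $$

The six $γ$ terms are the fixed effects, with $γ_{11}$ and $γ_{12}$ appearing as cross-level interactions with SES, while $u_{0j} + u_{1j} SES$ is a random intercept plus a random SES slope for each school $j$.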

This translates to a two-level HLM that can be written in lmer as

lmer(mathach ~ sector + meanses + ses + sector:ses + meanses:ses + ...

The authors explain the significant coefficients on page 128; these are nothing but the coefficients of the interaction effects.

I came across a webpage that describes Bayesian meta-regression where they model the slopes of the hierarchical model, which is what you had in mind perhaps.
