Solved – linear regression vs. linear mixed-effects model coefficients

Tags: linear, mixed-model, regression

It is my understanding that linear regression models and linear mixed-effects models will produce the same regression coefficients (i.e., fixed effects); however, with clustered data, linear regression produces downwardly biased standard errors, leading to inflated Type I error (Cohen, Cohen, West, & Aiken, 2003). Yet I have a dataset where the linear regression and mixed-model coefficients differ substantially, and I do not understand why. Each regression has only one predictor, and in the mixed model I estimate a random effect for the intercept only. Does anyone know the conditions under which the model coefficients will be discrepant?

As requested in a comment, here are my R code and output, along with the dataset. Notice that the linear regression slope is roughly twice the mixed model's fixed slope, and the intercepts even have different signs!

lm1 <- lm(Y ~ X, data = d); lm1$coefficients
(Intercept)    X 
  -1.132507    1.184904 
lmer1 <- lmer(Y ~ X + (1 | ID), data = d); lmer1@beta
[1] 1.6767616 0.6376439
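For contrast, here is a quick hypothetical simulation (not my real data; all names and settings are made up for illustration) showing the case where the two models do agree: when the predictor is unrelated to the grouping, lm() and lmer() recover essentially the same slope.

```r
## Hypothetical simulation: X independent of the grouping factor,
## so the random intercepts carry no information about the slope
library(lme4)
set.seed(101)
n_id <- 50; n_per <- 5
ID <- factor(rep(seq_len(n_id), each = n_per))
X  <- rnorm(n_id * n_per)                      # predictor unrelated to group
u  <- rep(rnorm(n_id, sd = 1), each = n_per)   # random intercepts
Y  <- 2 + 1.5 * X + u + rnorm(n_id * n_per, sd = 0.5)
sim <- data.frame(ID, X, Y)

b_lm   <- unname(coef(lm(Y ~ X, data = sim))["X"])
b_lmer <- unname(fixef(lmer(Y ~ X + (1 | ID), data = sim))["X"])
c(lm = b_lm, lmer = b_lmer)   # both close to the true slope of 1.5
```

In my data, by contrast, the group means of X and Y are correlated, which (as the answer below illustrates) is exactly when the two slopes diverge.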

ID   Y   X
 1   1   3.0
 1   2   4.0
 1   3   3.0
 2   5   6.0
 2   4   4.0
 2   6   6.0
 3   7   6.0
 3   8   8.0
 3   9   5.5
 4   2   4.0
 4   3   3.0
 4   4   5.5
 5   5   5.0
 5   5   7.0
 5   6   5.5
 6   7   7.0
 6   6   4.5
 6   8   6.0
 7   3   4.0
 7   4   3.0
 7   2   4.0
 8   1   2.5
 8   2   4.0
 8   1   3.0
 9   5   6.0
 9   6   6.0
 9   4   6.5
10   7   7.0
10   8   8.0
10   9   7.0
11   8   7.0
11   8   5.5
11   7   6.0
12   6   6.5
12   4   4.0
12   2   4.0
13   4   3.5
13   5   5.0
13   6   4.0
14   6   5.5
14   7   7.0
14   5   4.5
15   3   4.5
15   4   6.0
15   2   5.5
16   1   2.0
16   2   3.0
16   3   6.0
17   4   3.0
17   2   4.5
17   3   3.0
18   5   5.0
18   6   6.0
18   4   3.0
19   7   7.5
19   8   7.5
19   6   5.5
20   9   6.5
20   8   7.0
20   9   6.0

Best Answer

I don't know that I can give a rigorous theoretical explanation, but a picture may make things clearer:

[Figure: scatterplot of the data with the OLS fit (blue), the mixed-model population-level line (gray), and per-ID predicted lines]

  • The blue line is the OLS fit, the gray line is the population-level prediction for the mixed model. The individual lines are predicted lines (all equal slopes, randomly varying intercepts) for each ID.
  • Since there is some correlation between the mean values of X and Y for each group, some of the variability that would go into the slope is instead taken out by the random intercept term.
  • The apparently large difference in the intercepts is partly an artifact of extrapolation: the data start at X=2, while the intercept is the expected value of Y at X=0, so even a modest difference in slopes is amplified there.

d <- data.frame(ID=factor(rep(1:20,each=3)),
                Y=c(1,2,3,5,4,6,7,8,9,2,3,4,5,5,6,7,6,
                    8,3,4,2,1,2,
                    1,5,6,4,7,8,9,8,8,7,6,4,
                    2,4,5,6,6,7,5,3,4,2,1,2,
                    3,4,2,3,5,6,4,7,8,6,9,8,9),
                X=c(3,4,3,6,4,6,6,8,5.5,4,3,5.5,5,7,5.5,7,4.5,6,4,
                    3,4,2.5,4,3,6,6,6.5,7,8,7,7,5.5,6,6.5,4,4,3.5,
                    5,4,5.5,7,4.5,4.5,6,5.5,2,3,6,3,4.5,3,5,6,3,
                    7.5,7.5,5.5,6.5,7,6))

lm1 <- lm(Y ~ X, data = d)
library(lme4)
lmer1 <- lmer(Y ~ X + (1 | ID), data = d)
ff <- fixef(lmer1)
## get predictions
pp <- d
pp$Y <- predict(lmer1)
library(dplyr)
pp <- pp %>%
    group_by(ID) %>%
    filter(Y %in% range(Y))

library(ggplot2); theme_set(theme_bw())
ggplot(d,aes(X,Y,colour=ID))+
    geom_point()+
    scale_colour_discrete(guide="none")+
    geom_line(data=pp)+
    scale_x_continuous(limits=c(0,8))+
    geom_smooth(method="lm",aes(group=1),fullrange=TRUE)+
    geom_abline(slope=ff["X"],intercept=ff["(Intercept)"],
                colour="darkgray",lwd=1.5)
ggsave("CV161703.png")
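The second bullet point can be quantified directly: the total OLS slope mixes a within-group slope (after centering X and Y within each ID) with a between-group slope (the slope of the 20 group means), while the mixed model's fixed slope is pulled toward the within-group slope because the random intercepts absorb most of the between-group association. A base-R sketch, re-creating `d` so the snippet runs on its own:

```r
## Same data as above, redefined so this snippet is self-contained
d <- data.frame(ID = factor(rep(1:20, each = 3)),
                Y = c(1,2,3,5,4,6,7,8,9,2,3,4,5,5,6,7,6,8,3,4,2,1,2,
                      1,5,6,4,7,8,9,8,8,7,6,4,2,4,5,6,6,7,5,3,4,2,1,
                      2,3,4,2,3,5,6,4,7,8,6,9,8,9),
                X = c(3,4,3,6,4,6,6,8,5.5,4,3,5.5,5,7,5.5,7,4.5,6,4,
                      3,4,2.5,4,3,6,6,6.5,7,8,7,7,5.5,6,6.5,4,4,3.5,
                      5,4,5.5,7,4.5,4.5,6,5.5,2,3,6,3,4.5,3,5,6,3,
                      7.5,7.5,5.5,6.5,7,6))

xbar <- ave(d$X, d$ID)   # per-ID means, repeated to full length
ybar <- ave(d$Y, d$ID)

## within-group slope: pooled slope after within-ID centering
b_within  <- sum((d$X - xbar) * (d$Y - ybar)) / sum((d$X - xbar)^2)
## between-group slope: slope of the 20 (xbar, ybar) group means
gm <- unique(data.frame(xbar, ybar))
b_between <- unname(coef(lm(ybar ~ xbar, data = gm))["xbar"])
## total OLS slope, ignoring grouping
b_pooled  <- unname(coef(lm(Y ~ X, data = d))["X"])

## here b_within < lmer fixed slope (0.64) < b_pooled < b_between
round(c(within = b_within, pooled = b_pooled, between = b_between), 3)
```

By my hand calculation the three slopes come out to roughly 0.41, 1.18, and 1.57, so the lmer fixed slope (0.64) sits between the within-group and pooled slopes, consistent with the picture above.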