Solved – What are the differences between the linear regression and mixed models

fixed-effects-model, mixed-model, r, random-effects-model, regression

What are the key differences between the following two models?

lmefit = lmer(MathAch ~ SES + (1 | School), data = MathScores)

lmfit = lm(MathAch ~ SES + factor(School) - 1, data = MathScores)

To me, they seem to be the same, except that lmefit has fewer parameters (because it uses a normal distribution to model the school-level effects…)

Am I right?


And what's the difference between these two models?

M0 <- lmer (y ~ 1 + (1 | county))
M1 <- lmer (y ~ -1 + (1 | county))

Best Answer

The main difference lies in what kinds of questions you are trying to answer with your analysis and in how you regard the factor School.

In lmfit you are considering school to be a fixed effect, which means that you are only interested in the schools that are in your data set, but you are (possibly) interested in specific differences between the schools. With this model you cannot say anything about students at schools that are not in your sample (because you have no information on their fixed effect).

In lmefit you are considering school to be a random effect: essentially, the schools in your data set are treated as a random sample from a larger population of schools. Here you are generally not interested in specific comparisons between schools, but you could be interested in predictions for schools in the original sample and for schools that were not in the original sample.
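The two treatments of school can be compared side by side. The following is only a minimal sketch: it assumes the lme4 package is installed, and it uses simulated data as a stand-in for MathScores (the number of schools, the effect sizes, and the noise levels are all made up for illustration).

```r
library(lme4)

set.seed(1)
# Simulated stand-in for MathScores (hypothetical: 20 schools, 30 students each)
n_schools   <- 20
n_per       <- 30
School      <- factor(rep(seq_len(n_schools), each = n_per))
school_mean <- rnorm(n_schools, mean = 12, sd = 2)   # true school intercepts
SES         <- rnorm(n_schools * n_per)
MathAch     <- rep(school_mean, each = n_per) + 2.5 * SES +
  rnorm(n_schools * n_per, sd = 6)
MathScores  <- data.frame(MathAch, SES, School)

lmfit  <- lm(MathAch ~ SES + factor(School) - 1, data = MathScores)
lmefit <- lmer(MathAch ~ SES + (1 | School), data = MathScores)

# Fixed-effects fit: one SES slope plus a separate intercept for every school
n_lm_coef <- length(coef(lmfit))

# Mixed fit: two fixed effects (overall intercept, SES slope) ...
n_lmer_fixed <- length(fixef(lmefit))
# ... plus two variance components (between-school and residual)
print(VarCorr(lmefit))

c(lm = n_lm_coef, lmer_fixed = n_lmer_fixed)
```

With 20 simulated schools, lmfit carries 21 coefficients, while lmefit has 2 fixed effects and summarises the schools through a single between-school variance. The per-school deviations are still available from ranef(lmefit), but they are predictions derived from that variance rather than freely estimated parameters.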

If I have data from all the schools in my area and am interested in seeing if there is a difference between 2 schools that I am considering sending my children to (and if so which is better) then I would use the fixed effects model.

If I am interested in making predictions and may make a prediction for schools not in my data set (since I only have a subset of the schools) then I would use the mixed effects model.
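The prediction case can be sketched as follows, again assuming lme4 is installed and using simulated data in place of MathScores; the school labels and the SES value of 0.5 are hypothetical. The idea is to ask both models for a prediction at a school that did not appear in the sample.

```r
library(lme4)

set.seed(2)
# Simulated stand-in for MathScores (hypothetical: 15 schools, 25 students each)
n_schools   <- 15
n_per       <- 25
School      <- factor(rep(paste0("s", seq_len(n_schools)), each = n_per))
school_mean <- rnorm(n_schools, mean = 12, sd = 2)
SES         <- rnorm(n_schools * n_per)
MathAch     <- rep(school_mean, each = n_per) + 2.5 * SES +
  rnorm(n_schools * n_per, sd = 6)
MathScores  <- data.frame(MathAch, SES, School)

lmfit  <- lm(MathAch ~ SES + factor(School) - 1, data = MathScores)
lmefit <- lmer(MathAch ~ SES + (1 | School), data = MathScores)

# A student with SES = 0.5 at a school that was not in the sample
new_student <- data.frame(SES = 0.5, School = factor("unseen_school"))

# lm has no coefficient for the unseen school, so predict() raises an error
lm_attempt <- try(predict(lmfit, newdata = new_student), silent = TRUE)
inherits(lm_attempt, "try-error")

# lmer can fall back on the population-level prediction, treating the
# unseen school's random intercept as zero
pred_new <- predict(lmefit, newdata = new_student, allow.new.levels = TRUE)

# For a school that was in the sample, its estimated intercept is used
pred_old <- predict(lmefit, newdata = data.frame(SES = 0.5, School = "s1"))

c(unseen = unname(pred_new), seen_s1 = unname(pred_old))
```

For the unseen school the mixed model returns the overall intercept plus the SES slope times 0.5, which is the best guess for a school drawn from the same population; the uncertainty of that guess also involves the between-school variance, which this sketch does not compute.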

If I believe that there could be an effect due to schools but don't care about specific comparisons between schools, and just want the model to adjust for the clustering while doing all inference on the SES variable, then I would use the mixed-effects model. (The fixed-effects model would also work in this case, but treating school as a random effect is a more natural way to make that adjustment.)
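When SES is the only quantity of interest, the two fits can be compared directly on that coefficient. This is a rough sketch under the same assumptions as before (lme4 installed, simulated data standing in for MathScores, made-up effect sizes); the Wald interval is used for lmefit only because it is quick to compute.

```r
library(lme4)

set.seed(3)
# Simulated stand-in for MathScores (hypothetical: 20 schools, 30 students each)
n_schools   <- 20
n_per       <- 30
School      <- factor(rep(seq_len(n_schools), each = n_per))
school_mean <- rnorm(n_schools, mean = 12, sd = 2)
SES         <- rnorm(n_schools * n_per)
MathAch     <- rep(school_mean, each = n_per) + 2.5 * SES +
  rnorm(n_schools * n_per, sd = 6)
MathScores  <- data.frame(MathAch, SES, School)

lmfit  <- lm(MathAch ~ SES + factor(School) - 1, data = MathScores)
lmefit <- lmer(MathAch ~ SES + (1 | School), data = MathScores)

# SES estimate and standard error under each model
ses_lm   <- summary(lmfit)$coefficients["SES", c("Estimate", "Std. Error")]
ses_lmer <- summary(lmefit)$coefficients["SES", c("Estimate", "Std. Error")]
print(rbind(lm = ses_lm, lmer = ses_lmer))

# Confidence intervals for the SES slope
ci_lm   <- confint(lmfit, "SES")
ci_lmer <- confint(lmefit, parm = "beta_", method = "Wald")["SES", ]
print(ci_lm)
print(ci_lmer)
```

In this simulation SES is generated independently of the school intercepts, so the two SES estimates come out very close. With real data in which SES differs systematically between schools, the two models can give noticeably different slopes, which is worth checking before relying on either one.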