Linear Regression vs Fixed Effects Model – Calculating Slope of Clinical Parameters Over Time

mixed model, regression

I am studying how a clinical parameter (GFR, a measure of kidney function) varies over time across subjects. For each patient, we have multiple GFR readings and the times at which they were measured. We summarized the change in GFR over time by fitting a separate linear regression for each subject to obtain a slope, which we then try to associate with other clinical parameters.

My supervisor was reading a paper where a mixed effects model was used to calculate slope instead, and suggested I do the same. However, conceptually, I'm not sure if this makes sense since slope is being calculated separately for each patient, so there is no other "effect" to include in the model. Here is how the paper phrased it:

"To calculate GFR slope, a linear mixed effects model including random slope and intercept was performed"

Can anyone explain to me how this is different from simple linear regression?

Thanks in advance!

Best Answer

"A linear mixed effects model including random slope and intercept" doesn't only calculate random slopes and intercepts. It also estimates an overall slope and intercept, and the random slopes and intercepts are each participant's differences from those overall estimates. What's different from a set of standard linear regressions is that those differences are modeled as coming from a best-fitting Gaussian distribution around the overall estimates, rather than being estimated independently in individual regressions as you have done so far.
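Written out, the model quoted from the paper is usually understood along the following lines. This is a sketch in my own notation, not taken from the paper: $i$ indexes participants, $j$ indexes readings, $t_{ij}$ is the measurement time, and all symbol names are assumptions made for illustration.

```latex
\begin{aligned}
\mathrm{GFR}_{ij} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij}, \\
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix}
  &\sim \mathcal{N}\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{pmatrix}
  \right), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Here $\beta_0$ and $\beta_1$ are the overall intercept and slope, and $b_{0i}$ and $b_{1i}$ are participant $i$'s random intercept and slope, so that participant's own slope is $\beta_1 + b_{1i}$. Fitting a separate regression per participant would instead treat each intercept and slope as a free parameter with no shared distribution.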

This makes more efficient use of your data. Instead of estimating a separate slope and intercept for each of your study participants, you pool information among all of them in a single model. In the simplest case, you only model an overall slope and intercept, the variances and covariance of the random slopes and intercepts, and the residual variance. That's far fewer parameter values to estimate from your data than two per participant.
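To make the pooling concrete, here is a minimal sketch in Python using `statsmodels`, which is my choice of tool and not something the question or the paper specifies. The data are simulated, and the column names (`subject`, `time`, `gfr`) and all numeric settings are hypothetical. It fits one regression per subject, then a single mixed model with random intercept and slope, and compares the spread of the resulting per-subject slopes.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated data (all values hypothetical): 40 subjects, 5 visits each.
n_subjects, n_visits = 40, 5
subject = np.repeat(np.arange(n_subjects), n_visits)
time = np.tile(np.arange(n_visits, dtype=float), n_subjects)
true_intercept = 60 + rng.normal(0, 10, n_subjects)
true_slope = -2 + rng.normal(0, 1, n_subjects)
noise = rng.normal(0, 3, subject.size)
gfr = true_intercept[subject] + true_slope[subject] * time + noise
data = pd.DataFrame({"subject": subject, "time": time, "gfr": gfr})

# Approach 1: a separate least-squares line for every subject.
ols_slopes = pd.Series(
    {s: np.polyfit(d["time"], d["gfr"], 1)[0] for s, d in data.groupby("subject")}
)

# Approach 2: one mixed model with a random intercept and random slope.
model = smf.mixedlm("gfr ~ time", data, groups=data["subject"], re_formula="~time")
fit = model.fit()

# Overall (fixed-effect) slope shared by all subjects.
fe_slope = fit.params["time"]

# Each subject's predicted slope = overall slope + that subject's random slope.
blup_slopes = pd.Series(
    {s: fe_slope + re["time"] for s, re in fit.random_effects.items()}
)

print("overall slope:", fe_slope)
print("variance of per-subject OLS slopes:", ols_slopes.var())
print("variance of mixed-model slopes:", blup_slopes.var())
```

In a run like this, the mixed-model slopes should be less spread out than the separate-regression slopes, because each subject's estimate is pulled toward the overall slope in proportion to how noisy that subject's own data are.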

You can model associations between clinical variables and the GFR slope by including interactions between time and the clinical variables in the model. The random slopes and intercepts are then estimates of what those would have been for each participant at the baseline values of those clinical variables, generally allowing for more precise estimates of the interaction terms that represent associations of the clinical variables with that slope. The tag info on mixed models is a helpful place to start, with links to references for further reading.
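As a rough illustration of the interaction approach, the sketch below again uses simulated data and `statsmodels` (my assumption, not the paper's software). The variable `covariate` is a hypothetical standardized clinical measurement, constant within each subject, and the simulated slope is made to depend on it. The coefficient on `time:covariate` is the term that estimates how the slope changes with the clinical variable.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Simulated data (all values hypothetical): 60 subjects, 5 visits each.
n_subjects, n_visits = 60, 5
subject = np.repeat(np.arange(n_subjects), n_visits)
time = np.tile(np.arange(n_visits, dtype=float), n_subjects)

# One standardized clinical covariate per subject.
covariate = rng.normal(0, 1, n_subjects)

# True slope declines by 0.5 units per unit of the covariate.
true_intercept = 60 + rng.normal(0, 8, n_subjects)
true_slope = -2 - 0.5 * covariate + rng.normal(0, 0.5, n_subjects)
noise = rng.normal(0, 2, subject.size)
gfr = true_intercept[subject] + true_slope[subject] * time + noise

data = pd.DataFrame(
    {"subject": subject, "time": time, "covariate": covariate[subject], "gfr": gfr}
)

# "time * covariate" expands to time + covariate + time:covariate.
model = smf.mixedlm(
    "gfr ~ time * covariate", data, groups=data["subject"], re_formula="~time"
)
fit = model.fit()

# Association of the covariate with the GFR slope.
interaction = fit.params["time:covariate"]
print("slope at covariate = 0:", fit.params["time"])
print("change in slope per unit covariate:", interaction)
```

This replaces the two-step procedure (estimate slopes, then regress them on clinical variables) with a single model, so the uncertainty in each participant's slope is carried through to the interaction estimate.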

That isn't the only way to work efficiently with such longitudinal data. For example, generalized least squares is another way to account for within-participant correlations among observations, and it can be useful in a situation like yours, with a continuous outcome and only within-participant correlations to deal with. Chapter 7 of Frank Harrell's course notes and book discusses the relative advantages of several ways to handle such data.
