GEE – Comparing GEE vs Mixed Model for Time-Varying Covariate Analysis

generalized-estimating-equations, mixed model, repeated measures, time-varying-covariate

Assuming the attached dataset, I am looking to examine whether participants who are treated are more/less likely to have high addiction severity compared to non-treated.

Both treatment and high addiction severity are time-varying (i.e., re-assessed at each visit).

Given the correlated nature of the data, I think GEE would be a good fit. If my understanding is right, this approach would tell me whether, on average, treated people are more or less likely to have high addiction severity than non-treated people.

However, someone has suggested that I use a mixed model to explore differences among treated and non-treated individuals with respect to this covariate (and other ones).

I have read a bit about the distinction between the two methods and understand that GEE gives population-averaged effects, whereas a mixed model gives subject-specific effects (between-subject differences are treated as random). This distinction makes sense to me for a time-fixed covariate, but I cannot really understand how it would work for a time-varying covariate. Would this analysis make sense and, if so, how would the results differ from those of a GEE?

There is an issue of temporality/reverse causality here but please ignore for now as I am trying to understand the distinction between the 2 methods.

Thanks so much in advance.

Best Answer

Indeed, for dichotomous outcomes, as you seem to have here, the corresponding mixed model, namely a mixed-effects logistic regression, gives you fixed-effects coefficients whose interpretation is conditional on the random effects. A detailed explanation can be found here. Most often, this is not the interpretation you want. The GEE approach, by contrast, gives you coefficients with a marginal (population-averaged) interpretation.
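To make the conditional versus marginal distinction concrete, here is a minimal numerical sketch. It assumes a random-intercept logistic model with a single binary treatment indicator. The values of `beta0`, `beta_treat`, and `sigma` are made up for illustration and are not estimated from the dataset in the question. The code averages the subject-specific probabilities over the random-intercept distribution and compares the resulting population-averaged log odds ratio with the conditional one.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds."""
    return math.log(p / (1.0 - p))


def marginal_prob(eta, sigma, n_grid=2001, width=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma^2) with the trapezoidal rule."""
    if sigma == 0:
        return expit(eta)
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        b = lo + i * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end points
        total += weight * expit(eta + b) * density * step
    return total


# Hypothetical conditional (subject-specific) parameters, for illustration only.
beta0 = -1.0       # log odds of high severity for an untreated subject with b = 0
beta_treat = 1.0   # conditional log odds ratio for treatment
sigma = 2.0        # standard deviation of the random intercepts

p_untreated = marginal_prob(beta0, sigma)
p_treated = marginal_prob(beta0 + beta_treat, sigma)
marginal_log_or = logit(p_treated) - logit(p_untreated)

print(f"conditional log odds ratio: {beta_treat:.3f}")
print(f"marginal log odds ratio:    {marginal_log_or:.3f}")
```

Under these assumptions the marginal log odds ratio is closer to zero than the conditional one, and the gap grows with `sigma`. With `sigma = 0` the two coincide, which is why the distinction only matters when there is between-subject heterogeneity.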

However, an additional practical point that you also need to consider is missing data. You have not given us enough details on this for your application, but we almost always have to deal with incomplete data. In this respect, mixed models give valid results under the less stringent missing at random (MAR) assumption, whereas the standard (unweighted) GEE gives valid results only under the more restrictive, and less realistic, missing completely at random (MCAR) assumption.

Taking both points (i.e., interpretation and missing data) into account, you would most often like to fit a mixed model, to be better protected against missing data, but still obtain parameters that have a population-averaged interpretation. An early solution in this direction was the marginalized mixed models proposed by Heagerty but, in general, these are computationally intensive to fit. A more recent approach that seems to solve the problem has been proposed by Hedeker et al. This is implemented in the function marginal_coefs() of the R package GLMMadaptive. You can find an example of how to use this function here.
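As a rough illustration of the general idea behind obtaining marginal coefficients from a fitted mixed model, the sketch below does the following: it averages the subject-specific probabilities over simulated random intercepts at each covariate value, then regresses the logit of those marginal probabilities on the covariate by ordinary least squares. This is my own simplified reading of that idea, not the GLMMadaptive implementation. The function names, the covariate values, and the "fitted" parameters are all hypothetical.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds."""
    return math.log(p / (1.0 - p))


def marginal_prob_mc(eta, sigma, std_normal_draws):
    """Monte Carlo average of expit(eta + b), with b = sigma * z and z ~ N(0, 1)."""
    total = sum(expit(eta + sigma * z) for z in std_normal_draws)
    return total / len(std_normal_draws)


def marginalized_coefs(beta0, beta1, sigma, x_values, n_draws=5000, seed=1):
    """Least-squares fit of logit(marginal probability) on x.

    Returns (intercept, slope) on the population-averaged scale.
    """
    rng = random.Random(seed)
    # The same draws are reused for every x so the curve is smooth in x.
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    y = [logit(marginal_prob_mc(beta0 + beta1 * x, sigma, draws)) for x in x_values]

    n = len(x_values)
    x_bar = sum(x_values) / n
    y_bar = sum(y) / n
    sxx = sum((x - x_bar) ** 2 for x in x_values)
    sxy = sum((x - x_bar) * (yi - y_bar) for x, yi in zip(x_values, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical conditional estimates from a random-intercept logistic model.
beta0, beta1, sigma = -1.0, 1.0, 2.0
x_values = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]  # made-up covariate values

intercept_m, slope_m = marginalized_coefs(beta0, beta1, sigma, x_values)
print(f"conditional slope: {beta1:.3f}")
print(f"marginal slope:    {slope_m:.3f}")
```

In practice you would let marginal_coefs() do this on the fitted model object, using the actual design matrix and the estimated random-effects distribution. The sketch is only meant to show why the marginal coefficients can be recovered after fitting the mixed model, rather than requiring a separate GEE fit.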
