Generalized Linear Model – Difference Between Generalized Estimating Equations and GLMM

generalized linear model, generalized-estimating-equations, interpretation, logistic, mixed model

I'm running a GEE on 3-level unbalanced data, using a logit link. How does this differ (in terms of the conclusions I can draw and the meaning of the coefficients) from a GLM with mixed effects (GLMM) and logit link?

More detail: The observations are single Bernoulli trials, clustered into classrooms and schools. I'm using R, with casewise omission of NAs. There are 6 predictors, plus interaction terms.

(I am not flipping children to see if they land heads-up.)

I'm inclined to exponentiate the coefficients to odds-ratios. Does this have the same meaning in both?

There is something lurking in the back of my mind about "marginal means" in GEE models. I need that bit explained to me.

Thanks.

Best Answer

In terms of the interpretation of the coefficients, there is a difference in the binary case (among others). What differs between GEE and GLMM is the target of inference: population-average or subject-specific.

Let's consider a simple made-up example related to yours. You want to compare the failure rate of boys and girls in a school. As in most (elementary) schools, the students are divided into classrooms. You observe a binary response $Y_{ij}$ from each of the $n_i$ children in classroom $i$, for $i=1,\dots,N$ (i.e. $\sum_{i=1}^{N}n_{i}$ binary responses clustered by classroom), where $Y_{ij}=1$ if student $j$ from classroom $i$ failed and $Y_{ij}=0$ if he/she passed. The covariate is $x_{ij}=1$ if student $j$ from classroom $i$ is male and $0$ otherwise.

To bring in the terminology I used in the first paragraph, you can think of the school as being the population and the classrooms being the subjects.

First consider GLMM, which fits a mixed-effects model. The model conditions on the fixed design matrix (which in this case consists of the intercept and an indicator for gender) and on any random effects among classrooms that we include in the model. In our example, let's include a random intercept, $b_i$, which takes the baseline differences in failure rate among classrooms into account. So we are modelling

$\log \left(\frac{P(Y_{ij}=1\mid x_{ij}, b_i)}{P(Y_{ij}=0\mid x_{ij}, b_i)}\right)=\beta_0+\beta_1 x_{ij} + b_i $

The odds of failure in the above model depend on the value of $b_i$, which differs among classrooms, so $\exp(\beta_1)$ is the odds ratio comparing a boy with a girl who share the same $b_i$ (i.e. within the same classroom). Thus the estimates are subject-specific.

GEE, on the other hand, fits a marginal model, which models population averages. You're modeling the expectation conditional only on your fixed design matrix:

$\log \left(\frac{P(Y_{ij}=1\mid x_{ij})}{P(Y_{ij}=0\mid x_{ij})}\right)=\beta_0+\beta_1 x_{ij} $

This is in contrast to mixed-effects models, which, as explained above, condition on both the fixed design matrix and the random effects. So with the marginal model above you're saying, "forget about the differences among classrooms, I just want the population (school-wide) rate of failure and its association with gender." You fit the model and get an odds ratio that is the population-averaged odds ratio of failure associated with gender.
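
To see how the two targets are connected, it may help to write the marginal probability as the conditional one averaged over classrooms. The following is a sketch under an assumption not stated above, namely that the random intercept is normally distributed, $b_i \sim N(0, V)$, and independent of $x_{ij}$; here $\beta_0$ and $\beta_1$ denote the coefficients of the GLMM (conditional) model:

$P(Y_{ij}=1\mid x_{ij}) = \int_{-\infty}^{\infty} \frac{\exp(\beta_0+\beta_1 x_{ij}+b)}{1+\exp(\beta_0+\beta_1 x_{ij}+b)}\,\frac{1}{\sqrt{2\pi V}}\exp\left(-\frac{b^2}{2V}\right)\,db$

Because the inverse logit is nonlinear, the logit of this average is in general not $\beta_0+\beta_1 x_{ij}$ unless $V=0$, so the coefficients of a marginal logistic model fitted to the same data need not equal the conditional ones.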

So you may find that the estimates from your GEE model differ from the estimates from your GLMM model, and that is because they are not estimating the same thing.
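
A small numerical sketch can illustrate the size of the gap. It is not a GEE or GLMM fit: it simply takes a random-intercept logistic model with hypothetical values (`beta0`, `beta1`, `V` are made up for illustration), assumes $b_i \sim N(0, V)$, averages the conditional failure probability over classrooms by numerical integration, and compares the resulting population-averaged log-odds ratio with the conditional one.

```python
import math


def expit(z):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def marginal_prob(eta, V, n=4001, width=10.0):
    """Average expit(eta + b) over b ~ Normal(0, V) with the trapezoidal rule.

    The grid spans +/- `width` standard deviations; `n` and `width` are
    arbitrary choices for this sketch.
    """
    if V == 0:
        return expit(eta)
    sd = math.sqrt(V)
    lo, hi = -width * sd, width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        b = lo + k * h
        density = math.exp(-0.5 * (b / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * expit(eta + b) * density
    return total * h


# Hypothetical conditional (subject-specific) parameters, for illustration only.
beta0, beta1, V = -1.0, 1.0, 4.0

# Conditional log-odds ratio: boy vs. girl within the same classroom.
conditional_log_or = beta1

# Population-averaged log-odds ratio: boy vs. girl across all classrooms.
p_girl = marginal_prob(beta0, V)
p_boy = marginal_prob(beta0 + beta1, V)
marginal_log_or = logit(p_boy) - logit(p_girl)

print(f"conditional log-OR: {conditional_log_or:.3f}")
print(f"marginal log-OR:    {marginal_log_or:.3f}")
```

Under these assumptions the marginal log-odds ratio comes out closer to zero than the conditional one, and the gap grows with `V`; with `V = 0` the two coincide.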

(As far as converting from log-odds-ratio to odds-ratio by exponentiating, yes, you do that whether it's a population-level or subject-specific estimate.)

Some Notes/Literature:

For the linear case (identity link), the population-average and subject-specific coefficients are the same.
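
A one-line derivation may make this concrete. Assume a linear mixed model with identity link, $E(Y_{ij}\mid x_{ij}, b_i)=\beta_0+\beta_1 x_{ij}+b_i$, and assume $E(b_i)=0$ with $b_i$ independent of $x_{ij}$ (assumptions not stated above). Averaging over the random intercept then gives

$E(Y_{ij}\mid x_{ij}) = E\left[E(Y_{ij}\mid x_{ij}, b_i)\mid x_{ij}\right] = \beta_0+\beta_1 x_{ij} + E(b_i) = \beta_0+\beta_1 x_{ij}$

so the marginal mean has the same coefficients as the conditional one. The same step fails for the logit link because the expectation cannot be moved inside a nonlinear function.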

Zeger, et al. 1988 showed that for logistic regression,

$\beta_M\approx \left[ \left(\frac{16\sqrt{3}}{15\pi }\right)^2 V+1\right]^{-1/2}\beta_{RE}$

where $\beta_M$ are the marginal estimates, $\beta_{RE}$ are the subject-specific (random-effects) estimates, and $V$ is the variance of the random effects.
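
To get a feel for the size of this attenuation, the factor in the formula can be evaluated for a few values of $V$. This is a minimal sketch that only transcribes the approximation above; the function name `zeger_attenuation` and the chosen values of `V` are hypothetical.

```python
import math


def zeger_attenuation(V):
    """Approximate factor k such that beta_M ~= k * beta_RE.

    Transcribes [(16*sqrt(3) / (15*pi))^2 * V + 1]^(-1/2), where V is the
    variance of the random effects.
    """
    c = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)
    return 1.0 / math.sqrt(c * c * V + 1.0)


# Illustrative random-effect variances (made up for this example).
for V in (0.0, 0.5, 1.0, 4.0):
    print(f"V = {V}: beta_M ~= {zeger_attenuation(V):.3f} * beta_RE")
```

The factor equals 1 when $V=0$ and shrinks toward 0 as $V$ grows, so the marginal coefficients are pulled toward zero relative to the subject-specific ones.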

Molenberghs and Verbeke (2005) have an entire chapter on marginal vs. random-effects models.

I learned about this and related material in a course based very much off Diggle, Heagerty, Liang, Zeger 2002, a great reference.