R – How to Use Quadratic Terms in GLMER for Logistic Mixed Models

lme4-nlme, logistic, mixed model, panel data, r

I'm looking for references that explain, step by step, how to fit a logistic regression to longitudinal data (repeated measurements) in R.
I know that I can use the glmer function from the lme4 package for generalized linear mixed models, and I use it to add random effects. But from what I've read, sometimes people add quadratic terms and multiply and sometimes they don't. Can someone clarify this for me?

For example, in the book "Applied Longitudinal Analysis" by Fitzmaurice, in chapter 14.7, he fits the logistic regression for a dataset like this:

model1 <- glmer(y ~ time + time2 + trt.time + trt.time2 + (1 | id), family=binomial, nAGQ=50, na.action=na.omit)

where:

  • $\texttt{time2 <- time}^2$

  • $\texttt{trt.time <- trt }\times\texttt{ time}$

  • $\texttt{trt.time2 <- trt } \times\texttt{ time2}$

Why doesn't he simply use model1 <- glmer(y ~ time + trt + (1 | id), family=binomial, nAGQ=50, na.action=na.omit)?
I ran this last model in R and the AIC and BIC are basically the same.

These are the kinds of questions I have about the R code for this topic. I can't find much literature that uses R for logistic regression on longitudinal data, and it is all very confusing. Can someone explain to me when to use quadratic terms, or multiply/add them? Or recommend a reference?

I also found this topic but it didn't help much.

Best Answer

Quadratic terms

sometimes people add quadratic terms and multiply and sometimes they don't

Change over time

The quantity being modeled might change over time. See, for instance, the plot below of your data (available via the page of the author of the book that you link to, https://content.sph.harvard.edu/fitzmaur/ala2e/ and https://content.sph.harvard.edu/fitzmaur/ala2e/R_sect_14_7.html):

[Figure: fractions]

Typical logistic regression

In a typical simple logistic regression that includes only a linear function of time (note that this linear function is wrapped inside the non-linear inverse link function), the fraction/probability of the binary outcome is modeled as a logistic curve:

$$p = \underbrace{ \text{logistic}(\underbrace{\beta_0 + \beta_1 \times time}_{\text{linear part}})}_{\text{non-linear function}} = \frac{1}{1+\text{exp}(-\beta_0 - \beta_1 \times time)}$$

A fit to the data looks like this:

[Figure: simple linear function]
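As a small numerical check of the formula above, the sketch below evaluates the logistic curve by hand and with base R's `plogis`. The coefficient values `beta0` and `beta1` are invented for illustration and are not estimates from the amenorrhea data.

```r
# Hypothetical coefficients, chosen only for illustration
beta0 <- -2
beta1 <- 0.5
time  <- 0:3

eta      <- beta0 + beta1 * time   # linear part (the log odds)
p_manual <- 1 / (1 + exp(-eta))    # logistic function written out
p_plogis <- plogis(eta)            # base R's built-in logistic function

all.equal(p_manual, p_plogis)      # both give the same probabilities
qlogis(p_plogis)                   # back on the log-odds scale: linear in time
```

On the probability scale the curve is bent, but `qlogis` recovers the straight line `beta0 + beta1 * time`.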

More flexibility as a function of time by adding a quadratic term

In the example above, the logistic curve can only be fitted by stretching and shifting it. Adding a quadratic term allows the change over time to be expressed with more flexibility, which improves the fit.

[Figure: with quadratic term]

The effect might be a bit difficult to see because neither curve is linear (both are wrapped in the inverse link function). When we plot the log odds instead, it becomes clearer:

[Figure: log odds]
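The comparison of a linear and a quadratic time trend on the log-odds scale can be sketched as follows. This uses simulated data, not the book's dataset; the "true" coefficients in the simulation are assumptions made up for the example, and `glm` is used (as in the plots) rather than `glmer`.

```r
set.seed(1)

# Simulated data: four time points, with curvature in the assumed true log odds
d   <- data.frame(time = rep(0:3, each = 500))
d$y <- rbinom(nrow(d), size = 1,
              prob = plogis(-2 + 1.5 * d$time - 0.3 * d$time^2))

fit_lin  <- glm(y ~ time,             family = binomial, data = d)
fit_quad <- glm(y ~ time + I(time^2), family = binomial, data = d)

# Observed and fitted log odds at each time point
newd <- data.frame(time = 0:3)
log_odds <- cbind(
  observed  = qlogis(tapply(d$y, d$time, mean)),
  linear    = predict(fit_lin,  newdata = newd, type = "link"),
  quadratic = predict(fit_quad, newdata = newd, type = "link")
)
round(log_odds, 2)
```

In the `linear` column the fitted log odds change by the same amount at every step, while the `quadratic` column can bend to follow the observed values.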

Multiply with other factors

multiply and sometimes they don't

Interactions

This particular dataset covers two different treatments (two different doses). When you plot the fractions separately for the two treatments, you can see that the dependence on time differs between them.

[Figure: with treatment effect]

Note that the multiplication trt * time is done with a variable trt that takes only the value 0 or 1. Sometimes these models use cross terms with variables that have multiple levels, in which case the multiplication must be done separately for each dummy variable (see dummy coding).
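To make the multiplication concrete, here is a minimal sketch showing that the hand-built columns from the question give the same design matrix as R's formula operators. The small grid `d` and the three-level factor `dose` are hypothetical and only serve the illustration.

```r
# Hypothetical grid of time points and a 0/1 treatment indicator
d <- expand.grid(time = 0:3, trt = 0:1)

# Manual columns, as in the question
d$time2     <- d$time^2
d$trt.time  <- d$trt * d$time
d$trt.time2 <- d$trt * d$time2
X_manual  <- model.matrix(~ time + time2 + trt.time + trt.time2, data = d)

# The same columns built by the formula interface
X_formula <- model.matrix(~ time + I(time^2) + trt:time + trt:I(time^2), data = d)

all.equal(X_manual, X_formula, check.attributes = FALSE)

# With a factor that has more than two levels, R creates one dummy variable
# per non-reference level and multiplies each of them with time
d3 <- expand.grid(time = 0:3, dose = factor(c("A", "B", "C")))
colnames(model.matrix(~ time * dose, data = d3))
```

So the products do not have to be computed by hand; writing the interaction in the formula gives the same model.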


When to use

Can someone explain to me when to use quadratic terms, or multiply/add them? Or recommend a reference?

The book "Applied Longitudinal Analysis" by Fitzmaurice that you refer to explains it. See the example R code, where different models are compared.

the AIC and BIC are basically the same

AIC, BIC, and the F-test are different ways to compare models. In the example from the book they seem to use an F-test. Yes, the AIC and BIC might be basically the same, but the quadratic model does provide a better fit (the AIC and BIC look the same because both values are so large, but the difference in log-likelihood, about 6, is relatively large).
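A sketch of such a comparison on simulated data is given below. Note that I use a likelihood-ratio (chi-square) test here rather than an F-test, and plain `glm` fits rather than `glmer`; the simulated coefficients are assumptions, so the numbers will not match those from the book.

```r
set.seed(2)

# Simulated data with an assumed quadratic trend in the log odds
d   <- data.frame(time = rep(0:3, each = 500))
d$y <- rbinom(nrow(d), size = 1,
              prob = plogis(-2 + 1.5 * d$time - 0.3 * d$time^2))

fit_lin  <- glm(y ~ time,             family = binomial, data = d)
fit_quad <- glm(y ~ time + I(time^2), family = binomial, data = d)

# The AIC/BIC values are large, so they can look alike at first glance
AIC(fit_lin, fit_quad)
BIC(fit_lin, fit_quad)

# What matters is the absolute difference in log-likelihood
ll_diff <- as.numeric(logLik(fit_quad) - logLik(fit_lin))
ll_diff

# Likelihood-ratio test: 2 * ll_diff compared with chi-square on 1 df
pchisq(2 * ll_diff, df = 1, lower.tail = FALSE)
anova(fit_lin, fit_quad, test = "Chisq")
```

With one extra parameter, a log-likelihood difference of about 6 would correspond to a test statistic of about 12 on 1 degree of freedom, which is large even if the AIC values differ only in their last digits.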

You should be very careful with the interpretation. These tests may give small p-values, which means that you can predict the values for the four individual times very well, but the model may still be highly biased for other values (and interpolation and extrapolation may be completely/extremely wrong).

In this case, with just four time points, I would personally not model the fraction of amenorrhea as a function of time, or at least I would not use a function more complex than a linear one. If I wanted more flexibility as a function of time, I would turn the time variable into a categorical variable. Using a quadratic function for just four time points is a bit meaningless: it will give a better fit, but that is just over-fitting, and one should not treat the model as correct enough to justify interpolation or extrapolation.
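Treating time as a categorical variable could look like the sketch below (simulated data again, with made-up coefficients). With one parameter per time point the model simply reproduces the observed fraction at each time, and it makes no claim about what happens between or beyond those times.

```r
set.seed(3)

# Simulated data at four time points (assumed coefficients, for illustration)
d   <- data.frame(time = rep(0:3, each = 300))
d$y <- rbinom(nrow(d), size = 1,
              prob = plogis(-2 + 1.5 * d$time - 0.3 * d$time^2))

# Time as a factor: one parameter per time point
fit_cat <- glm(y ~ factor(time), family = binomial, data = d)

observed <- tapply(d$y, d$time, mean)
fitted_p <- predict(fit_cat, newdata = data.frame(time = 0:3), type = "response")
cbind(observed, fitted_p)   # the two columns agree
```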


In the above I used the glm function instead of the glmer function because the fit by glmer is less intuitive: it will not overlap with the observed fraction of amenorrhea, because the random offsets for the different individuals (who may have one or more NA values) change the fit so much that the predicted means do not closely match the observed means.
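One likely contributor to this mismatch can be illustrated without fitting a mixed model at all. In a random-intercept logistic model the fixed-effect curve describes an individual whose random offset is zero, while the observed fractions are averages over all individuals. The sketch below uses assumed values for the fixed effects and for the random-intercept standard deviation `sigma_b`; none of them come from the book's fit.

```r
set.seed(4)

sigma_b <- 2                       # assumed SD of the random intercepts
b       <- rnorm(1e5, mean = 0, sd = sigma_b)

time <- 0:3
eta  <- -3 + 0.8 * time            # assumed fixed-effect log odds

# Curve for an individual with random offset zero
conditional <- plogis(eta)

# Population average: mean probability over the random offsets
marginal <- sapply(eta, function(e) mean(plogis(e + b)))

round(cbind(time, conditional, marginal), 3)
```

The averaged probabilities are pulled toward 0.5, so a curve drawn from the fixed effects alone will not pass through the observed means even when the model fits well.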
