Time-varying covariates in longitudinal mixed-effects models

Tags: lme4-nlme, mixed model, panel data, stata, time-varying-covariate

I am looking for some help with my analysis of longitudinal data with time-varying covariates. I am planning to use R and the lme4 package, but I am also happy to use Stata. I am interested in the relationship between cognition and taking ACE inhibitors over time.

The example dataset is below:

# Long format: 50 people (id) observed at 4 waves, one row per person-wave.
# Each 10-value pattern is repeated 5 times to fill the 50 rows of a wave.
df <- data.frame(
  cognition = rnorm(200) + 2,
  wave = rep(c("W1", "W2", "W3", "W4"), each = 50),
  hypertension = c(rep(c("Y", "N", "N", "N", "Y", "N", "N", "N", "N", "N"), 5),
                   rep(c("Y", "Y", "N", "N", "Y", "N", "N", "N", "N", "N"), 5),
                   rep(c("Y", "Y", "N", "N", "Y", "N", "Y", "Y", "N", "N"), 5),
                   rep(c("Y", "Y", "N", "N", "Y", "N", "Y", "Y", "Y", "N"), 5)),
  diabetes = c(rep(c("N", "N", "N", "N", "Y", "N", "N", "N", "Y", "N"), 5),
               rep(c("Y", "N", "N", "N", "Y", "N", "N", "N", "Y", "N"), 5),
               rep(c("Y", "N", "N", "N", "Y", "N", "Y", "Y", "Y", "N"), 5),
               rep(c("Y", "Y", "N", "N", "Y", "N", "Y", "Y", "Y", "Y"), 5)),
  Smoking = c(rep(c("Current", "Never", "Never", "Former", "Current", "Former", "Former", "Former", "Never", "Current"), 5),
              rep(c("Current", "Never", "Never", "Former", "Current", "Former", "Former", "Former", "Never", "Former"), 5),
              rep(c("Former", "Never", "Never", "Former", "Current", "Current", "Former", "Former", "Never", "Former"), 5),
              rep(c("Former", "Never", "Never", "Current", "Current", "Current", "Former", "Former", "Never", "Former"), 5)),
  TakingACEinh = c(rep(c("Y", "N", "N", "N", "N", "N", "N", "N", "N", "N"), 5),
                   rep(c("Y", "Y", "N", "N", "N", "N", "N", "N", "N", "N"), 5),
                   rep(c("Y", "Y", "N", "N", "Y", "N", "Y", "Y", "N", "N"), 5),
                   rep(c("Y", "N", "N", "N", "Y", "N", "Y", "Y", "Y", "N"), 5)),
  id = rep(1:50, 4)
)



Hypertension is the diagnosis of hypertension at each wave (time point). Once a person has been diagnosed, they cannot go back to being non-hypertensive; the same is true for diabetes. Other variables, such as smoking, can change in either direction across waves. The same holds for taking ACE inhibitors: someone can take the drug in one wave but not in others. How do I model these variables in my mixed-effects model?

I was thinking of two approaches:
1) Keep the data as is and use lme4, although I am still not sure which of the following models is correct:

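library(lme4)
# Model A: random intercept per person
lmer(cognition ~ factor(wave) + hypertension + Smoking + diabetes + TakingACEinh + (1|id), data = df)
# Model B: additionally a random intercept and random ACE-inhibitor slope across waves
# (the grouping variable is `wave`, lower case, as in the data frame)
lmer(cognition ~ factor(wave) + hypertension + Smoking + diabetes + TakingACEinh + (1|id) + (1+TakingACEinh|wave), data = df)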

2) Recode the variable hypertension to indicate whether a person is 0 = non-hypertensive, 1 = newly hypertensive, or 2 = previously and currently hypertensive, and fit the models again using the code above.
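
The recoding in approach 2 could look roughly like the sketch below. It uses a small hypothetical data frame (`toy`, with made-up values) rather than `df`, and it assumes that the first wave at which a person is observed with "Y" counts as "newly hypertensive", even when that wave is the baseline. The helper `recode_htn` and the column `htn_status` are names introduced here for illustration only.

```r
# Toy long-format data (hypothetical values): 3 people, 4 waves each
toy <- data.frame(
  id   = rep(1:3, each = 4),
  wave = rep(1:4, times = 3),
  hypertension = c("N", "N", "N", "N",   # never diagnosed
                   "N", "Y", "Y", "Y",   # diagnosed at wave 2
                   "Y", "Y", "Y", "Y")   # already "Y" at the first wave
)

# Make sure each person's rows are in wave order before looking backwards
toy <- toy[order(toy$id, toy$wave), ]

# 0 = not hypertensive, 1 = first wave observed with "Y" (assumed "new"),
# 2 = "Y" now and at an earlier wave
recode_htn <- function(h) {
  y <- h == "Y"
  as.integer(y) + as.integer(y & cumsum(y) > 1L)
}

# Apply the recoding within each person and put the pieces back in row order
toy$htn_status <- unsplit(lapply(split(toy$hypertension, toy$id), recode_htn),
                          toy$id)

# Labelled factor version for use as a categorical predictor
toy$htn_status_f <- factor(toy$htn_status, levels = 0:2,
                           labels = c("none", "new", "established"))
print(toy)
```

With real data, a person who is already hypertensive at the first wave may need a separate category, since it is unknown whether that diagnosis is new.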

If anyone has any suggestions on how to model and analyse this type of data please let me know and thanks for your help.

Best Answer

Dealing with time-varying covariates is a challenging task, in mixed models and in general. A few points to consider:

  • I would differentiate between time-varying covariates, such as smoking, and intermediate events, such as hypertension in your example.
  • For time-varying covariates, you first need to consider whether they are endogenous or exogenous. Loosely speaking, a time-varying covariate is exogenous if its current value at time $t$ is associated only with its own previous values at time points $0 \leq s < t$, and not with the values of the outcome at those earlier time points. Otherwise the covariate is endogenous. Endogenous covariates are in general more difficult to handle and require specialized models, such as joint models or marginal structural models.
  • An additional challenge with time-varying covariates is the functional form. If you just include smoking as a time-varying covariate in your mixed model, you assume a cross-sectional type of relationship, namely that cognition at time $t$ is associated only with smoking at the same time point $t$. But cognition at $t$ could also be associated with smoking at previous time points. For example, cognition at $t$ may depend not on whether you smoke at time $t$ but rather on how much you have smoked up to $t$. In that case, you will need to construct a new time-varying covariate, namely cumulative smoking.
  • For intermediate events you have similar considerations regarding endogeneity. But instead of including such an event just as a covariate in the model, it would perhaps be more logical to assume that it interacts with time, i.e., that after the intermediate event has occurred you have a change in the slope of cognition.
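
The last two points can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame `toy` holds made-up values, waves are treated as equally spaced and numeric, cumulative smoking is approximated by the number of waves observed as a current smoker, and the column names `cum_smoking`, `htn` and `time_since_htn` are introduced here rather than taken from the question.

```r
# Toy long-format data (hypothetical values): 2 people, 4 waves each
toy <- data.frame(
  id   = rep(1:2, each = 4),
  wave = rep(1:4, times = 2),
  smoking      = c("Never", "Current", "Current", "Former",
                   "Current", "Current", "Former", "Former"),
  hypertension = c("N", "N", "Y", "Y",
                   "N", "N", "N", "N")
)
toy <- toy[order(toy$id, toy$wave), ]

# Cumulative exposure: number of waves, up to and including the current one,
# at which the person was a current smoker
toy$cum_smoking <- ave(as.integer(toy$smoking == "Current"), toy$id,
                       FUN = cumsum)

# Intermediate event: indicator of being diagnosed (allows a level shift) ...
toy$htn <- as.integer(toy$hypertension == "Y")

# ... and waves elapsed since the first diagnosis (0 before and at the
# diagnosis wave), whose coefficient is the change in slope after the event
toy$time_since_htn <- ave(toy$htn, toy$id,
                          FUN = function(h) pmax(cumsum(h) - 1L, 0L))

# Model formula only: the toy data have no outcome and are far too small to
# fit. With real data this could be passed to lme4::lmer(f, data = ...).
f <- cognition ~ wave + cum_smoking + htn + time_since_htn + (1 | id)
print(toy)
```

Here the coefficient of `wave` is the slope before the event and the coefficient of `time_since_htn` is the additional slope afterwards; writing the model with a `wave * htn` interaction is an alternative parameterization of the same idea. None of this addresses endogeneity, which would still need one of the specialized approaches mentioned above.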