Solved – How to analyse longitudinal data from a randomised controlled trial in SPSS using mixed effects models

mixed model, random allocation, spss

I am analyzing data from a randomized clinical trial, with 2 intervention groups (placebo and intervention) and repeated measurements over time. I am planning to use linear mixed effects modeling to analyze this longitudinal data and determine whether the intervention causes a change in response over time compared to the control.

More specifically, the outcome variable “six_min_wd” is the walking distance in a standardized walking test (6-minute walking test). I hypothesized that the walking distance will increase in the intervention group over time compared to the control group.

I’ve tested this hypothesis using the following syntax in SPSS:

MIXED six_min_wd BY treatment WITH visit
/FIXED=treatment visit treatment*visit
/METHOD=ML
/PRINT=SOLUTION TESTCOV
/RANDOM=INTERCEPT | SUBJECT(id) COVTYPE(UN)
/REPEATED=visit | SUBJECT(id) COVTYPE(UN).

“Treatment” is a binary variable for the two intervention groups (0=control, 1=intervention) and “visit” a continuous variable for the three repeated measures (at baseline (0), week 1 (1) and week 8 (8)). A significant interaction term “treatment*visit” would tell me that the two intervention groups significantly differ over time. Are those assumptions correct?

Is the /RANDOM subcommand required in this context? From what I understand the /REPEATED subcommand should suffice?

Secondly, I know that my outcome variable (walking distance) is also affected by other variables, such as age (walking distance expected to decrease with age) or BMI (decrease expected with higher BMI). My approach to controlling for these covariates would be to include those variables as additional terms in the /FIXED subcommand:

MIXED six_min_wd BY treatment WITH visit bmi age
/FIXED=treatment visit treatment*visit bmi age
/METHOD=ML
/PRINT=SOLUTION TESTCOV
/RANDOM=INTERCEPT | SUBJECT(id) COVTYPE(UN)
/REPEATED=visit | SUBJECT(id) COVTYPE(UN).

Is this the appropriate way to control for these variables?

I spent quite a lot of time reading about mixed effects models, but a review of the actual approach to my situation would be greatly appreciated, since I might miss something and be completely off with my planned analysis. Many thanks.

Best Answer

Ideally, you should of course specify your unadjusted and adjusted models beforehand. Sound scientific practice requires that the variables for the adjusted model be decided a priori, based on prior knowledge of which variables correlate strongly with the dependent (outcome) variable; they should not be cherry-picked from an analysis of the baseline data. Also beware of the risk of overfitting from throwing too many variables into the model and going back and forth between putting variables in and leaving them out (there is a lot of literature on this topic). Regarding your outcome variable, however, baseline differences between groups should always be adjusted for, as is done in a traditional ANCOVA (e.g. linear regression) (see Altman and Vickers for the traditional points on adjusting for the baseline of the dependent variable, in the BMJ series on statistics).

However, how to specify the model and adjust for baseline differences in the dependent variable between groups is a hot topic when it comes to mixed models (see Twisk 2018, although that article contains some errors, and see the articles and a recently published book on analysing randomized trials with mixed models by the Japanese statistician Toshiro Tango).

So far I am inclined to follow Tango's suggestions, that is, to specify the model along these lines (random intercept only):

Y(ij) = B0 + B1*X + B2*time + B3*time*X + b(i) + e(ij)

where Y(ij) is the outcome/dependent variable (walking distance in your case), measured repeatedly on individual i at time point j, and X is the group indicator (0 for the control group, 1 for the treatment group):

B0 - the mean of the control group at baseline
B1 - the baseline difference between the treatment group and the control group
B2 - the effect of time in the control group, i.e. the post mean in the control group is B0 + B2
B3 - the time*group interaction, i.e. the difference in change over time between treatment and control. This is the coefficient you would normally use as the estimate of the treatment effect compared to control, and to decide whether to reject H0 (that there is no difference between groups).
b(i) - the random intercept, which captures how much each individual's level deviates from the group mean; it is assumed to be normally distributed with mean 0
e(ij) - the error term

(The model is simplified a bit by leaving out a random slope, which is a debate of its own.)
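To make the reading of the coefficients concrete, here is a minimal sketch in Python (not SPSS) that plugs numbers into the fixed part of the model above. The four coefficient values are invented purely for illustration and do not come from any trial; the function name `predicted_mean` is likewise just a placeholder.

```python
# Hypothetical fixed-effect estimates in metres, chosen only for illustration.
B0 = 400.0  # control group mean at baseline
B1 = 5.0    # baseline difference, treatment minus control
B2 = 10.0   # change from pre to post in the control group
B3 = 25.0   # additional change in the treatment group (time*group interaction)


def predicted_mean(x, time):
    """Predicted mean for group x (0 = control, 1 = treatment) at time (0 = pre, 1 = post).

    The random intercept and the error term both have mean 0,
    so they drop out of the group-level mean.
    """
    return B0 + B1 * x + B2 * time + B3 * time * x


# The four cell means implied by the model.
means = {(x, t): predicted_mean(x, t) for x in (0, 1) for t in (0, 1)}

change_control = means[(0, 1)] - means[(0, 0)]
change_treatment = means[(1, 1)] - means[(1, 0)]
diff_in_change = change_treatment - change_control

print(means)
print(diff_in_change)
```

With these made-up numbers, the difference between the two groups' changes over time comes out equal to B3, which is why the interaction coefficient is read as the treatment effect.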

This model is specified with time as pre/post only, but with more repeated measurements you just add further time points and their interactions with group (e.g. B4*time2, B5*time2*X, B6*time3, etc.). If you only have a baseline and one follow-up measurement, then a traditional ANCOVA (regression) might be a better choice than a mixed model. One of the great advantages of mixed models is that they can handle missing data without imputation, as long as you can assume the data are missing at random.
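As a rough illustration of "adding time points", the sketch below builds the fixed-effects design row for each group and visit, using the three visits from the question (baseline, week 1, week 8) and treating time as categorical, with one indicator per follow-up visit. It is written in Python rather than SPSS, and the names `VISITS` and `design_row` are hypothetical.

```python
# Visit codes taken from the question: baseline (0), week 1 (1), week 8 (8).
VISITS = (0, 1, 8)


def design_row(x, visit):
    """Return the fixed-effects row for group x (0 = control, 1 = treatment).

    Columns: intercept, X, week-1 indicator, week-1 indicator * X,
             week-8 indicator, week-8 indicator * X.
    Baseline is the reference level, so both indicators are 0 there.
    """
    week1 = 1 if visit == VISITS[1] else 0
    week8 = 1 if visit == VISITS[2] else 0
    return [1, x, week1, week1 * x, week8, week8 * x]


for x in (0, 1):
    for visit in VISITS:
        print(x, visit, design_row(x, visit))
```

Each follow-up visit contributes one time coefficient and one time*group coefficient, so the group difference in change from baseline is estimated separately for week 1 and week 8.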

Since I do not use SPSS, I cannot help you with the exact syntax for running the model above in SPSS, but I bet others can. I hope this was somewhat helpful, even though this topic can be more confusing than one would expect at first.
