I am writing an analysis plan for data collected on approximately 30 people at approximately 5 unevenly spaced time points. I am planning to analyze the data via a repeated measures mixed model, but I am unsure whether time should be treated as a continuous or discrete parameter in the model. I thought it made more sense to model time as continuous due to the uneven spacing of the visits, but a colleague noted that this assumes a linear relationship between the outcome variable and time. What are others' opinions on this? How is power affected by the choice of continuous vs. discrete? Unfortunately, I do not have the benefit of having the actual data to determine the relationship with the best fit. The model needs to be specified in advance.
Solved – modelling time as continuous vs. discrete
mixed-model, repeated-measures, time-series
Related Solutions
It would seem most important to ensure that the groups had similar QoL outcome measures at t1 or t2, as those time points are fixed with respect to the intervention time. Your question says that you might include a t1 versus t0 comparison. That doesn't seem to make much sense in terms of evaluating the intervention, however, as the intervention doesn't happen until after t2.
You might want to examine changes from t0 to t1 in a separate analysis with a continuous measure of time in days. That would help you evaluate whether the two treatment groups were adequately well matched both at enrollment into the study and at the time point t1 that (unlike t0) occurs at a fixed time prior to the intervention. It would also let you see whether there is any systematic change over time in the outcome measure absent the intervention.
If the groups are adequately well matched at t1, however, I don't see any need to use values at t0 as part of evaluating the intervention itself. You might, however, need to evaluate them as part of your quality control.
In response to comments
I think it's important to distinguish the direct effects of the intervention from possible changes in QoL values associated with the treatment-group assignment, presumably done at t0, which might lead to systematic differences between t0 and t1.
With similar distributions of QoL values between the 2 treatment groups at t1, the specific effects of the intervention per se can probably be described as differences between pre-intervention (t1, t2) and post-intervention (t3, t4) QoL values. Think carefully about how you want to do that, as the more coefficients you have to estimate, the lower the power you might have.
For example, might the QoL values at t1 and t2 be considered replicates rather than separate values? Might it make sense to model QoL differences between t1 and t2 against corresponding differences between t3 and t4, both representing 13-day periods? You need to apply your knowledge of the subject matter to make those decisions.
You certainly should examine potential changes between t0 and t1, but such changes would have to do with either the time interval or the group assignment (e.g., due to the potential psychological effects you mention) rather than with the intervention per se. They thus would require a type of explanation other than a direct effect of the intervention.
Don't overthink the t0 to t1 differences. What you presumably want to do is assure yourself and your audience that any such differences between the 2 assignment groups are small enough not to affect your interpretation of the direct intervention effect. Don't worry so much about whether you have the "best" model for the t0 to t1 difference. Just develop one that's adequate to address that potential concern.
A simple analysis of the paired t1 - t0 differences within individuals should be adequate, and would accomplish more simply what you propose in a comment to do with a mixed model. If you are only examining paired t1 - t0 differences, you don't need the time*treatmentgroup interaction, just the treatmentgroup assignment itself. Flexible inclusion of timeddifference in the model of the t1 - t0 QoL paired differences with a regression spline makes sense. You will need more than the 2 degrees of freedom you propose in the model in your comment, however, as that doesn't allow any knots at all. I prefer to model splines with the rcs() function in the R rms package, in part because (unlike ns()) it provides reasonable default parameter settings.
Best Answer
Time as continuous uses one degree of freedom (unless you include polynomials, of course); if it is treated as discrete, each dummy uses a degree of freedom. That may not be a big deal if you have lots of observations.
With 5 time points, you may want to treat it as continuous; in my experience, the more fixed time effects you have, the harder it gets to interpret their meaning, and I always end up looking for some sort of time trend anyway. Any nonlinearity can be handled with a quadratic or higher-order term. Continuous time in a mixed model should be able to handle uneven spacing. That said, I'm not sure why you have to commit now. You can fit both models and do a likelihood-ratio or Wald test to see which fits the data better.
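To make the degrees-of-freedom point and the nested comparison concrete, here is a hypothetical sketch (all numbers invented; plain least squares stands in for the mixed model): with 5 visits, continuous time adds 1 column to the fixed-effect design matrix while discrete time adds 4 dummies, and since the linear model is nested in the discrete one, an F-test on the residual sums of squares compares the two fits.

```python
import numpy as np

rng = np.random.default_rng(1)
days = np.array([0., 14., 30., 61., 95.])   # hypothetical uneven visit spacing
time = np.tile(days, 30)                    # ~30 people x 5 visits, stacked
y = 2 + 0.05 * time + rng.normal(0, 1, time.size)   # simulated outcome

# Continuous time: one column beyond the intercept (1 df for time).
X_cont = np.column_stack([np.ones_like(time), time])
# Discrete time: one dummy per visit after the first (k - 1 = 4 df for time).
X_disc = np.column_stack([np.ones_like(time)] +
                         [(time == d).astype(float) for d in days[1:]])

def rss(X):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

# F-test of the 3 extra discrete-time parameters against the linear fit.
df1 = X_disc.shape[1] - X_cont.shape[1]     # 3 extra parameters
df2 = time.size - X_disc.shape[1]
F = ((rss(X_cont) - rss(X_disc)) / df1) / (rss(X_disc) / df2)
print(X_cont.shape[1] - 1, X_disc.shape[1] - 1)  # time df: 1 vs 4
```

In practice the same comparison would be done as a likelihood-ratio or Wald test on the fitted mixed models, but the df accounting is identical.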
If you really cannot touch the data before making the decision, and once it's decided you have to run with it, then I would recommend relying on theory (which often trumps anything the data will "tell" you). But what kind of situation are you in that you cannot adjust the model after modeling? That goes against all principles of modeling I've ever followed: you define your initial model, test its fit, test your various hypotheses, and adjust the model. I personally would never feel comfortable fully specifying a finalized model before analyzing data.