Solved – How many data points do we need for mixed effects longitudinal data

mixed model, panel data, r

I am collecting longitudinal data over 4 time waves. Although the survey is administered to the same population, different individuals may decide to complete it at each time point. As a result, some individuals completed it only once, others twice, some three times, and others participated in all four waves. For example, at the moment there are about 2000 participants at time 1 and 1900 at time 2, but only 1200 who participated at both time 1 and time 2 (I am still collecting data for time 3, so I don't know yet what the final matched sample will be).

The data come from different organizations, so I would like to fit a mixed-effects model with lmer in R, e.g.


lmer(outcome ~ <some repeated variables> + <organization level variables> + timewave + (timewave | subject) + (1 | organizations))

My questions are

  • Do I need to remove individuals who completed it only once or twice to use a random slope for time?
  • Is it meaningful to also try to fit a quadratic effect for time given that there are only 4 waves? (And would I need to remove subjects who have not participated in all four?)

Many thanks,

George

    Best Answer

    • No, you don't need to remove individuals with data for only one (or only a limited number) of timepoints. You're right to think that individuals with only one timepoint contribute nothing to estimation of the slope, but they do contribute to estimation of the intercept, and you want to estimate both jointly. The maths and the algorithm deal with this, so you don't need to worry about it. If you try to second-guess things by dropping observations you don't think will contribute, you're more likely to make errors than the programmers of lmer are.

    • Yes, you could fit a quadratic effect for time with 4 timepoints. In fact, if you designed an experiment to look for a quadratic effect, you might well choose to have 4 timepoints. (3 would in principle maximise your power, but a quadratic curve has 3 parameters, so 4 timepoints leave 1 d.f. to test the fit of the quadratic curve.) Clearly you need at least some people to participate in at least 3 waves. But as above, don't remove subjects who didn't.
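
To illustrate both points, here is a minimal simulated sketch in R using lme4. Every name and number in it (300 subjects, 20 organizations, a 70% chance of completing each wave, the effect sizes) is an assumption made up for the illustration, not something taken from the question. It fits the random-slope model to everyone who completed at least one wave, checks that no rows or subjects were dropped, and then adds a quadratic fixed effect for time.

```r
library(lme4)

set.seed(1)

# Hypothetical design: 300 subjects nested in 20 organizations, 4 waves (0..3).
n_subj   <- 300
n_org    <- 20
subj_org <- sample(seq_len(n_org), n_subj, replace = TRUE)

d <- expand.grid(subject = seq_len(n_subj), timewave = 0:3)
d$organization <- subj_org[d$subject]

# Simulated (assumed) subject-level intercepts and slopes, and
# organization-level intercepts.
u0 <- rnorm(n_subj, sd = 1.0)
u1 <- rnorm(n_subj, sd = 0.5)
v0 <- rnorm(n_org,  sd = 0.7)
d$outcome <- 2 + 0.5 * d$timewave +
  u0[d$subject] + u1[d$subject] * d$timewave +
  v0[d$organization] + rnorm(nrow(d), sd = 1.0)

# Unbalanced participation: each subject completes each wave with
# probability 0.7, so subjects end up with 1, 2, 3 or 4 observations
# (the few with none simply never appear in the data).
d <- d[runif(nrow(d)) < 0.7, ]
d$subject      <- factor(d$subject)
d$organization <- factor(d$organization)

# How many subjects completed 1, 2, 3 or 4 waves?
print(table(table(d$subject)))

# Random intercept and slope for time per subject, random intercept per
# organization. Nobody is removed beforehand.
fit_lin <- lmer(outcome ~ timewave + (timewave | subject) + (1 | organization),
                data = d)

# Every row and every subject, including single-wave ones, is in the fit.
print(nobs(fit_lin) == nrow(d))
print(ngrps(fit_lin)[["subject"]] == nlevels(d$subject))

# Same model with an added quadratic fixed effect for time.
fit_quad <- lmer(outcome ~ timewave + I(timewave^2) +
                   (timewave | subject) + (1 | organization),
                 data = d)
print(fixef(fit_quad))
```

With real data, the same two checks (nobs() against the number of rows, ngrps() against the number of distinct subjects) are a quick way to confirm that lmer kept the single-wave participants.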