Solved – Dealing with missing data in Repeated Measures ANOVA

anova, missing data

Hi 🙂 I have a data set comprised of:

different subjects, each tested within subject (three time points) on a continuous scale, on various (unrelated) dependent measures.

Some of these subjects joined the study late, so they are missing the first within-subject time point, or sometimes the first two.

How should I handle this missing data? Just ignore it? Use RM ANOVA, or a mixed design?

Thank you!

Best Answer

Just ignoring missing data (i.e. analyzing only the observed data) assumes that the observed data are completely representative of the missing data, which requires that the missingness has no connection whatsoever with the outcomes you are interested in (this is called "missing completely at random", MCAR). This is very rarely the case. Additionally, even if MCAR did hold, so that analyzing only the complete cases were valid, it would not be the most efficient analysis (a mixed model assuming MAR, see below, is usually more efficient).
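To make the MCAR point concrete, here is a minimal simulation sketch in Python (standard library only). Everything in it is an assumption made for illustration: the function name `simulate_time1_means`, the missingness rates, and the data-generating model (a shared subject effect plus noise). It compares the time-1 mean over all subjects with the mean over only those subjects whose time-1 value was observed, once when missingness is unrelated to anything (MCAR) and once when it depends on the observed time-3 value (MAR).

```python
import random
import statistics


def simulate_time1_means(n_subjects, mechanism, seed=0):
    """Return (mean of all time-1 values, mean of the observed time-1 values).

    Hypothetical setup: each subject has a latent level shared by time 1 and
    time 3, and "late joiners" are missing their time-1 value.
    """
    rng = random.Random(seed)
    all_t1, observed_t1 = [], []
    for _ in range(n_subjects):
        subject_level = rng.gauss(0.0, 1.0)
        t1 = subject_level + rng.gauss(0.0, 0.5)
        t3 = subject_level + rng.gauss(0.0, 0.5)
        if mechanism == "MCAR":
            # Missingness is unrelated to any outcome.
            missing = rng.random() < 0.3
        elif mechanism == "MAR":
            # Missingness depends only on the (always observed) time-3 value.
            missing = rng.random() < (0.6 if t3 > 0 else 0.1)
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MAR'")
        all_t1.append(t1)
        if not missing:
            observed_t1.append(t1)
    return statistics.mean(all_t1), statistics.mean(observed_t1)


for mechanism in ("MCAR", "MAR"):
    full_mean, observed_mean = simulate_time1_means(20000, mechanism)
    print(mechanism, "full:", round(full_mean, 3), "observed only:", round(observed_mean, 3))
```

In the MCAR setting the two means should agree closely. In the MAR setting the observed-only mean drifts away from the full mean, which is the bias you accept when you simply ignore the missing values.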

Fitting a mixed effects model, which implicitly imputes the missing values, assumes that missingness can be explained by randomness, the model covariates, and the observed values (this is called "missing at random", MAR). An analysis that is valid under MAR is also valid under MCAR (MCAR is a special case of MAR). There are also other options besides a mixed model, e.g. doing some kind of multiple imputation (possibly with more variables in the imputation model than in the analysis model) and then doing an analysis by time point.
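As a rough sketch of the mixed model route, the snippet below fits a random-intercept model with `statsmodels` on data in long format, where a late joiner simply has no row for the visit they missed. The data are simulated, and the column names (`subject`, `time`, `score`), the effect sizes, and the "every fourth subject joined late" rule are all made up for illustration. A random intercept is only one possible covariance structure; an unstructured covariance across the three time points may fit your data better.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects = 60

rows = []
for subject in range(n_subjects):
    subject_effect = rng.normal(0.0, 1.0)  # random intercept for this subject
    joined_late = subject % 4 == 0         # hypothetical: every 4th subject joined late
    for time in (1, 2, 3):
        if joined_late and time == 1:
            continue  # late joiners contribute no row for the first time point
        score = 10.0 + 0.5 * time + subject_effect + rng.normal(0.0, 0.5)
        rows.append({"subject": subject, "time": time, "score": score})

long_df = pd.DataFrame(rows)

# Time as a categorical fixed effect, random intercept per subject.
# Subjects with only two observations still contribute to the fit.
model = smf.mixedlm("score ~ C(time)", data=long_df, groups=long_df["subject"])
result = model.fit()
print(result.summary())
```

The point of the long format is that nobody gets dropped: a subject with two of three time points still informs the estimates, which is where the efficiency gain over a complete-case RM ANOVA comes from.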

You can, to some extent, distinguish MAR from MCAR based on your data, but you cannot tell from the data whether, instead of one of these two situations, you have a "missing not at random" (MNAR) situation, in which neither of the two analysis options mentioned above would be valid. With people joining late, you will have to think about whether it seems plausible that joining late has very little to do with the outcomes that are missing (or perhaps these people differ in their observed characteristics, but you think it's plausible that this is the main difference).
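One simple way to probe MCAR against MAR is to check whether late joiners differ from everyone else on something you did observe, for example the time-3 score. The sketch below computes a Welch t statistic with the standard library; the helper name `welch_t` and the numbers are hypothetical and only show the mechanics. A large absolute value is evidence against MCAR, but a small one says nothing about MNAR.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in means of two samples."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical time-3 scores, split by whether the subject joined late.
late_joiners_t3 = [12.1, 13.4, 11.8, 14.0, 12.9]
on_time_t3 = [10.2, 11.1, 9.8, 10.9, 10.5, 11.3, 10.0]

t_stat = welch_t(late_joiners_t3, on_time_t3)
print("Welch t for late joiners vs. on-time subjects:", round(t_stat, 2))
```

The same comparison can be run on baseline covariates (age, group, and so on). If late joiners look different on those, including them in the model or in the imputation model makes the MAR assumption more plausible.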
