Linear mixed model for repeated measures analysis with missing values

anova, missing data, mixed model, repeated measures

We ran an experiment comparing three interaction techniques for a 3D docking task. We had two factors: the technique type and the direction of translation (i.e., whether participants had to move an object that appeared close to their viewpoint farther away in depth, or vice versa). Each trial was repeated 5 times.

Some of those trials were skipped because of their difficulty. If I run a regular repeated measures ANOVA, every participant with even a single missing value will be dropped from the analysis. This means that I'd have to remove more than half of the participants. From what I have read, it seems I can use a linear mixed model instead.
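To make the problem concrete, here is a minimal sketch with made-up data. The participant IDs, technique labels, and the constant time value are all hypothetical, not taken from the experiment. It contrasts listwise deletion, as a regular repeated measures ANOVA would do it, with keeping every observed trial in long format, which is the layout a mixed model works from.

```python
from itertools import product

# Hypothetical design: 4 participants x 3 techniques x 2 directions x 5 repetitions.
participants = ["P1", "P2", "P3", "P4"]
techniques = ["A", "B", "C"]
directions = ["near_to_far", "far_to_near"]
repetitions = range(1, 6)

# Hypothetical skipped trials (participant, technique, direction, repetition).
skipped = {
    ("P2", "C", "near_to_far", 4),
    ("P3", "B", "far_to_near", 1),
    ("P3", "C", "near_to_far", 5),
}

# Long format: one row per trial; None marks a skipped trial.
trials = []
for pid, tech, direc, rep in product(participants, techniques, directions, repetitions):
    time = None if (pid, tech, direc, rep) in skipped else 10.0  # placeholder value
    trials.append({"id": pid, "technique": tech, "direction": direc,
                   "repetition": rep, "time": time})

# Listwise deletion: drop every participant who has at least one missing trial.
incomplete_ids = {row["id"] for row in trials if row["time"] is None}
complete_case_rows = [row for row in trials if row["id"] not in incomplete_ids]

# Available-data approach: drop only the missing trials themselves.
available_rows = [row for row in trials if row["time"] is not None]

print(len(trials), len(complete_case_rows), len(available_rows))
```

In this toy example three skipped trials cost half of the data under listwise deletion, while the long-format data set loses only those three rows.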

My question is: can I use a mixed model in this type of situation? I am confused as to whether mixed models are only relevant when you have a between-subjects factor, such as the classic treatment/control groups. In my case every participant was subjected to the same conditions. There were no between-subjects factors.

I ran the mixed model analysis using Technique, Direction and Repetition as repeated variables, the ID of each participant as the subject, and Technique and Direction as fixed factors.
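As far as I understand it, a specification like this roughly corresponds to the marginal model below. This is only a sketch in my own notation, not output from the software, and the form of the covariance matrix is an assumption that depends on the covariance structure selected.

```latex
y_{i} = X_{i}\beta + \varepsilon_{i}, \qquad \varepsilon_{i} \sim \mathcal{N}(0, R_{i})
```

Here $y_{i}$ is the vector of observed trials for participant $i$, $X_{i}$ codes the fixed factors Technique and Direction, $\beta$ holds the fixed effects, and $R_{i}$ is the covariance among that participant's repeated measures, with the rows and columns for skipped trials simply left out.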

Are my assumptions correct, or did I make a terrible mistake?

If I did, what alternatives do I have for dealing with missing values?

Thanks!

Best Answer

Here is a background paper, "Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis" with full text available at http://www.sciencedirect.com/science/article/pii/S0047259X09001079 .

To quote some pertinent comments from this reference:

"When the population follows a confirmatory factor model, and data are missing due to the magnitude of the factors, the MLE may not be consistent even when data are normally distributed. When data are missing due to the magnitude of measurement errors/uniqueness, MLEs for many of the covariance parameters related to the missing variables are still consistent."

The aforementioned paper identifies and discusses factors that impact the asymptotic biases of the MLE for data that are not missing at random.

A technique that I believe has value in providing a solution is to replace the missing values; see "What to Do about Missing Values in Time-Series Cross-Section Data", full text available at http://gking.harvard.edu/files/pr.pdf

To quote the author, the paper suggests the "concept of “multiple imputation,” a well-accepted and increasingly common approach to missing data problems in many fields. The idea is to extract relevant information from the observed portions of a data set via a statistical model, to impute multiple (around five) values for each missing cell, and to use these to construct multiple “completed” datasets."
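The workflow described in that quote can be sketched as follows. This is only an illustration under strong assumptions: the data values are made up, the function names `impute_once` and `pool_rubin` are hypothetical, and the imputation model is deliberately naive (draws from a normal distribution fitted to the observed values, ignoring participants, conditions, and parameter uncertainty). It is not the model proposed in the paper. The pooling step uses Rubin's rules, the standard way to combine estimates across completed datasets.

```python
import random
import statistics

def impute_once(values, rng):
    """Fill each missing value (None) with a draw from N(mean, sd) of the observed values.
    Naive imputation model, used here only for illustration."""
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    sd = statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mu, sd) for v in values]

def pool_rubin(estimates, variances):
    """Combine per-dataset estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total

rng = random.Random(0)                 # fixed seed so the sketch is reproducible
times = [12.1, 9.8, None, 11.4, 10.3, None, 13.0, 9.5]  # made-up completion times
m = 5                                  # "around five" imputations, as in the quote

estimates, variances = [], []
for _ in range(m):
    completed = impute_once(times, rng)
    estimates.append(statistics.fmean(completed))                     # estimate: the mean
    variances.append(statistics.variance(completed) / len(completed)) # its squared standard error

pooled_mean, total_variance = pool_rubin(estimates, variances)
print(pooled_mean, total_variance ** 0.5)
```

The between-imputation term is what separates this from filling in a single value: it inflates the standard error to reflect uncertainty about the missing cells.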