Two-Way ANOVA – Conducting Repeated Measures ANOVA with Unbalanced Data

anova, r, repeated measures

I have a system / operator repeated measures data set for which I would like to answer the questions:

  • Is system 1 quicker than system 2?
  • Is operator 1 quicker than operator 2?
  • Does the order of using the systems matter?

So I've hopefully understood the problem correctly to be a "system vs. operator with interactions" question.

The measurements are randomised with regard to both system and operator, but unbalanced, i.e., the number of measurements differs between operators and between the orders in which the systems were used. (Unfortunate, but that is how it went, and it is somewhat related to the randomisation.)

The data is in the format:

    subject time1   time2   order   operator
1   1       1269    422     1       2
2   2       795     327     2       2
3   3       1866    551     1       2
4   4       1263    382     2       2
5   5       1602    438     1       2
6   6       1359    423     2       2
7   7       850     415     2       1
8   8       1080    370     1       2
9   9       568     278     2       1
10  10      582     308     2       1
11  11      1094    654     2       2
...

I reformatted the data with the make.rm function (http://cran.r-project.org/doc/contrib/Lemon-kickstart/makerm.R) so that there is one condition per line. According to the comments in the make.rm function, this layout allows a "univariate" repeated measures analysis of the data.

Therefore the data is now:

    subject order   operator    repdat  contrasts
1   1       1       2           1269    T1
2   2       2       2           795     T1
...
x   1       1       2           422     T2
y   2       2       2           327     T2
...
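For reference, the same wide-to-long step can be sketched with base R's reshape(). This is only an illustration: it assumes the 11 rows shown above stand in for the full data set, and the object names orig.data and data.formatted are chosen to match the calls used elsewhere in this thread.

```r
# Wide data: the 11 rows shown above (the full data set has more rows).
orig.data <- data.frame(
  subject  = 1:11,
  time1    = c(1269, 795, 1866, 1263, 1602, 1359, 850, 1080, 568, 582, 1094),
  time2    = c(422, 327, 551, 382, 438, 423, 415, 370, 278, 308, 654),
  order    = c(1, 2, 1, 2, 1, 2, 2, 1, 2, 2, 2),
  operator = c(2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 2)
)

# One row per subject and system: time1/time2 are stacked into repdat,
# and contrasts records which system (T1 or T2) the time belongs to.
data.formatted <- reshape(
  orig.data,
  direction = "long",
  varying   = c("time1", "time2"),
  v.names   = "repdat",
  timevar   = "contrasts",
  times     = c("T1", "T2"),
  idvar     = "subject"
)

# aov() needs grouping variables as factors, not numeric codes.
for (v in c("subject", "order", "operator", "contrasts")) {
  data.formatted[[v]] <- factor(data.formatted[[v]])
}

head(data.formatted)
```

Converting subject, order, operator and contrasts to factors matters because they are stored as numbers or strings, and aov() would otherwise treat the numeric ones as continuous covariates.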

And then I perform the analysis of variance with this data:

analysis <- aov(repdat ~ order * operator + Error(subject / (order * operator)), data = data.formatted)

This produces a result, but also the message "Error() model is singular".

So my questions are:

  • Have I understood the problem correctly? Am I even doing it right?
  • Are the results valid?
  • Why do I get the message "Error() model is singular"?
  • What impact does the unbalanced nature of the data play?

Best Answer

You are using the syntax for within-subject 2-way repeated measures ANOVA. That would assume that all combinations of order and operator are repeated within each subject. That's not what you have. I think some degree of missingness might be OK, but in your data each subject has only one combination of order and operator, so the within-subject variance of those effects cannot be assessed.
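One way to see this is to count, for each subject, how many order/operator combinations and how many systems were observed. The sketch below is an assumption-laden illustration: it rebuilds only the design columns of the long data from the 11 rows shown in the question (the object name long is hypothetical), so the full data set should be checked the same way.

```r
# Design columns of the long data, rebuilt from the 11 rows in the question.
long <- data.frame(
  subject   = factor(rep(1:11, times = 2)),
  order     = factor(rep(c(1, 2, 1, 2, 1, 2, 2, 1, 2, 2, 2), times = 2)),
  operator  = factor(rep(c(2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 2), times = 2)),
  contrasts = factor(rep(c("T1", "T2"), each = 11))
)

# Number of distinct order/operator cells observed per subject.
cells    <- with(long, table(subject, interaction(order, operator)))
n_combos <- rowSums(cells > 0)

# Number of distinct systems (contrasts levels) observed per subject.
n_systems <- rowSums(with(long, table(subject, contrasts)) > 0)

all(n_combos == 1)   # TRUE: order and operator vary only between subjects
all(n_systems == 2)  # TRUE: system is the only within-subject factor
```

Because order and operator are constant within each subject, nesting them under subject in Error() asks for variance components that the data cannot supply.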

Another issue is that your model drops the "system" variable, which is called contrasts in your reshaped data. Each subject is measured under both levels of contrasts, though, so that factor does vary within subject. So perhaps something like the following might work (I have not thought through how many of the interaction terms are estimable):

analysis <- aov(repdat ~ contrasts * order * operator + Error(subject / contrasts),
                data = data.formatted)

As a simpler alternative, you might want to consider modeling the difference time1 - time2 from the original data, which measures the system effect within each subject, as a function of order and operator:

mod <- aov(I(time1 - time2) ~ order * operator, data = orig.data)

This would not work for more than two systems, but with two systems it has an additional benefit: you can try other measures of the effect, such as the ratio or log-ratio of the two times, which might better fit the assumptions of ANOVA. Differences in timing outcomes are often non-normal and heteroscedastic.
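A minimal sketch of the log-ratio variant, assuming only the 11 rows shown in the question are available. In that excerpt operator 1 appears only with order 2, so the interaction cannot be estimated and the sketch fits an additive model; with the full data the formula would presumably be order * operator as above. The column name logratio and the object name mod.lr are made up for this illustration.

```r
# The 11 rows shown in the question (the full data set has more rows).
orig.data <- data.frame(
  subject  = 1:11,
  time1    = c(1269, 795, 1866, 1263, 1602, 1359, 850, 1080, 568, 582, 1094),
  time2    = c(422, 327, 551, 382, 438, 423, 415, 370, 278, 308, 654),
  order    = factor(c(1, 2, 1, 2, 1, 2, 2, 1, 2, 2, 2)),
  operator = factor(c(2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 2))
)

# Per-subject system effect on the log scale:
# log(time1 / time2) > 0 means system 2 was faster for that subject.
orig.data$logratio <- log(orig.data$time1 / orig.data$time2)

# Does the size of the system effect depend on order or operator?
# Additive model only, because one order/operator cell is empty here.
mod.lr <- aov(logratio ~ order + operator, data = orig.data)
summary(mod.lr)

# Geometric mean of time1 / time2 across subjects, ignoring the design.
exp(mean(orig.data$logratio))
```

With unbalanced data the sequential sums of squares reported by summary() depend on the order of the terms, so it is worth refitting with operator + order and comparing the two tables.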
