Solved – How to report results from a two-way repeated measures ANOVA

anova, repeated measures, reporting, statistical significance

I have conducted an empirical study as part of my master's thesis. I developed a software tool in which cards can be sorted (an affinity diagram), and in the study each test subject performed a number of sorting tasks so that I could examine how a digital search feature affected completion time. I am wondering whether a two-way repeated measures ANOVA is appropriate for analyzing the data.

I had two independent variables:

  1. dataset size (10, 30 or 50 cards)
  2. digital search feature ("on" or "off").

This gives 6 tasks in total. In each task the subject had to find 10 cards, which was more or less difficult depending on the dataset size and on whether digital search was enabled or disabled.

I have two hypotheses that I want to validate:

  1. Users will be able to achieve significant performance increases in
    card sorting tasks when using digital search.
  2. Digital search is expected to play a small role when organizing a
    small set of cards. Digital search will therefore primarily
    outperform non-digital search when organizing large sets of cards.

It was a within-group study, with 14 participants (i.e. all subjects did all 6 tasks). The output is as follows (completion time in seconds):

|---------------No search--------------|----------------Search----------------|
|  10 cards  |  30 cards  |  50 cards  |  10 cards  |  30 cards  |  50 cards  |
|-----------------------------------------------------------------------------|
|     34     |    103     |     171    |     22     |     22     |     26     |
|     24     |     41     |      78    |     20     |     28     |     28     |
|     37     |     60     |     141    |     19     |     23     |     24     |
|     33     |     69     |     122    |     30     |     26     |     24     |
|     26     |     52     |     227    |     22     |     29     |     38     |
|     33     |     57     |     100    |     26     |     35     |     34     |
|     33     |     87     |     148    |     25     |     25     |     26     |
|     30     |     86     |     113    |     25     |     26     |     37     |
|     30     |    127     |     156    |     26     |     27     |     28     |
|     23     |     42     |     130    |     17     |     16     |     17     |
|     22     |     75     |     112    |     22     |     34     |     36     |
|     36     |     95     |     208    |     24     |     23     |     36     |
|     30     |     89     |     105    |     23     |     21     |     25     |
|     55     |    118     |     216    |     37     |     49     |     53     |
|-----------------------------------------------------------------------------|

I ran a two-way repeated measures ANOVA in MATLAB, using this MATLAB implementation, which gave the following output:

'Source'                           'SS'          'df'  'MS'          'F'         'p'         
'Dataset Size'                     [4.2174e+04]  [ 2]  [2.1087e+04]  [ 62.2973]  [1.2110e-10]
'Task type'                        [1.0926e+04]  [ 1]  [1.0926e+04]  [ 31.0805]  [8.9967e-05]
'Dataset Size x Task type'         [1.0708e+05]  [ 2]  [5.3539e+04]  [109.0535]  [2.2704e-13]
'Dataset Size x Subj'              [8.8008e+03]  [26]  [  338.4936]          []            []
'Task type x Subj'                 [4.5699e+03]  [13]  [  351.5311]          []            []
'Dataset Size x Task type x Subj'  [1.2765e+04]  [26]  [  490.9460]          []            []

My statistical knowledge is quite limited, so I am not sure if this significance test is relevant, or how to report the output.

I guess the following can be used, when arguing that both hypotheses are true?

  • F (2, 26) = 62.297, p < .001.
  • F (1, 13) = 31.080, p < .001.
  • F (2, 26) = 109.053, p < .001.

Bonus question: Is a paired-sample t-test appropriate for testing whether it took significantly longer to find 10 cards among 50 cards than among 10 cards when search was enabled (i.e. comparing the 4th and 6th columns)? I get a significant difference (t(13) = 3.76, p < 0.01) if I use a t-test.

Best Answer

Yes, those tests are relevant. The Type test asks whether the average No-Search time is the same as the average Search time. The Size test asks the same question about the average 10-, 30- and 50-card times. The Size x Type test asks whether the average difference between the No-Search and Search times is the same for all three numbers of cards. In each case, the p-value tells you how likely you would be to see differences as big as or bigger than those in your data if the true averages were equal. Extremely small p-values are interpreted as allowing you to reject the hypothesized equalities.
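As a rough cross-check of what the three tests compute, here is a minimal Python sketch of the sums-of-squares decomposition for a fully within-subject 2 x 3 design, using the times from the question. The function name `rm_anova_2x3` and the column order (no-search 10/30/50, then search 10/30/50) are my assumptions, and this is not the MATLAB implementation the question used. It reports F and degrees of freedom only; p-values would need an F distribution from a statistics package.

```python
# Completion times (s) from the question, one row per subject.
# Assumed column order: no-search 10/30/50 cards, then search 10/30/50 cards.
TIMES = [
    [34, 103, 171, 22, 22, 26], [24, 41, 78, 20, 28, 28],
    [37, 60, 141, 19, 23, 24], [33, 69, 122, 30, 26, 24],
    [26, 52, 227, 22, 29, 38], [33, 57, 100, 26, 35, 34],
    [33, 87, 148, 25, 25, 26], [30, 86, 113, 25, 26, 37],
    [30, 127, 156, 26, 27, 28], [23, 42, 130, 17, 16, 17],
    [22, 75, 112, 22, 34, 36], [36, 95, 208, 24, 23, 36],
    [30, 89, 105, 23, 21, 25], [55, 118, 216, 37, 49, 53],
]

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

def rm_anova_2x3(rows, a=2, b=3):
    """Two-way within-subject ANOVA: factor A (Type) has a levels, B (Size) has b."""
    n = len(rows)
    # y[s][i][j]: subject s, Type level i, Size level j
    y = [[[row[i * b + j] for j in range(b)] for i in range(a)] for row in rows]
    g = mean(v for row in rows for v in row)
    m_a = [mean(y[s][i][j] for s in range(n) for j in range(b)) for i in range(a)]
    m_b = [mean(y[s][i][j] for s in range(n) for i in range(a)) for j in range(b)]
    m_s = [mean(row) for row in rows]
    m_ab = [[mean(y[s][i][j] for s in range(n)) for j in range(b)] for i in range(a)]
    m_as = [[mean(y[s][i]) for i in range(a)] for s in range(n)]
    m_bs = [[mean(y[s][i][j] for i in range(a)) for j in range(b)] for s in range(n)]

    ss_a = n * b * sum((m - g) ** 2 for m in m_a)
    ss_b = n * a * sum((m - g) ** 2 for m in m_b)
    ss_ab = n * sum((m_ab[i][j] - m_a[i] - m_b[j] + g) ** 2
                    for i in range(a) for j in range(b))
    ss_s = a * b * sum((m - g) ** 2 for m in m_s)
    ss_as = b * sum((m_as[s][i] - m_a[i] - m_s[s] + g) ** 2
                    for s in range(n) for i in range(a))
    ss_bs = a * sum((m_bs[s][j] - m_b[j] - m_s[s] + g) ** 2
                    for s in range(n) for j in range(b))
    ss_total = sum((v - g) ** 2 for row in rows for v in row)
    # The three-way residual is whatever is left of the total.
    ss_abs = ss_total - ss_a - ss_b - ss_ab - ss_s - ss_as - ss_bs

    def effect(ss, df, ss_err, df_err):
        # Each effect is tested against its own effect-by-subject error term.
        return {"SS": ss, "df": (df, df_err), "F": (ss / df) / (ss_err / df_err)}

    return {
        "Type": effect(ss_a, a - 1, ss_as, (a - 1) * (n - 1)),
        "Size": effect(ss_b, b - 1, ss_bs, (b - 1) * (n - 1)),
        "Size x Type": effect(ss_ab, (a - 1) * (b - 1), ss_abs,
                              (a - 1) * (b - 1) * (n - 1)),
    }

anova = rm_anova_2x3(TIMES)
for name, res in anova.items():
    print(f"{name}: F({res['df'][0]}, {res['df'][1]}) = {res['F']:.2f}, SS = {res['SS']:.0f}")
```

The individual F values depend on which columns are assigned to which factor level. If they differ from the MATLAB output, it is worth checking how the data matrix and factor codes were passed to that function, which the question does not show.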

However, you've analyzed the wrong variable. For variables such as time-to-completion, longer times are generally more variable than shorter times. Your means and standard deviations correlate .997, which is too big to ignore and is characteristic of lognormally-distributed data, so you should analyze the logs of the times. The interpretation replaces differences by ratios: instead of saying "30 cards took 69 seconds longer than 10 cards", you say "30 cards took 3 times as long as 10 cards".
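To see the mean-versus-spread pattern for yourself, a small Python sketch along these lines tabulates the six condition means and standard deviations, correlates them, and repeats the summary on the log scale. The labels and the helper `pearson` are mine, and the column order is assumed to be the one in the question's table.

```python
import math
import statistics

# Completion times (s) from the question, one row per subject.
# Assumed column order: no-search 10/30/50 cards, then search 10/30/50 cards.
TIMES = [
    [34, 103, 171, 22, 22, 26], [24, 41, 78, 20, 28, 28],
    [37, 60, 141, 19, 23, 24], [33, 69, 122, 30, 26, 24],
    [26, 52, 227, 22, 29, 38], [33, 57, 100, 26, 35, 34],
    [33, 87, 148, 25, 25, 26], [30, 86, 113, 25, 26, 37],
    [30, 127, 156, 26, 27, 28], [23, 42, 130, 17, 16, 17],
    [22, 75, 112, 22, 34, 36], [36, 95, 208, 24, 23, 36],
    [30, 89, 105, 23, 21, 25], [55, 118, 216, 37, 49, 53],
]
LABELS = ["no search/10", "no search/30", "no search/50",
          "search/10", "search/30", "search/50"]

def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Raw scale: one mean and one SD per condition.
columns = list(zip(*TIMES))
means = [statistics.mean(c) for c in columns]
sds = [statistics.stdev(c) for c in columns]
r_raw = pearson(means, sds)

# Log scale: the SDs should be much more alike if the data are roughly lognormal.
log_columns = [[math.log(v) for v in c] for c in columns]
log_means = [statistics.mean(c) for c in log_columns]
log_sds = [statistics.stdev(c) for c in log_columns]

for label, m, s, lm, ls in zip(LABELS, means, sds, log_means, log_sds):
    print(f"{label:>13}: mean={m:6.1f} sd={s:5.1f} | "
          f"geometric mean={math.exp(lm):6.1f} sd(log)={ls:.2f}")
print(f"correlation of raw means and SDs: {r_raw:.3f}")

# A difference of log means back-transforms to a ratio of geometric means.
ratio_30_vs_10 = math.exp(log_means[1] - log_means[0])
print(f"no search, 30 vs 10 cards: {ratio_30_vs_10:.2f} times as long")
```

The same table of means and standard deviations is what you would report alongside the ANOVA results.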

Someone at your thesis oral may ask whether you adjusted for non-sphericity, which the analysis you reported does not do. The adjustment may affect the results of the Size and Size x Type tests. Google for Greenhouse-Geisser and Huynh-Feldt, and note that you are not interested in testing sphericity but in adjusting for it.
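For a feel of what the adjustment involves, here is a hedged Python sketch of the Greenhouse-Geisser epsilon, computed from the covariance matrix of the repeated measures with Box's formula. The function names are mine. For the Size x Type test I compute it on the no-search minus search differences, a shortcut that works because Type has only two levels. You would multiply both degrees of freedom of the F test by epsilon before looking up the p-value; a statistics package will normally do this for you.

```python
# Completion times (s) from the question, one row per subject.
# Assumed column order: no-search 10/30/50 cards, then search 10/30/50 cards.
TIMES = [
    [34, 103, 171, 22, 22, 26], [24, 41, 78, 20, 28, 28],
    [37, 60, 141, 19, 23, 24], [33, 69, 122, 30, 26, 24],
    [26, 52, 227, 22, 29, 38], [33, 57, 100, 26, 35, 34],
    [33, 87, 148, 25, 25, 26], [30, 86, 113, 25, 26, 37],
    [30, 127, 156, 26, 27, 28], [23, 42, 130, 17, 16, 17],
    [22, 75, 112, 22, 34, 36], [36, 95, 208, 24, 23, 36],
    [30, 89, 105, 23, 21, 25], [55, 118, 216, 37, 49, 53],
]

def covariance_matrix(cols):
    """Sample covariance matrix of equally long columns."""
    n, k = len(cols[0]), len(cols)
    m = [sum(c) / n for c in cols]
    return [[sum((cols[i][s] - m[i]) * (cols[j][s] - m[j]) for s in range(n)) / (n - 1)
             for j in range(k)] for i in range(k)]

def gg_epsilon(cols):
    """Greenhouse-Geisser epsilon from the covariance matrix (Box's formula)."""
    s = covariance_matrix(cols)
    k = len(s)
    mean_diag = sum(s[i][i] for i in range(k)) / k
    row_means = [sum(row) / k for row in s]
    grand = sum(row_means) / k
    num = (k * (mean_diag - grand)) ** 2
    den = (k - 1) * (sum(v * v for row in s for v in row)
                     - 2 * k * sum(r * r for r in row_means)
                     + k ** 2 * grand ** 2)
    return num / den

# Size effect: average each subject's two search conditions at each size.
size_cols = [[(row[j] + row[j + 3]) / 2 for row in TIMES] for j in range(3)]
# Size x Type: no-search minus search difference at each size.
diff_cols = [[row[j] - row[j + 3] for row in TIMES] for j in range(3)]

eps_size = gg_epsilon(size_cols)
eps_inter = gg_epsilon(diff_cols)
for name, eps in [("Size", eps_size), ("Size x Type", eps_inter)]:
    print(f"{name}: epsilon = {eps:.3f}, adjusted df = ({2 * eps:.2f}, {26 * eps:.2f})")
```

With three Size levels, epsilon lies between 0.5 and 1, and a value of 1 means no adjustment is needed. The Type test has a single degree of freedom and is not affected.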

Yes, paired-sample t-tests are appropriate. The Type test is actually just a paired t on each subject's total over the three Size conditions for each Search condition.
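Both points can be checked with a few lines of Python. This is a sketch only: `paired_t` is my own helper, the column order is assumed to match the question's table, and it returns the t statistic and degrees of freedom without a p-value.

```python
import math
import statistics

# Completion times (s) from the question, one row per subject.
# Assumed column order: no-search 10/30/50 cards, then search 10/30/50 cards.
TIMES = [
    [34, 103, 171, 22, 22, 26], [24, 41, 78, 20, 28, 28],
    [37, 60, 141, 19, 23, 24], [33, 69, 122, 30, 26, 24],
    [26, 52, 227, 22, 29, 38], [33, 57, 100, 26, 35, 34],
    [33, 87, 148, 25, 25, 26], [30, 86, 113, 25, 26, 37],
    [30, 127, 156, 26, 27, 28], [23, 42, 130, 17, 16, 17],
    [22, 75, 112, 22, 34, 36], [36, 95, 208, 24, 23, 36],
    [30, 89, 105, 23, 21, 25], [55, 118, 216, 37, 49, 53],
]

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for after minus before."""
    d = [a - b for b, a in zip(before, after)]
    se = statistics.stdev(d) / math.sqrt(len(d))
    return statistics.mean(d) / se, len(d) - 1

# Bonus question: search enabled, 50 cards versus 10 cards (4th and 6th columns).
search_10 = [row[3] for row in TIMES]
search_50 = [row[5] for row in TIMES]
t_cols, df_cols = paired_t(search_10, search_50)
print(f"search, 50 vs 10 cards: t({df_cols}) = {t_cols:.2f}")

# Type effect: paired t on each subject's totals over the three sizes.
no_search_total = [sum(row[:3]) for row in TIMES]
search_total = [sum(row[3:]) for row in TIMES]
t_type, df_type = paired_t(search_total, no_search_total)
print(f"Type: t({df_type}) = {t_type:.2f}, t^2 = {t_type ** 2:.2f} "
      f"(the F on 1 and {df_type} df)")
```

The first line reproduces the t(13) = 3.76 reported in the question. The square of the second t is the F for the Type test in a two-way repeated measures ANOVA of the same data.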

And a comment: When I was a grad student, the head of my department used to say that no one would ever get a degree while he was in charge if they reported anova results without first reporting the means and standard deviations. I imagine that if he were alive today he would probably want plots, as well. They can be remarkably informative.
