Solved – Might be an unbalanced within subjects repeated measures

repeated measures, unbalanced-classes, variance

I ran a within subjects repeated measures experiment, where the independent variable had 3 levels. The dependent variable is a measure of correctness and is recorded as either correct / incorrect. Time taken to provide an answer was also recorded.

I used a within-subjects repeated measures ANOVA to establish whether there are significant differences in correctness (DV) between the 3 levels of the IV, and there are. Now I'd like to analyze whether there are significant differences in the time taken to provide the answers when the answers are 1) correct and 2) incorrect.

My problem is: Across the levels there are different numbers of correct / incorrect answers, e.g. level 1 has 67 correct answers, level 2 has 30, level 3 has 25.

How can I compare the time taken for all correct answers across the 3 levels? I think this means it's unbalanced? Can I do 3 one-way ANOVAs as pairwise comparisons, while adjusting the p threshold downwards to account for each comparison?

Thanks

Best Answer

It's not imbalanced, because your repeated measures should be averaged within each subject and condition beforehand. The only thing that is imbalanced is the quality of the estimates of your means: cells with fewer responses give noisier means.

Just as you aggregated your accuracies into a percentage correct for the first ANOVA, you average your latencies as well. Each participant provides 6 values (3 levels x 2 correctness outcomes), so the design is not imbalanced.
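As a minimal sketch of that aggregation step, the snippet below collapses trial-level latencies into one mean per subject, level and correctness cell. The records in `trials` and the helper name `cell_means` are made up for illustration; they are not from the original experiment.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial-level records: (subject, level, correct, latency in seconds)
trials = [
    ("s1", 1, True, 0.8), ("s1", 1, True, 1.0), ("s1", 1, False, 1.5),
    ("s1", 2, True, 1.2), ("s1", 2, False, 1.6), ("s1", 2, False, 2.0),
    ("s1", 3, True, 1.4), ("s1", 3, False, 1.8),
    ("s2", 1, True, 0.6), ("s2", 1, False, 1.1),
    ("s2", 2, True, 0.9), ("s2", 2, True, 1.1), ("s2", 2, False, 1.3),
    ("s2", 3, True, 1.0), ("s2", 3, False, 1.7), ("s2", 3, False, 1.9),
]

def cell_means(trials):
    """Average latency within each (subject, level, correct) cell."""
    cells = defaultdict(list)
    for subject, level, correct, latency in trials:
        cells[(subject, level, correct)].append(latency)
    return {key: mean(values) for key, values in cells.items()}

means = cell_means(trials)

# Count how many cell means each subject contributes to the ANOVA.
per_subject = defaultdict(int)
for subject, level, correct in means:
    per_subject[subject] += 1

print(dict(per_subject))
```

Each subject ends up with the same number of cell means regardless of how many trials went into each one. A participant with no responses at all in some cell (for example, no errors at one level) would have no value there, and that is the case to watch for.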

Most likely, though, the ANOVA was not the best analysis in the first place. You should probably be using mixed-effects modelling. For the initial test of the accuracies you'd use mixed-effects logistic regression. For the second analysis you propose, it would be a 3 (level) x 2 (correctness) analysis of the latencies. Both would have subjects as a random effect.
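One way to write those two models down, as a sketch rather than the definitive specification: here i indexes subjects, j the 3 levels, and k trials, and c_{ijk} is 1 for a correct response and 0 otherwise. The random-intercept-only structure and the log transform of the latency T are assumptions made for illustration.

```latex
\begin{align*}
\operatorname{logit} \Pr(c_{ijk} = 1) &= \beta_0 + \beta_j + u_i,
  & u_i &\sim \mathcal{N}(0, \sigma_u^2) \\
\log T_{ijk} &= \gamma_0 + \gamma_j + \delta\, c_{ijk} + \eta_j\, c_{ijk} + v_i + \varepsilon_{ijk},
  & v_i &\sim \mathcal{N}(0, \sigma_v^2),\quad
    \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

The \eta_j terms carry the level x correctness interaction, and u_i and v_i are the per-subject random effects. Because these models are fitted to individual trials, unequal numbers of correct and incorrect responses per cell are handled without averaging first.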

In addition, it's often best to apply some sort of normalizing transformation to the times, such as a log or -1/T transformation. This is less of a concern in ANOVA because you aggregate a number of trials into means first, and that often ameliorates the skew of latencies through the central limit theorem. You could check with a Box-Cox analysis to see what fits best.
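To illustrate the effect of those transformations, here is a small sketch that compares the sample skewness of raw, log, and -1/T latencies. The `latencies` values are made up, and `skewness` is a helper written for this example (the population moment coefficient of skewness), not a library function.

```python
import math
from statistics import mean, pstdev

def skewness(values):
    """Moment coefficient of skewness: mean of cubed z-scores."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

# Hypothetical right-skewed response times in seconds.
latencies = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 1.0, 1.5, 3.0]

log_t = [math.log(t) for t in latencies]       # log transform
neg_inv_t = [-1.0 / t for t in latencies]      # -1/T transform

for name, vals in [("raw", latencies), ("log", log_t), ("-1/T", neg_inv_t)]:
    print(f"{name:>5}: skewness = {skewness(vals):.2f}")
```

For these made-up numbers both transformations reduce the skew, with -1/T reducing it most. Up to shifting and scaling, the log and -1/T transformations correspond to Box-Cox with lambda = 0 and lambda = -1, so a Box-Cox fit (for instance `scipy.stats.boxcox`, if SciPy is available) is a way of choosing between them from the data.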

On a more important note, though: what are you expecting to find? Is this just exploratory? What would it mean to have different latencies in the correct and incorrect groups, and what would it mean for them to interact? Unless you are fully modelling the relationship between accuracy and speed in your experiment, or you have a full model that you are testing, you are probably wasting your time. A latency with an incorrect response means that someone did something other than what you wanted them to, and it could be anything. That's why people almost always work only with the latencies of the correct responses.

(These two types of responses also often have very different distributions, with the incorrect responses much flatter because they disproportionately make up both the short and long latencies.)
