What statistical test is appropriate for paired data where the same subjects are tested at multiple times?

experiment-design, paired-comparisons, paired-data, t-test

In the test in question, we have measurements of discomfort for the same subjects across two different virtual reality experiences. We took multiple measurements of discomfort over the course of each experience. We want to test whether the difference in discomfort between the two experiences is significant.

In effect, we have a 3D table of data where the dimensions are the different experiences, the subjects that volunteered, and the measurements over time.
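To make that layout concrete, here is a minimal sketch of such a table as a NumPy array, plus a flattened "long" version with one row per cell. The sizes (2 experiences, 12 subjects, 5 time points) and the random ratings are made up for illustration and are not the real data.

```python
import numpy as np

# Hypothetical sizes: 2 experiences, 12 subjects, 5 time points.
n_experiences, n_subjects, n_times = 2, 12, 5

rng = np.random.default_rng(0)
# Placeholder data only: random integer ratings from 1 to 5.
discomfort = rng.integers(1, 6, size=(n_experiences, n_subjects, n_times))

# Flatten to "long" format: one row per (experience, subject, time) cell,
# with columns [experience index, subject index, time index, rating].
exp_idx, subj_idx, time_idx = np.indices(discomfort.shape)
long_rows = np.column_stack(
    [exp_idx.ravel(), subj_idx.ravel(), time_idx.ravel(), discomfort.ravel()]
)

print(discomfort.shape)  # one axis per dimension of the table
print(long_rows[:3])     # first few long-format rows
```

Most statistics packages expect the long layout, so it is usually worth converting early.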

Measurements from the same subject at the same time but in different experiences are paired and would be appropriate for a paired t-test. However, measurements from the same subject at different times are highly correlated, so (if I understand the paired t-test properly) it would be inappropriate to pool them all into a single paired t-test.

Is there some other version of a t-test that returns a significance value but can deal with data like this? Is the best option just paired t-tests per time slice and "combining" significance values by some statistical method?
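For reference, the per-time-slice idea could be sketched roughly as below. This assumes a Bonferroni adjustment as the "combining" step, which is my own choice for the illustration: it stays valid when the slices are correlated, but it gives one adjusted p-value per slice rather than a single overall value. The data are simulated placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_subjects, n_times = 12, 5

# Placeholder data: rows are subjects, columns are time points.
# A shared per-subject offset makes the two experiences correlated within a subject.
subject_offset = rng.normal(0.0, 1.0, size=(n_subjects, 1))
exp_a = 2.0 + subject_offset + rng.normal(0.0, 0.5, size=(n_subjects, n_times))
exp_b = 2.5 + subject_offset + rng.normal(0.0, 0.5, size=(n_subjects, n_times))

# One paired t-test per time slice (axis=0 runs the test down each column).
t_stats, p_values = stats.ttest_rel(exp_a, exp_b, axis=0)

# Bonferroni adjustment: multiply by the number of tests, cap at 1.
p_adjusted = np.minimum(p_values * n_times, 1.0)

for t_idx, (p_raw, p_adj) in enumerate(zip(p_values, p_adjusted)):
    print(f"time {t_idx}: raw p = {p_raw:.4f}, Bonferroni p = {p_adj:.4f}")
```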

Best Answer

If I understand your setup, this experiment is as follows:

  • 2 independent variables (VR scenario experienced, time)
    • VR scenario, two levels (A and B)
    • Time (start, 5 minutes in, 10 minutes in, 15 minutes in, end, etc)
  • 1 dependent variable (measure of discomfort) with multiple measurements taken per subject over time (repeated measures).

The first important question is the nature of your measure of discomfort. I assume you'll use something like a Likert-type item ("How much discomfort do you feel on a scale from 1 to 5, with 1 being mild/no discomfort and 5 being extreme discomfort?"). Strictly speaking, a single item like this is ordinal, but in practice it is usually treated as interval data so that parametric tests can be applied. This is the most common (and usually the most useful) method, so we'll pretend that's what you had in mind.

You'll also of course have two possible ways to set up the human participants: 1) have every participant experience both conditions (you'll probably want to counterbalance the order of the experimental conditions), or 2) have each participant experience only one VR condition (either Scene A or Scene B, not both). You can do either one, but generally you'll need fewer participants if you go with option 1, as there will be less variance due to between-person factors (your experience of "carrots vs celery" will be more similar than "your experience of carrots" vs "my experience of celery", after all). You can use option 2 if necessary; the analysis stays broadly similar, except that VR scenario then becomes a between-subjects factor (a mixed-design ANOVA). Option 2 is generally avoided unless you have a good reason (for example, learning effects that are just too great).
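As a rough illustration of counterbalancing under option 1, the sketch below shuffles a hypothetical list of participant IDs and alternates the two possible orders, so half of the participants see A first and half see B first. The IDs, the sample size, and the seed are all made up.

```python
import random

# Hypothetical participant IDs.
participants = [f"P{i:02d}" for i in range(1, 13)]

random.seed(42)  # fixed seed so the assignment is reproducible
random.shuffle(participants)

# Alternate the two possible orders down the shuffled list,
# so each order is used by half of the participants.
orders = [("A", "B"), ("B", "A")]
assignment = {p: orders[i % 2] for i, p in enumerate(participants)}

for participant in sorted(assignment):
    first, second = assignment[participant]
    print(f"{participant}: {first} then {second}")
```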

If I've described your experimental scenario accurately, the most common test used for this is a two-way repeated measures ANOVA. This will allow you to determine first whether there is any statistically significant effect of VR scenario, of time, or of their interaction (taking care of the issues you'd have by running repeated t-tests), and then the post-hoc tests will allow you to identify just which conditions differ from each other. If you decided to test only one VR scenario, then you'd use a one-way repeated measures ANOVA instead.
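To show what the two-way repeated measures ANOVA actually computes, here is a hand-rolled sketch in NumPy/SciPy for a fully within-subject design with one observation per cell. It is only an illustration under those assumptions: the function name and the simulated data are made up, no sphericity correction is applied, and in practice you would normally rely on a statistics package (SPSS, or for example statsmodels' `AnovaRM` in Python) rather than code like this.

```python
import numpy as np
from scipy import stats


def two_way_rm_anova(y):
    """Two-way repeated measures ANOVA for a fully within-subject design.

    y has shape (subjects, levels of factor A, levels of factor B),
    with exactly one observation per cell. Returns {effect: (F, df1, df2, p)}.
    """
    n, a, b = y.shape
    grand = y.mean()
    # Marginal means, kept broadcastable against y.
    m_s = y.mean(axis=(1, 2), keepdims=True)   # subject
    m_a = y.mean(axis=(0, 2), keepdims=True)   # factor A
    m_b = y.mean(axis=(0, 1), keepdims=True)   # factor B
    m_sa = y.mean(axis=2, keepdims=True)       # subject x A
    m_sb = y.mean(axis=1, keepdims=True)       # subject x B
    m_ab = y.mean(axis=0, keepdims=True)       # A x B

    def ss(term):
        # Sum of squares of a term, expanded over every cell of y.
        return float((np.broadcast_to(term, y.shape) ** 2).sum())

    # Each effect is tested against its own effect-by-subject error term.
    effects = {
        "A": (ss(m_a - grand), ss(m_sa - m_s - m_a + grand), a - 1),
        "B": (ss(m_b - grand), ss(m_sb - m_s - m_b + grand), b - 1),
        "A:B": (
            ss(m_ab - m_a - m_b + grand),
            ss(y - m_sa - m_sb - m_ab + m_s + m_a + m_b - grand),
            (a - 1) * (b - 1),
        ),
    }
    table = {}
    for name, (ss_effect, ss_error, df1) in effects.items():
        df2 = df1 * (n - 1)
        f_value = (ss_effect / df1) / (ss_error / df2)
        table[name] = (f_value, df1, df2, float(stats.f.sf(f_value, df1, df2)))
    return table


rng = np.random.default_rng(2)
# Placeholder data: 12 subjects x 2 scenarios (A) x 5 time points (B).
y = (
    rng.normal(0.0, 1.0, size=(12, 1, 1))        # per-subject baseline
    + np.array([0.0, 0.5]).reshape(1, 2, 1)      # simulated scenario effect
    + rng.normal(0.0, 0.5, size=(12, 2, 5))      # noise
)
for effect, (f_value, df1, df2, p) in two_way_rm_anova(y).items():
    print(f"{effect}: F({df1}, {df2}) = {f_value:.2f}, p = {p:.4f}")
```

With only two scenarios, the F statistic for the scenario effect equals the squared t statistic from a paired t-test on each subject's time-averaged scores, which is a handy sanity check.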

You might also reasonably ask a question like, "we also wonder how excited the participants feel", in which case you'd have participants respond to both a measure of discomfort and a measure of excitement. In this case you'd be adding a dependent variable, excitement, and this would change the test you'd need. In such a case what you'd likely want is a two-way repeated measures MANOVA if you kept both VR conditions, or if you dropped to one you'd just want a (one-way) repeated measures MANOVA. The more questions you ask (dependent variables) the more power you lose, so make sure you actually care about all the measures and don't just add them in willy-nilly.

Someone might be tempted to include one more independent variable, but generally I strongly warn you to avoid that temptation unless you have a lot of experience with such a beast, as heaven forbid you end up with a 3-way interaction of variables and need to interpret what is going on in a sensible fashion. It can get really messy and end up muddying the waters rather than clarifying them.

You'll naturally want to make note of all the assumptions of your chosen test, and SPSS will help you test the assumptions as well. These sorts of tests are very common in areas like HCI and cognitive psychology, and are not at all exotic. There are surely other approaches that could be used, but these are the classic approaches which are popularly published in these fields.
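If you are not working in SPSS, one of these checks could be sketched in Python as below. This is only an example of a single assumption (approximate normality of the per-subject scenario differences, tested with Shapiro-Wilk); the data are simulated placeholders, and other assumptions such as sphericity are not covered here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Placeholder data: 12 subjects x 2 scenarios x 5 time points.
y = rng.normal(2.5, 1.0, size=(12, 2, 5))

# Per-subject difference between the two scenarios, averaged over time.
diff = y[:, 1, :].mean(axis=1) - y[:, 0, :].mean(axis=1)

# Shapiro-Wilk test: a small p-value suggests the differences are not normal.
w_stat, p_normal = stats.shapiro(diff)
print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_normal:.3f}")
```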
