Paired t-test with multiple observations per pair

paired-data, r, repeated-measures, t-test, wilcoxon-signed-rank

I have a set of websites, websiteA … websiteZ for example, and an optimization technique that may improve their performance.
I want to check whether this technique significantly affects performance. So my independent variable is apply_technique, with the treatments enabled and disabled, and my dependent variable is the measured performance.

So for each website I measure the performance with and without the optimization technique applied. So I get results like this:

website  | technique disabled | technique enabled
websiteA | 20 seconds         | 17 seconds
.....
websiteZ | 45 seconds         | 39 seconds

etc.

However, to account for possible fluctuations, I measured each (website, treatment) combination 5 times, so for websiteA I have 5 measurements with the technique disabled and 5 with the technique enabled.
This then results in 26 websites * 2 treatments * 5 repetitions = 260 measurements.

My question is: when I want to do a paired t-test, do I first need to average the performance over these 5 trials or not? I might lose some information when I average, right?

Could I also decide not to average them? Then I would have 130 technique_disabled observations and 130 technique_enabled observations and would simply use these in a paired t-test. Would that be acceptable?

EDIT: I see that the performance differences (technique_enabled - technique_disabled) are not normally distributed, so I will probably use a Wilcoxon signed-rank test. The same question applies here, however: should I average the observations over the 5 trials or not?

Best Answer

Averaging the data will result in a loss of information and statistical power, so it is best avoided.

Since you have repeated measures for each website, you can account for the differences between websites by fitting random intercepts for website ID in a regression model (a mixed effects regression model). Equivalently, this accounts for the non-independence of observations within each website, since observations on one website are more likely to be similar to each other than to observations on other websites. This would look something like:

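 performance ~ treatment + (1 | website_ID)

where performance is the measured time and treatment is the enabled/disabled factor (apply_technique in the question). You could fit such a model in R using lmer from the lme4 package. This way you will make maximal use of the data.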
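
As a rough illustration of what fitting such a model could look like, here is a minimal sketch on simulated data. The data frame design, the column names performance, treatment and website_ID, and all the numbers (baseline times, a 3-second speed-up, the noise level) are assumptions made up for this example, not your measurements.

```r
# Minimal sketch on simulated data; all names and numbers are hypothetical.
library(lme4)

set.seed(1)
websites <- paste0("website", LETTERS)   # websiteA ... websiteZ

# One row per (website, treatment, repetition): 26 * 2 * 5 = 260 rows.
design <- expand.grid(
  repetition = 1:5,
  treatment  = c("disabled", "enabled"),
  website_ID = websites,
  stringsAsFactors = TRUE
)

# Simulated times: a website-specific baseline, an assumed 3-second
# speed-up when the technique is enabled, plus measurement noise.
baseline <- rnorm(length(websites), mean = 30, sd = 8)
names(baseline) <- websites
design$performance <- unname(baseline[as.character(design$website_ID)]) -
  3 * (design$treatment == "enabled") +
  rnorm(nrow(design), sd = 2)

# Fixed effect for the treatment, random intercept for each website.
fit <- lmer(performance ~ treatment + (1 | website_ID), data = design)

summary(fit)                                    # 'treatmentenabled' row is the estimated effect
confint(fit, parm = "beta_", method = "Wald")   # Wald intervals for the fixed effects
```

With your own data you would replace the simulated design with a long-format data frame holding one row per measurement (260 rows here). Note that lme4 does not print p-values for the fixed effects, so the confidence interval for treatmentenabled is the more direct thing to read off.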
