I have a bunch of websites, websiteA ... websiteZ for example. I also have an optimization technique that possibly improves the performance of these websites.
I want to check whether this technique actually has a significant effect on performance. So my independent variable is apply_technique, with the treatments enabled and disabled, and my dependent variable is the measured performance.
For each website I measure the performance with and without the optimization technique applied, so I get results like this:
website  | technique disabled | technique enabled
websiteA | 20 seconds         | 17 seconds
...      | ...                | ...
websiteZ | 45 seconds         | 39 seconds
However, to account for possible fluctuations, I measured each (website, treatment) combination 5 times, so for websiteA I have 5 measurements with the technique disabled and 5 with the technique enabled.
This then results in 26 websites * 2 treatments * 5 repetitions = 260 measurements.
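For concreteness, the design can be laid out in long format with one row per measurement. This is only a sketch: the field names (website, technique, repetition, seconds) are assumptions, and the seconds field is left empty rather than filled with invented numbers.

```python
from itertools import product
from string import ascii_uppercase

# websiteA .. websiteZ, as in the description above
websites = [f"website{letter}" for letter in ascii_uppercase]
treatments = ["disabled", "enabled"]
repetitions = range(1, 6)  # 5 repeated measurements per combination

# One row per (website, treatment, repetition).
# "seconds" is a placeholder to be filled in by the actual benchmark.
rows = [
    {"website": w, "technique": t, "repetition": r, "seconds": None}
    for w, t, r in product(websites, treatments, repetitions)
]

print(len(rows))  # 26 * 2 * 5 measurements
```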
My question is: when I want to do a paired t-test, do I first need to average the performance over these 5 trials or not? I might lose some information when I average, right?
Could I also decide not to average them? I would then have 130 technique_disabled observations and 130 technique_enabled observations, and I would simply use these to do a paired t-test. Would that be acceptable?
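To make the averaging option concrete, here is a minimal sketch of what it would compute. The run times are made up for illustration (only the 20/17 and 45/39 second averages come from the table above, and websiteB is hypothetical). Each website contributes one averaged difference, so the paired test runs on as many pairs as there are websites.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up run times in seconds, for illustration only.
measurements = {
    "websiteA": {"disabled": [20, 21, 19, 20, 20], "enabled": [17, 18, 16, 17, 17]},
    "websiteB": {"disabled": [30, 31, 29, 30, 30], "enabled": [26, 27, 25, 26, 26]},
    "websiteZ": {"disabled": [45, 46, 44, 45, 45], "enabled": [39, 40, 38, 39, 39]},
}

# Average the 5 repetitions, then take one difference per website.
diffs = [
    mean(runs["enabled"]) - mean(runs["disabled"])
    for runs in measurements.values()
]

# Paired t statistic on the per-website differences (n - 1 degrees of freedom).
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

print(n, diffs, t_stat)
```

With the real data this would give 26 differences, not 130. Turning the statistic into a p-value needs the t distribution, which a statistics library would provide.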
EDIT: I see that the performance differences (technique_enabled - technique_disabled) are not normally distributed, so I will probably use a Wilcoxon signed-rank test. The same question applies here, though: should I average the observations over the 5 trials or not?
Best Answer
Averaging the data will result in a loss of information and statistical power, so it is best avoided.
Since you have repeated measures for websites, you can account for the differences between websites (or equivalently, for the non-independence of observations within each website, since observations on one website are more likely to be similar to each other than to observations on other websites) by fitting random intercepts for website ID in a regression model (a mixed effects regression model). In lme4 formula notation this would look something like performance ~ apply_technique + (1 | website), where the column names are placeholders for your own,
and you could fit such a model using lmer from the lme4 package. This way you will make maximal use of the data.
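To illustrate what accounting for website differences does while keeping every observation, here is a rough sketch in plain Python. It is not lmer and not a full mixed model: it is the simpler within-website estimator, which subtracts each website's own mean before comparing treatments. For a balanced design like this one (the same number of runs in every cell), it should give the same treatment estimate as a random-intercept model. The run times are made up, websiteB is hypothetical, and the standard error assumes the technique has the same effect on every website.

```python
from math import sqrt
from statistics import mean

# Made-up run times in seconds, for illustration only.
measurements = {
    "websiteA": {"disabled": [20, 21, 19, 20, 20], "enabled": [17, 18, 16, 17, 17]},
    "websiteB": {"disabled": [30, 31, 29, 30, 30], "enabled": [26, 27, 25, 26, 26]},
    "websiteZ": {"disabled": [45, 46, 44, 45, 45], "enabled": [39, 40, 38, 39, 39]},
}

# Long format: one (website, technique, seconds) row per measurement.
rows = [
    (site, technique, seconds)
    for site, by_technique in measurements.items()
    for technique, runs in by_technique.items()
    for seconds in runs
]

# Subtract each website's own mean, which removes its baseline level.
site_mean = {
    site: mean(s for w, _, s in rows if w == site) for site in measurements
}
centered = [(technique, seconds - site_mean[site]) for site, technique, seconds in rows]

enabled = [value for technique, value in centered if technique == "enabled"]
disabled = [value for technique, value in centered if technique == "disabled"]
effect = mean(enabled) - mean(disabled)  # estimated change in seconds

# Residual variance after removing website means and the technique effect.
group_mean = {"enabled": mean(enabled), "disabled": mean(disabled)}
residual_ss = sum((value - group_mean[technique]) ** 2 for technique, value in centered)
residual_df = len(rows) - len(measurements) - 1
std_error = sqrt(residual_ss / residual_df * (1 / len(enabled) + 1 / len(disabled)))

print(effect, std_error, effect / std_error)
```

Unlike the averaged analysis, all individual runs feed into the residual variance here, which is where the extra information comes from.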