# Solved – bias of peeking at AB test data and adjusting minimum detectable effect

ab-test, statistical-significance

Let's say we're running an A/B test on my website comparing blue button clicks (baseline) to green button clicks.

• I use http://www.evanmiller.org/ab-testing/sample-size.html to calculate my required number of subjects per branch with the following parameters:

• significance level of 5%
• statistical power of 80%
• an observed historical baseline conversion rate of 5%
• a desired minimum detectable effect of 1% absolute (i.e., conversion rates between 4% and 6% will be indistinguishable from the baseline)

Using the calculator, I determine that we need 7,663 pageviews to declare a result.
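For anyone who wants to check that number without the web page, here is a minimal sketch of a normal-approximation sample-size formula for two proportions. That the calculator uses exactly this formula is my assumption, and the function name `subjects_per_branch` and its parameter names are made up for illustration. The effect is treated as absolute, so a 1% MDE on a 5% baseline means 6%.

```python
from math import sqrt
from statistics import NormalDist


def subjects_per_branch(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate subjects needed per branch (two-sided test, absolute MDE).

    Assumed formula: the null-hypothesis spread uses the baseline rate in
    both branches; the alternative uses baseline and baseline + mde_abs.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    treated = baseline + mde_abs
    sd_null = sqrt(2 * baseline * (1 - baseline))
    sd_alt = sqrt(baseline * (1 - baseline) + treated * (1 - treated))
    return (z_alpha * sd_null + z_beta * sd_alt) ** 2 / mde_abs ** 2


# Parameters from the question: 5% baseline, 5% significance, 80% power.
n_full = round(subjects_per_branch(0.05, 0.01))   # 1% MDE
n_early = round(subjects_per_branch(0.05, 0.03))  # 3% MDE, used in the plan
print(n_full, n_early)
```

If the assumption about the formula holds, the two printed values should land within a subject or so of the figures quoted from the calculator.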

Now let's say everyone gets impatient and decides to check in on the experiment after only 900 pageviews.

The Game Plan:

1) If it turns out that the green button is at least 3% better than baseline, we will decide to conclude the experiment and declare the green button as the winner (a 3% MDE given the same other initial parameters requires only 894 pageviews according to the calculator).

2) If it turns out that the green button is less than 3% better than baseline after 900 pageviews, we will decide to keep the experiment running to its full course of 7,663 pageviews and then make a conclusion at that time.

Are we introducing bias with this Game Plan?

Setting your stopping condition based on the significance of your interim analyses is, in general, not a great idea. The worst possible thing you could do would be to re-run your analysis after every page view and stop as soon as you got a significant result. By setting $\alpha=0.05$, you've decided that you're willing to tolerate a 5 percent chance of making a Type I error (i.e., claiming there's an effect even though there actually is not). This repeated "peeking" inflates that rate more than 5-fold, so the chance of declaring a winner when there is no real difference is actually more than 1 in 4, rather than 1 in 20. That is clearly bad. By peeking only once instead, you're not doing nearly as badly, of course.
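To get a feel for the size of the problem, here is a rough simulation sketch of A/A tests (both buttons convert at the same 5% rate, so every "significant" result is a false positive). Everything in it is an illustrative assumption rather than your exact setup: the helper names, the pooled two-proportion z-test, the 2,000 subjects per branch (shortened from 7,663 to keep it fast), and the look schedules. An experiment counts as a false positive if any scheduled look has $p < \alpha$.

```python
import random
from statistics import NormalDist


def z_test_p_value(conv_a, conv_b, n):
    """Two-sided pooled two-proportion z-test, n subjects in each branch."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation at all, nothing to test
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    z = (conv_b - conv_a) / n / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(looks, n_experiments=400, rate=0.05, alpha=0.05, seed=0):
    """Share of A/A experiments that look significant at any scheduled look."""
    rng = random.Random(seed)  # same seed gives the same data for every schedule
    looks = sorted(looks)
    false_positives = 0
    for _ in range(n_experiments):
        conv_a = conv_b = 0
        seen = 0
        rejected = False
        for n in looks:
            # Collect subjects until each branch has n of them.
            for _ in range(n - seen):
                conv_a += rng.random() < rate
                conv_b += rng.random() < rate
            seen = n
            if z_test_p_value(conv_a, conv_b, n) < alpha:
                rejected = True
        false_positives += rejected
    return false_positives / n_experiments


single = false_positive_rate([2000])                      # one final analysis
one_peek = false_positive_rate([1000, 2000])              # one interim look
many_peeks = false_positive_rate(range(100, 2001, 100))   # 20 looks
print(single, one_peek, many_peeks)
```

Because the three schedules are nested and reuse the same simulated data, the rates can only go up as looks are added. How far they go up depends on the number and timing of the looks, so treat the printed values as a qualitative illustration, not as the exact inflation for your plan.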