Solved – AB Testing vs Hypothesis Testing

ab-test, hypothesis-testing

I'm having trouble making the full connection between AB Testing and Hypothesis testing.

Hypothesis Testing
To conduct a test comparing two means, we need the two sample means and the standard deviation of each sample.
H0: Mean of sample one equals mean of sample two.

My understanding of the intuition is that we are testing how likely it is that the results are just coincidence (assuming H0 is true). Therefore, the more extreme the results, the more likely we are to reject the null.
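To make that intuition concrete, here is a minimal sketch of a two-sided, large-sample z-test on summary statistics. It is an illustration rather than a definitive recipe: the function name `two_sample_z_test` and the example numbers are made up, the sample sizes are an extra input not listed above, and with small samples a t-test would be the more usual choice.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample two-sided z-test of H0: mean1 == mean2.

    Hypothetical helper for illustration; uses the normal approximation,
    so it assumes both samples are reasonably large.
    """
    # Standard error of the difference between the two sample means
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # How many standard errors apart the two means are
    z = (mean1 - mean2) / se
    # Probability of a difference at least this extreme if H0 is true
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up summary statistics for two samples of 100 observations each
z, p = two_sample_z_test(10.0, 2.0, 100, 10.5, 2.0, 100)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The larger the gap between the means relative to the standard error, the larger |z| and the smaller the p-value, which is the "more extreme results" idea above.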

AB Testing
Here we are asked for the Minimum Detectable Effect, Baseline Conversion Rate, significance level and statistical power. I think MDE is setting up a confidence interval but I don't really understand.
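For reference, those four inputs are what a standard sample-size calculation for comparing two proportions consumes. The sketch below uses the common normal-approximation formula and assumes the MDE is an absolute difference in conversion rate; the function name `sample_size_per_group` and the example values are hypothetical, and real A/B testing calculators may use a slightly different formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline, mde, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test.

    Assumes `mde` is an ABSOLUTE lift (e.g. 0.02 means 10% -> 12%).
    Hypothetical helper based on the normal approximation.
    """
    p1 = baseline
    p2 = baseline + mde
    # Critical value for the significance level (two-sided)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the two groups
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / mde ** 2
    return ceil(n)


# Made-up example: 10% baseline conversion, detect an absolute lift of 2 points
print(sample_size_per_group(baseline=0.10, mde=0.02))
```

Read this way, the MDE is the smallest true difference the eventual hypothesis test should be able to detect with the stated power; shrinking it increases the required sample size.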

My question is, how do these two tests relate to each other? Is an AB Test truly just a test of means under the hood?

Best Answer

A/B testing is exactly the same as a Randomised Controlled Trial (RCT). All the methodology transfers directly from one field to the other; A/B testing is simply the more prevalent term in IT, and RCT in biostatistics and the life sciences. (E.g. see Kohavi et al. (2004), Front line internet analytics at Amazon.com, where A/B tests are directly presented as "Control/treatment(s) test for limited time".)

This being said, and quoting Akobeng (2005), Understanding randomised controlled trials, directly: "The RCT is the most scientifically rigorous method of hypothesis testing available, and is regarded as the gold standard trial for evaluating the effectiveness of interventions." That is, RCTs (and by equivalence A/B tests) are just a method of conducting hypothesis testing; there is no dichotomy between the two. There are other ways of performing hypothesis testing (e.g. case-control studies, which are based on observational data), but RCTs (or A/B tests) are the ones accepted as the "best" way. (Somewhat simplistically, RCTs are considered "best" because they offer a way to "insulate test from external factors" (Kohavi et al. 2004).)

Finally, please note that A/B tests (or RCTs) are not "just a test of means under the hood". We might be interested in other statistics too (e.g. the median or the variance), especially if our sample deviates strongly from normality. Deaton & Cartwright (2018), Understanding and misunderstanding randomized controlled trials, present an excellent contemporary commentary on the matter.
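As one illustration of testing a statistic other than the mean, here is a minimal sketch of a permutation test on the difference in medians between two groups. The permutation test is my choice of technique for the example, not something prescribed by the references above, and the function name `permutation_test_median` and the data are made up.

```python
import random
from statistics import median


def permutation_test_median(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test of H0: groups A and B share one distribution.

    Uses the absolute difference in medians as the test statistic.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    observed = abs(median(a) - median(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so reshuffle them
        rng.shuffle(pooled)
        diff = abs(median(pooled[:len(a)]) - median(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero
    return (extreme + 1) / (n_permutations + 1)


# Made-up data: group B is shifted well above group A
group_a = list(range(20))
group_b = [x + 50 for x in range(20)]
print(permutation_test_median(group_a, group_b))
```

The randomised assignment in an A/B test is what justifies reshuffling the labels under the null, so the same experimental design supports this test just as well as a comparison of means.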
