Evan Miller has created a well-known online AB test sample size calculator. For the sake of being able to program and modify this formula, I would like to know how to calculate sample size Evan Miller-style by hand.
Personally, I'd calculate it by working backwards from the 95% confidence interval of the z-test of proportions around the difference in conversion between the two variations ($\hat{d}$), setting the boundary of that interval to zero.
I'll define/assume:
- $\alpha$ = .05, $\beta$ = .2
- a 50/50 split between the control and experiment, i.e. $n_{exp} = n_{control} = n$
- the control conversion rate, i.e. the base rate before the experiment = $c$
- $p$ = pooled conversion rate = (number of exp conversions + number of control conversions) / ($n_{control} + n_{exp}$), which in this context is $(nc+n(c+\hat{d}))/(2n) = (2c+\hat{d})/2$
Now time to solve for $n$ …
$$ \hat{d} + Z_{1-\alpha/2} * StandardError = 0 $$
$$ \hat{d} + 1.96 * StandardError = 0 $$
$$ \hat{d} + 1.96 * \sqrt{p(1-p)(\frac{1}{n_{exp}} + \frac{1}{n_{control}})} = 0$$
$$ \hat{d} + 1.96 * \sqrt{p(1-p)(\frac{2}{n})} = 0$$
$$ \sqrt{p(1-p)(\frac{2}{n})} =\frac{-\hat{d}}{1.96}$$
Squaring both sides and solving for $n$, we get:
$$ \frac{(1.96^2) 2p(1-p)}{\hat{d}^2} = n $$
$$ \frac{(1.96^2) (2c-2c^2+\hat{d}-2c\hat{d}-\frac{1}{2}\hat{d}^2)}{\hat{d}^2} = n $$
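As a rough sketch, the closed form $n = 1.96^2 \cdot 2p(1-p)/\hat{d}^2$ can be coded directly. The function name `n_without_power` and the example inputs (a 20% base rate and a 5-point absolute difference) are illustrative assumptions, not values from the calculator.

```python
def n_without_power(c, d_hat, z=1.96):
    """Per-variation n from the confidence-interval argument (no power term).

    c     -- control (baseline) conversion rate
    d_hat -- absolute difference in conversion between the variations
    z     -- z-score for the two-sided 95% interval
    """
    # Pooled rate under a 50/50 split: p = (2c + d_hat) / 2
    p = (2 * c + d_hat) / 2
    return (z ** 2) * 2 * p * (1 - p) / d_hat ** 2


# Illustrative inputs only (assumed, not taken from the question).
print(n_without_power(0.20, 0.05))
```

Because no power term appears, this value will generally come out smaller than what a power-aware calculator reports for the same inputs.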
At the moment, though, my calculation doesn't incorporate power ($1-\beta$), but Evan Miller's does.
What should I think about as next steps to incorporate power into my sample size calculation?
(Feel free to also point out other errors in my calculation or assumptions!)
Best Answer
Try this:
$$ n=\frac{\left(Z_{\alpha/2}\sqrt{2p_1 (1-p_1)}+Z_{\beta}\sqrt{p_1(1-p_1)+p_2(1-p_2)}\right)^2}{|p_2-p_1|^2} $$
where:
$p_1$ is the "Baseline conversion rate"
$p_2$ is the conversion rate lifted by Absolute "Minimum Detectable Effect", which means $p_1+\text{Absolute Minimum Detectable Effect}$
$\alpha$ is the "Significance level $\alpha$"
$\beta$ is the $\beta$ in "Statistical power $1−\beta$"
$Z_{\alpha/2}$ means the z-score from the z table with upper-tail probability $\alpha/2$ (about 1.96 for $\alpha = 0.05$)
$Z_{\beta}$ means the z-score from the z table with upper-tail probability $\beta$ (about 0.84 for $\beta = 0.2$)
The quoted names are the input labels used in the Sample Size Calculator (Evan’s Awesome A/B Tools).
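The formula above can be sketched in Python using only the standard library. The function name `sample_size` and its argument names are my own, and I am reading $n$ as the size of each variation; treat this as a minimal sketch rather than the calculator's actual code.

```python
from statistics import NormalDist


def sample_size(p1, p2, alpha=0.05, beta=0.2):
    """Sample size n from the formula above.

    p1    -- baseline conversion rate
    p2    -- baseline plus the absolute minimum detectable effect
    alpha -- significance level
    beta  -- one minus the statistical power
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # z-scores with upper-tail probability alpha/2 and beta
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(1 - beta)

    term_alpha = z_alpha * (2 * p1 * (1 - p1)) ** 0.5
    term_beta = z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5
    return (term_alpha + term_beta) ** 2 / (p2 - p1) ** 2


# Illustrative inputs (assumed): 20% baseline, 5-point absolute lift.
print(sample_size(0.20, 0.25))
```

Using `inv_cdf` instead of hard-coded z-scores means `alpha` and `beta` can be changed without looking anything up in a z table.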
I found the formula in the article "A/B testing series: how to calculate the sample size an experiment requires" (A/B测试系列文章之怎么计算实验所需样本量).
when I choose:
I got 1030.219283 using this formula, which matches the 1030 reported by the Sample Size Calculator (Evan’s Awesome A/B Tools).
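As a check on the arithmetic, here is the formula worked through by hand. The chosen values are not shown above, so the inputs are an assumption: baseline $p_1 = 0.20$, absolute minimum detectable effect $0.05$ (so $p_2 = 0.25$), $\alpha = 0.05$ and $\beta = 0.2$. They were picked because they reproduce the quoted figure.

$$ Z_{\alpha/2}\sqrt{2p_1(1-p_1)} \approx 1.95996 \times \sqrt{0.32} \approx 1.10872 $$
$$ Z_{\beta}\sqrt{p_1(1-p_1)+p_2(1-p_2)} \approx 0.84162 \times \sqrt{0.3475} \approx 0.49613 $$
$$ n \approx \frac{(1.10872 + 0.49613)^2}{0.05^2} \approx \frac{2.57555}{0.0025} \approx 1030.2 $$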