Standard Error – Proving the Standard Error Between Two Normal Distributions in A/B Testing

ab-test, binomial distribution, hypothesis testing, normal distribution

A classic A/B test supposes that there are two independent experiments, with $n_1$ and $n_2$ observations respectively, in which the count of an event of interest follows a binomial distribution, $\mathcal{B}(n_1,p_1)$ and $\mathcal{B}(n_2,p_2)$. If we suppose that the central limit theorem applies to each experiment then, dividing the counts by $n_1$ and $n_2$ respectively, we can assume that the observed proportion in each experiment approximately follows a normal distribution, $\mathcal{N}(\hat{p_1}, \frac{\hat{p_1}(1-\hat{p_1})}{n_1})$ and $\mathcal{N}(\hat{p_2}, \frac{\hat{p_2}(1-\hat{p_2})}{n_2})$.

The A/B test looks for a significant difference between $p_1$ and $p_2$ by taking the difference between the two distributions above. This gives a normal distribution $\mathcal{N}(\hat{p_1}-\hat{p_2}, \hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2}))$, where $\hat{p}=\frac{n_1\hat{p_1}+n_2\hat{p_2}}{n_1+n_2}$ is the pooled sample proportion.

The formula for this variance is given in various resources.

I know that the variance of the difference of two independent normal random variables is equal to the sum of their variances, and I would like to prove the following formula for the variance:
$$\hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2})=\frac{\hat{p_1}(1-\hat{p_1})}{n_1} + \frac{\hat{p_2}(1-\hat{p_2})}{n_2}$$

Best Answer

As whuber pointed out, the stated equality is false in general. The expression $\hat{p} (1 - \hat{p})(n^{-1}_{1} + n^{-1}_{2})$ is our estimate of the variance of the test statistic when both probabilities are equal, which is the null hypothesis of the test.
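A quick numerical check makes the point concrete. The sketch below is only an illustration: the function names `pooled_variance` and `unpooled_variance` and the sample values are made up for this example and are not taken from the question.

```python
import math


def pooled_variance(p1_hat, n1, p2_hat, n2):
    """Left-hand side: variance estimate built from the pooled proportion."""
    p_hat = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    return p_hat * (1 - p_hat) * (1 / n1 + 1 / n2)


def unpooled_variance(p1_hat, n1, p2_hat, n2):
    """Right-hand side: sum of the two per-sample variance estimates."""
    return p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2


# Hypothetical data with clearly different sample proportions.
lhs = pooled_variance(0.1, 100, 0.5, 100)
rhs = unpooled_variance(0.1, 100, 0.5, 100)
print(lhs, rhs, math.isclose(lhs, rhs))

# Hypothetical data with identical sample proportions.
lhs_eq = pooled_variance(0.3, 100, 0.3, 50)
rhs_eq = unpooled_variance(0.3, 100, 0.3, 50)
print(lhs_eq, rhs_eq, math.isclose(lhs_eq, rhs_eq))
```

With $\hat{p}_1 = 0.1$ and $\hat{p}_2 = 0.5$ the two sides come out at roughly 0.0042 and 0.0034, so they disagree. With $\hat{p}_1 = \hat{p}_2$ they coincide, because the pooled proportion then equals both sample proportions.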

If we assume that $p_1 = p_2$ then our best guess of this common probability is the pooled sample proportion $\hat{p}$, and using this to estimate the variance of $\hat{p}_1 - \hat{p}_2$ gives the formula you reference after assuming independence.
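Spelled out as a sketch, writing $p$ for the common value of $p_1$ and $p_2$ under the null hypothesis (a symbol introduced here for convenience), independence of the two samples gives:

$$\operatorname{Var}(\hat{p}_1 - \hat{p}_2) = \operatorname{Var}(\hat{p}_1) + \operatorname{Var}(\hat{p}_2) = \frac{p(1-p)}{n_1} + \frac{p(1-p)}{n_2} = p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)$$

Replacing the unknown $p$ with the pooled estimate $\hat{p}$ yields $\hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2})$. So the pooled formula is an estimate of the variance under $p_1 = p_2$, not an algebraic identity with the sum of the two separate variance estimates.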