Hypothesis Testing – Two-Sample Proportions Test and 10 Successes and Failures Rule

hypothesis testing, two-sample

I asked this question on Mathematics Stack Exchange and am reposting it here because it was not resolved there.

Suppose the sample sizes and number of successes are ($n_1, y_1$) and ($n_2, y_2$), for the two samples, respectively.
Let the true proportions of successes be $p_1, p_2$.

Null hypothesis $H_0$: $p_1-p_2 = 0$

Alternative hypothesis $H_a$: $p_1 - p_2 \ne 0$

Every source I have seen requires that each of the two samples have at least $10$ successes and at least $10$ failures. I understand that this condition is needed for a single binomial distribution to be well approximated by a normal distribution.

Here, the null hypothesis is that $p_1 = p_2$. Under it, the estimate of the common true proportion $p$ is the pooled proportion $\hat p = \frac{y_1 + y_2}{n_1 + n_2}$. Is it not enough that the combined successes and failures satisfy $y_1 + y_2 \ge 10$ and $n_1 + n_2 - y_1 - y_2 \ge 10$, if in addition $\hat p n_1, (1 - \hat p) n_1, \hat p n_2, (1 - \hat p) n_2 \ge 10$? Under the null hypothesis, would that imply that each sample proportion, $\hat p_1 = y_1/n_1$ and $\hat p_2 = y_2/n_2$, is approximately normally distributed? This would then further imply that $\hat p_1 - \hat p_2$ is approximately normally distributed.

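An example is as follows. Suppose $n_1 = n_2 = 50$, $y_1 = 7$, $y_2 = 13$. Then $\hat p = (7 + 13)/100 = 0.2$, so $\hat p n_1 = \hat p n_2 = 10$ and $(1 - \hat p) n_1 = (1 - \hat p) n_2 = 40$. The pooled conditions hold, but sample 1 has only $7 < 10$ successes, so the usual per-sample rule fails.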
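
For concreteness, here is a minimal Python sketch of the two checks and the pooled z-test for this example. The function name, the dictionary keys, and the `threshold` parameter are my own illustrative choices, and the test statistic assumes the usual pooled standard error $\sqrt{\hat p (1 - \hat p)(1/n_1 + 1/n_2)}$.

```python
from math import erfc, sqrt


def pooled_two_proportion_check(n1, y1, n2, y2, threshold=10):
    """Compare the per-sample rule with the pooled rule, then run a pooled z-test.

    Illustrative sketch only: the names and the threshold default are assumptions.
    """
    total = n1 + n2
    successes = y1 + y2
    failures = total - successes
    p_pool = successes / total

    # Expected counts in each sample under H0: p1 = p2 = p_pool.
    expected = [
        n1 * successes / total,  # expected successes, sample 1
        n1 * failures / total,   # expected failures, sample 1
        n2 * successes / total,  # expected successes, sample 2
        n2 * failures / total,   # expected failures, sample 2
    ]
    pooled_rule_ok = all(e >= threshold for e in expected)

    # The commonly quoted rule looks at the observed counts in each sample.
    observed = [y1, n1 - y1, y2, n2 - y2]
    per_sample_rule_ok = all(c >= threshold for c in observed)

    # Pooled two-sample z statistic and two-sided normal p-value.
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (y1 / n1 - y2 / n2) / se
    p_value = erfc(abs(z) / sqrt(2))

    return {
        "p_pool": p_pool,
        "expected": expected,
        "pooled_rule_ok": pooled_rule_ok,
        "per_sample_rule_ok": per_sample_rule_ok,
        "z": z,
        "p_value": p_value,
    }


result = pooled_two_proportion_check(50, 7, 50, 13)
for key, value in result.items():
    print(key, value)
```

With these numbers the pooled rule is satisfied while the per-sample rule is not, and the statistic comes out at roughly $z = -1.5$, which is exactly the situation I am asking about.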

Best Answer

There's no hard and fast rule here. Where people draw the line for "close enough" to be approximated by one thing or another is largely a matter of personal preference.

If you feel more comfortable with your slightly more complicated rules of thumb, feel free to disregard the more conservative, common ones.

When you report your results, whether you use more complex rules of thumb, or more conservative ones, be explicit about why you think a particular approximation is warranted.


Another option is to not approximate at all. If you have access to a computer, you can probably run a computer-intensive permutation test instead, which makes no assumptions about distributions other than "the observed values are representative", and that's the one assumption you generally can't avoid making.
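
As a rough sketch of what such a permutation test could look like in Python: pool all the observed outcomes, repeatedly shuffle them into two groups of the original sizes, and count how often the shuffled difference in proportions is at least as extreme as the observed one. The function name, the number of permutations, the seed, and the "+1" correction in the p-value are illustrative choices, not the only way to do it.

```python
import random


def permutation_test_two_proportions(n1, y1, n2, y2, n_perm=10_000, seed=0):
    """Monte Carlo permutation test for a difference in two proportions.

    Illustrative sketch: n_perm, seed, and the +1 correction are assumptions.
    Returns an estimated two-sided p-value.
    """
    rng = random.Random(seed)
    total_successes = y1 + y2
    # Pooled outcomes: 1 = success, 0 = failure.
    pooled = [1] * total_successes + [0] * (n1 + n2 - total_successes)

    def diff(successes_in_first):
        # Difference in sample proportions for a given split of the successes.
        return successes_in_first / n1 - (total_successes - successes_in_first) / n2

    observed = abs(diff(y1))
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = abs(diff(sum(pooled[:n1])))
        # Small tolerance so ties with the observed value count as extreme.
        if shuffled >= observed - 1e-12:
            extreme += 1

    # Count the observed arrangement itself so the estimate is never zero.
    return (extreme + 1) / (n_perm + 1)


print(permutation_test_two_proportions(50, 7, 50, 13))
```

Because the shuffling is random, the result is an estimate that varies slightly with the seed and the number of permutations; more permutations give a more stable p-value.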
