Type I vs. type II error in statistical hypothesis testing


Let us consider standard statistical hypothesis testing:

$$\alpha=P\{\text{type I error}\}=P\{\text{Rejecting } H_0 \text{ when }H_0\text{ is true}\}$$

and

$$\beta=P\{\text{type II error}\}=P\{\text{Accepting } H_0 \text{ when }H_1\text{ is true}\}.$$

My question is as follows: could you give an example of $\alpha$ and $\beta$
for tossing 10 coins that shows the following equation does not hold in general:

$$\alpha=1-\beta.$$

More specifically, construct an example in which $\alpha$ and $1-\beta$ are visibly unequal.

Best Answer

Suppose you have two boxes of dice. One is a box of fair dice, in which all faces are equally likely; the other has loaded dice, for which the probability of getting a six is 1/3. The labels are missing, so you will roll a sample of 50 dice from each box to try to identify which box has the loaded dice.

Let $H_0: \text{FAIR},$ so that $p_0(6) = 1/6$ and let $H_a: \text{LOADED},$ so that $p_a(6) = 1/3.$

In the figure below, blue bars represent the null distribution, under which the number of 6's seen in $n = 50$ trials is $\mathsf{Binom}(n = 50,\, p = 1/6).$ Brown bars represent the alternative distribution, under which the number of 6's seen is $\mathsf{Binom}(n = 50,\, p = 1/3).$

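[Figure: bar chart of the null distribution $\mathsf{Binom}(50,\, 1/6)$ (blue) and the alternative distribution $\mathsf{Binom}(50,\, 1/3)$ (brown), with a dotted vertical line at the critical value.]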

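You choose critical value $c = 10.5$ (dotted line), so you reject $H_0$ when the number $S$ of 6's is at least 11. Thus

$$\alpha = P(S \ge 11 \,|\, p=1/6) = .2014,$$

and

$$\beta = P(S \le 10 \,|\, p = 1/3) = .0284.$$

Thus the 'power' of the test is

$$1 - \beta = P(\text{Rej } H_0 \,|\, H_a \text{ True}) = P(S \ge 11 \,|\, p=1/3) = 1-.0284 = .9716.$$

In particular, $\alpha = .2014$ is nowhere near $1 - \beta = .9716,$ so the equation $\alpha = 1 - \beta$ does not hold here.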

    # alpha: P(S >= 11) under H0 (p = 1/6)
    sum(dbinom(11:50, 50, 1/6))
    ## [1] 0.2013702
    # beta: P(S <= 10) under Ha (p = 1/3)
    sum(dbinom(0:10, 50, 1/3))
    ## [1] 0.02844031
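
The same calculation can be wrapped in a small helper and reused for the 10-coin setting from the question. This is only a sketch: the function name `error_rates`, the alternative $p = 0.7$ for a biased coin, and the rule "reject when at least 8 heads appear" are assumptions chosen for illustration, not part of the original example.

```r
# Hypothetical helper: error probabilities for a one-sided binomial test
# that rejects H0 when the count S is >= crit.
error_rates <- function(n, p0, pa, crit) {
  alpha <- pbinom(crit - 1, n, p0, lower.tail = FALSE)  # P(S >= crit | p0)
  beta  <- pbinom(crit - 1, n, pa)                      # P(S <= crit - 1 | pa)
  c(alpha = alpha, beta = beta, power = 1 - beta)
}

# Dice example from above: n = 50, fair (1/6) vs. loaded (1/3), reject if S >= 11
dice <- error_rates(n = 50, p0 = 1/6, pa = 1/3, crit = 11)
round(dice, 4)   # alpha = 0.2014, beta = 0.0284, power = 0.9716

# Assumed coin example: n = 10, fair (0.5) vs. biased (0.7), reject if S >= 8
coin <- error_rates(n = 10, p0 = 0.5, pa = 0.7, crit = 8)
round(coin, 4)   # alpha = 0.0547, beta = 0.6172, power = 0.3828
```

In both cases $\alpha$ and $1 - \beta$ are tail probabilities of the same rejection region under two different values of $p,$ so there is no reason for them to coincide.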

Note: In many practical applications it seems reasonable to design an experiment so that the two error probabilities, $\alpha$ and $\beta,$ are approximately equal.

Here, perhaps you chose to make them different because you think it would be more serious to sell a loaded die to a customer who wants a fair one than to sell a fair die to someone in the market for a loaded one.

[If you wanted $\alpha$ and $\beta$ to be more nearly equal, you could pick the critical value $c$ to be near the middle of the region where the two distributions 'overlap'.]
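
To see how the choice of cutoff trades the two error probabilities against each other, one could tabulate $\alpha,$ $\beta,$ and the power over a range of cutoffs. The following is a minimal sketch: the range 8 to 18 is an arbitrary choice for illustration, and "smallest gap between $\alpha$ and $\beta$" is just one possible balancing criterion, assumed here rather than prescribed by the answer.

```r
n  <- 50
p0 <- 1/6   # P(six) for a fair die (H0)
pa <- 1/3   # P(six) for a loaded die (Ha)

crit <- 8:18  # candidate rules: reject H0 when S >= crit (assumed range)

alpha <- pbinom(crit - 1, n, p0, lower.tail = FALSE)  # P(S >= crit | p0)
beta  <- pbinom(crit - 1, n, pa)                      # P(S <= crit - 1 | pa)

# One row per candidate cutoff
tab <- data.frame(crit,
                  alpha = round(alpha, 4),
                  beta  = round(beta, 4),
                  power = round(1 - beta, 4))
print(tab)

# Cutoff at which alpha and beta are closest to each other
best <- crit[which.min(abs(alpha - beta))]
best
```

As the cutoff increases, $\alpha$ shrinks and $\beta$ grows, and at every cutoff in the table the power $1 - \beta$ stays above $\alpha.$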
