Solved – Check whether a coin is fair

Tags: hypothesis-testing, self-study

An interview problem goes as follows:

Given a coin you don’t know it’s fair or unfair. Throw it 6 times and
get 1 tail and 5 heads. Determine whether it’s fair or not. What’s your
confidence value?

I came up with the following solution:

$H_0:$ the coin is fair
$H_a:$ the coin is unfair

$X$: the number of heads

Rejection region: $|X - 3| > 2$, i.e., $X = 0,1,5,6$

Significance level alpha:

$\alpha = P(\text{reject }H_0 \mid H_0 \text{ is true})$
$=P(X=0,1,5,6 \mid H_0 \text{ is true})$
$= \left(\binom{6}{0}+\binom{6}{1}+\binom{6}{5}+\binom{6}{6}\right)\times(1/2)^6$
$= (1+6+6+1)\times 0.5^6 = 0.21875$

Because $\alpha > 0.05$, we do not have enough evidence to reject $H_0$; we
accept $H_0$, so the coin is fair.
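(For reference, the arithmetic above can be checked with a few lines of Python -- this is just a sketch of the same calculation, with names of my choosing:)

```python
from math import comb  # binomial coefficients (Python 3.8+)

n = 6
rejection_region = (0, 1, 5, 6)  # the region stated above

# alpha = P(X in rejection region | fair coin), with X ~ Binomial(6, 1/2)
alpha = sum(comb(n, k) for k in rejection_region) * 0.5**n
print(alpha)  # 0.21875
```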

Is the above test a good one? Also, how do I calculate the confidence value?

Best Answer

[I think I'd start by asking for a whiteboard, markers -- and an eraser, because one boardful isn't enough to explain everything wrong with the question.]

I'm going to answer this question by rejecting its premises.

  1. The "coin" itself is just a coin; by itself it doesn't do anything, and so it cannot be fair or not-fair. What we're talking about is the process of tossing a particular coin in some fashion -- that can be discussed in terms of whether it's fair or not.

  2. Data can't show you that a coin-tossing process applied to some coin is exactly fair. Sometimes it can show you that your coin-tossing-process on a given coin is inconsistent with fairness, but failure to identify any inconsistency with fairness doesn't imply fairness (failure to reject is because your sample size is small, not because the coin is actually fair).

    [e.g. Consider it in terms of a confidence interval for P(head): the fact that $\frac12$ is inside the CI doesn't mean that $P(\text{head})=\frac12$, since there are always other values - distinct from $\frac12$ - inside it too. Or think in terms of power: in the experiment given in the interview question - 6 tosses - what's the probability that you'd reject as unfair a tossing process with $p(\text{head})=0.51$ at some typical significance level? That's clearly an unfair coin, but you'll reject barely more often than your type I error rate, and a large fraction of those rejections in a two-tailed test would be "in the wrong tail"!]

  3. No coin-tossing process on a given coin will be perfectly fair. (For example, changing the side facing up slightly alters the chances associated with the resulting face on the toss, as experiments run by Persi Diaconis have shown.)

    Could the coin be close to fair? Possibly; it may even be possible to get very close to fair. Exactly fair? No, it's not possible in practice. But then to discuss whether it's "close to fair" we'd have to define what we mean by 'close'. [If we were to give some usable definition, while some people might suggest some form of equivalence test, or perhaps considering whether some CI lay entirely inside some "close to fair" bounds, I'd be inclined toward a Bayesian approach to deciding whether the coin is sufficiently close to fair. Note that with the tiny sample size mentioned, the data are quite consistent with p(head) so far from $\frac{1}{2}$ that this exercise on that data would not conclude "close to fair" on any of the three mentioned approaches.]
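The power point in (2) is easy to verify numerically. A minimal sketch in Python, using the question's own rejection region $\{0,1,5,6\}$ (the variable names are mine):

```python
from math import comb

def binom_pmf(n, k, p):
    """P(exactly k heads in n tosses) when P(head) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 6
region = (0, 1, 5, 6)   # rejection region from the question
p_true = 0.51           # a slightly unfair coin

alpha = sum(binom_pmf(n, k, 0.5) for k in region)     # type I error rate
power = sum(binom_pmf(n, k, p_true) for k in region)  # P(reject | p = 0.51)
# rejections in the "wrong" (low-heads) tail, when p is really above 1/2:
wrong_tail = sum(binom_pmf(n, k, p_true) for k in (0, 1)) / power

print(f"alpha = {alpha:.5f}, power = {power:.5f}, wrong-tail share = {wrong_tail:.3f}")
```

With 6 tosses the power at $p=0.51$ comes out around 0.219 -- essentially the same as the type I error rate of 0.21875 -- and nearly half the rejections land in the wrong tail.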

So:

Given a coin you don’t know it’s fair or unfair.

Yes, actually, I do. In fact I don't even need to see data. It's not fair.

Throw it 6 times and get 1 tail and 5 heads. Determine whether it’s fair or not.

I really don't care what the data are. It makes no difference to my answer, since the data could not possibly demonstrate fairness, even if fairness were a realistically possible state to be in.

What’s your confidence value?

100% (in a sense similar to almost surely)

(In any case, even if there were a way to do this statistically I don't know of any statistical procedure that gives anything I'd agree to call "confidence values", so I also reject the form of that question. What does that term even mean? If I were asked a question phrased that way in an interview, I'd have serious concerns about working there, because it seems to suggest the people conducting the interview don't really understand what they're even asking - and that suggests either nobody there knows this stuff, or they don't care enough about this position to make sure the interview is being conducted by someone who does. Either way, it would certainly influence my willingness to work there.)


Forgetting everything I just said for the moment, some comments on your hypothesis test:

Your process for a hypothesis test is wrong.

  1. Why do you compare your significance level with 0.05? You've chosen a significance level of about 0.21 (to which I have no objection in this experiment: the sample size is so small that the only usable two-sided levels are about 3% and 21%, and $\alpha = 3\%$ would be too low-powered to be much use) -- 0.05 doesn't relate to anything here.

  2. Do you see that in your test when it came time to reject or not reject, you made no reference at all to the sample statistic (5 heads)? Indeed you ignored your rejection rule.

  3. The rejection rule you stated algebraically, $|X-3|>2$, is inconsistent with the rejection region you listed ($X = 0,1,5,6$): the algebraic rule rejects only $X=0$ and $X=6$, while the listed region corresponds to $|X-3|\geq 2$.
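Points (1) and (3) are easy to check directly; here's a quick Python sketch (assuming the usual convention that $X$ counts heads in $n=6$ tosses):

```python
from math import comb

n = 6
# The algebraic rule |X - 3| > 2 rejects only X = 0 and X = 6,
# not the listed region {0, 1, 5, 6} (that would be |X - 3| >= 2):
rule_region = [k for k in range(n + 1) if abs(k - 3) > 2]
print(rule_region)  # [0, 6]

# The two candidate symmetric regions give the only non-trivial
# two-sided significance levels available with n = 6:
alphas = [sum(comb(n, k) for k in region) / 2**n
          for region in ([0, 6], [0, 1, 5, 6])]
print(alphas)  # [0.03125, 0.21875]
```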

That's a lot of errors in a few lines! If I were involved in such an interview**, I might forgive the error with the rejection rule as something one could overlook under interview pressure, but the first two errors would suggest some fundamental problems.

** leaving aside that I'd never ask such a poor question, nor would I likely care enough about hypothesis testing to even think to ask a question about it.