[Math] Assess if coin is fair based on length of longest run

probabilitystatistics

A coin is tossed n times and is generating a sequence of heads and tails. A subsequence of consecutive heads is called a run of heads. For example in sequence HCHHHCCHHC there are two runs of lenght 3 and 2 and the longest run of heads is 3. For a sequence of length n = 1000 the longest run is equal to 6. Having this information do you believe the coin is fair or not?

Do you have ideas how to construct appriopriate test for such problem? Any help would be much appreciated. Thank you.

Best Answer

The natural test for a fair coin would be to compare the numbers of Heads and Tails, unless you wanted to include independent identical distributions in the null hypothesis as well (which you might if you did not conduct the test)

But you have said you want to look at the longest run

$6$ is rather short for the longest run in $1000$ tosses of a fair coin which is i.i.d. equally probably heads or tails. Since an i.i.d. unfair coin would tend to give longer runs, the doubt here could be that $6$ might come from a non-i.i.d process, possibly an attempt to fake a random sequence

You might intuitively guess that the longest run could be of the order of $\log_2(n)$ so about $10$ in this case ($10$ is indeed the median and mode here, and slightly below the expectation).

There are explicit expressions but it may be quicker to use a Markov calculation, producing something like the following, which suggests that at over $95\%$ confidence you might reject the null hypothesis when seeing longest runs of $7$ or fewer or of $16$ or more from a sequence of $1000$ results

run     prob
 6      0.0003
 7      0.0180
 8      0.1208
 9      0.2370
10      0.2387
11      0.1699
12      0.1014
13      0.0553
14      0.0289
15      0.0147
16      0.0074
17      0.0037
18      0.0019
19      0.0009
20      0.0005
21      0.0002
22      0.0001