Probability – Does the Law of Truly Large Numbers Hold with Decreasing Probability?

binary · probability · sequences-and-series

Recently I've been considering a game (I'm unsure whether it has a proper name or is an already-studied problem), which proceeds as follows.

  1. Flip a coin, selecting a value of either 0 or 1.
  2. Append the result to the sequence of past flips.
  3. If the last flip completes a repetition of the sequence's entire past history (i.e. the first half of the sequence now equals the second half), the game terminates.
  4. Otherwise, keep playing the game.
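
In code, one run of the game can be sketched like this (a minimal Python sketch; the function name `play_game` and the length cap are my own additions, not part of the original game):

```python
import random

def play_game(max_len=1_000_000):
    """Play one run: flip, append, stop once the first half equals the second half."""
    seq = []
    while len(seq) < max_len:
        seq.append(random.randint(0, 1))                 # steps 1-2: flip and append
        n = len(seq)
        if n % 2 == 0 and seq[:n // 2] == seq[n // 2:]:  # step 3: history repeated?
            return n                                     # game terminates at length n
    return None                                          # step 4 looped past the cap
```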

Some example runs where the game terminates are 11, 00, 101101, or 1001110011. But longer sequences become exponentially less likely to be repeated: had the last example ended with a 0 instead, we would need to get, purely by chance, the precise subsequent run of flips 1001110010 to form 1001110010 | 1001110010.

Thus, past a certain point, it seems as though we are bound to keep failing forever, with the sequence we must repeat to finish the game growing ever longer. Running some basic simulations of this game across 1000 different random seeds reveals a curious distribution of game lengths:

| Game length | Occurrences (/1000) |
| ---: | ---: |
| 2 | 521 |
| 4 | 127 |
| 6 | 59 |
| 8 | 15 |
| 10 | 13 |
| 12 | 3 |
| 14 | 1 |
| 16 | 2 |
| 18 | 1 |
| 22 | 1 |
| > 1,000,000 | 257 |

In the final category, the game was manually terminated once the sequence grew beyond 1 million flips (these runs appear to continue forever, or at least for a very long time). Notably, the rarest outcomes are the intermediate game lengths, while a sizeable portion of runs falls into the exceedingly long case instead.
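
A sketch of the experiment, reusing `play_game` from above (note: the naive halves comparison is quadratic in the sequence length, so this sketch uses a much smaller cap than the 1,000,000 used for the table):

```python
from collections import Counter

# Tally terminal game lengths over 1000 runs; None marks a capped ("endless") run.
tally = Counter(play_game(max_len=10_000) for _ in range(1000))
for length in sorted(k for k in tally if k is not None):
    print(f"{length}: {tally[length]}")
print(f"> 10,000: {tally.get(None, 0)}")
```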

This pattern is interesting to me, and my question, ultimately, is: do these sequences truly run forever? Past a certain point, does this probabilistic game become truly hopeless, even though at each step, as the sequence grows longer, there is a non-zero probability that its entire past history happens to reoccur exactly?

I liken this game to moving twice as far away from a quantum tunnelling particle every time it fails to tunnel to your location, which likewise seems to imply it would never catch you, even given infinite time. Or to the works of Shakespeare growing by one additional character with each wrong keypress by the monkey at the typewriter, multiplying the difficulty.

But at the same time this logic contrasts with my intuition about the law of truly large numbers, which says that, given infinite time, any event with non-zero probability will eventually happen. So how is the law affected when the probability in question is not constant but eternally decreasing (halving with each failure, in this case)?

Edit: As an interesting aside, if we instead roll a 10-sided die, a long run produces a sequence of random digits, e.g. 943669542398…; by construction this sequence avoids ever repeating its own history, so if we put a decimal point before it, 0.943669542398… for instance, we are essentially constructing an irrational number. The distribution also shifts so that the vast majority of runs end up in the long, seemingly endless case.

Best Answer

Question: Do these sequences truly run forever?

Answer: Yes, with positive probability: there is a nonzero chance that the process never ends. Let $E_n$ be the event that the game ends after exactly $2n$ flips. Note that $$ n\ge 2\implies P(E_n)\le 2^{-n}-2^{-(n+2)}. $$ In order for $E_n$ to occur, the last $n$ flips must match the first $n$ flips, which occurs with probability $2^{-n}$. However, this $2^{-n}$ includes sequences that would have ended after fewer than $2n$ flips. For example, any sequence of the following form is counted, even though the game would have ended after the first two flips: $$ 0,0,\underline{\;\;\text{series of $n-2$ flips}\;\;},0,0,\underline{\;\;\text{same series of $n-2$ flips}\;\;} $$ The probability of such a sequence is $2^{n-2}/2^{2n}=2^{-(n+2)}$. There are many other cases that would need to be subtracted for an exact value, but subtracting only this one already gives a valid upper bound.
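
The bound can be sanity-checked by brute force for small $n$ (an illustrative sketch; the helper name `ends_exactly_at` is mine). It enumerates all $2^{2n}$ sequences, computes the exact $P(E_n)$, and compares it against $2^{-n}-2^{-(n+2)}$:

```python
from itertools import product
from fractions import Fraction

def ends_exactly_at(seq):
    """First even prefix length at which the halves match, or None if never."""
    for m in range(2, len(seq) + 1, 2):
        if seq[:m // 2] == seq[m // 2:m]:
            return m
    return None

for n in range(2, 8):
    hits = sum(ends_exactly_at(seq) == 2 * n
               for seq in product((0, 1), repeat=2 * n))
    exact = Fraction(hits, 2 ** (2 * n))            # exact P(E_n)
    bound = Fraction(1, 2 ** n) - Fraction(1, 2 ** (n + 2))
    print(n, exact, bound, exact <= bound)
```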

We then apply the union bound: the probability that at least one of the events $E_1,E_2,E_3,\dots$ occurs is at most the sum of their probabilities. Since $P(E_1)=1/2$ (the second flip matches the first with probability $1/2$), this gives $$ P(\text{process eventually ends})\le \sum_{n=1}^\infty P(E_n)\le \frac12+\sum_{n=2}^\infty \left(2^{-n}-2^{-(n+2)}\right)=\frac12+\frac12-\frac18=\frac78. $$ In particular, with probability at least $1/8=12.5\%$, the process continues forever. Above, the infinite geometric series formula evaluates the infinite sums.
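
Explicitly, the two geometric series are $$ \sum_{n=2}^\infty 2^{-n}=\frac{2^{-2}}{1-\frac12}=\frac12, \qquad \sum_{n=2}^\infty 2^{-(n+2)}=\frac{2^{-4}}{1-\frac12}=\frac18, $$ which is where the $\frac12+\frac12-\frac18=\frac78$ above comes from.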

Question: How does this relate to the law of truly large numbers, which says that an infinitely repeated event will certainly occur eventually?

Answer: In general, if you have a sequence of independent events $E_1,E_2,\dots$, each with the same nonzero probability, then you can be certain that these events will occur infinitely often. For your problem, however, the events are dependent, and their probabilities are not the same but decrease over time. These two differences are why you get a different result.

It is possible to weaken the assumption of identical nonzero probabilities. In general, if $E_1,E_2,\dots$ is a sequence of independent events such that $\sum_{n=1}^\infty P(E_n)=+\infty$, then infinitely many of the events will occur with probability 1. This is the second Borel-Cantelli lemma. For example, if $P(E_n)=1/n$, then you can conclude that there will be infinitely many occurrences.
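
A quick illustration of the $P(E_n)=1/n$ case (a sketch; the horizon of $10^6$ trials and the seed are arbitrary choices of mine): independent events fired with probability $1/n$ keep accumulating, with the running count growing roughly like $\ln N$, so it never stops increasing.

```python
import math
import random

random.seed(0)
count = 0
for n in range(1, 1_000_001):
    if random.random() < 1 / n:   # independent event E_n with probability 1/n
        count += 1
# Expected count is the harmonic number H_N ~ ln(N), which diverges as N grows.
print(count, "occurrences vs ln(1e6) =", round(math.log(1e6), 1))
```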
