The proof of the “Infinite monkey theorem”: what does “any of the first n blocks of 6 letters” mean?

measure-theory, probability, sequences-and-series

This wiki page gives an explanation of "Infinite monkey theorem".

Suppose the typewriter has 50 keys, and the word to be typed is banana. If the keys are pressed randomly and independently, each key has an equal chance of being pressed. Then the chance that the first letter typed is 'b' is 1/50, the chance that the second letter typed is 'a' is also 1/50, and so on. Therefore, the chance of the first six letters spelling banana is

$(1/50) \times (1/50) \times (1/50) \times (1/50) \times (1/50) \times (1/50) = (1/50)^6 = 1/15\,625\,000\,000$, less than one in 15 billion, but not zero.

From the above, the chance of not typing banana in a given block of 6 letters is $1 − (1/50)^6$. Because each block is typed independently, the chance $X_n$ of not typing banana in any of the first n blocks of 6 letters is

${\displaystyle X_{n}=\left(1-{\frac {1}{50^{6}}}\right)^{n}.}$
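To get a feel for how slowly $X_n$ shrinks, here is a minimal Python sketch that evaluates the formula above for a few values of $n$. The names `P_BANANA` and `chance_no_banana` are made up for this illustration, and `math.log1p` is used only to limit floating-point rounding error, since $1/50^6$ is very small.

```python
import math

# Chance that one given block of 6 letters spells "banana" (50 equally likely keys).
P_BANANA = (1 / 50) ** 6


def chance_no_banana(n: int) -> float:
    """X_n = (1 - 1/50^6)^n, the chance that none of the first n blocks is 'banana'."""
    # exp(n * log(1 - p)) equals (1 - p)^n; log1p keeps precision for tiny p.
    return math.exp(n * math.log1p(-P_BANANA))


if __name__ == "__main__":
    for n in (1, 10**6, 50**6, 10 * 50**6, 100 * 50**6):
        print(f"n = {n:>16,d}   X_n = {chance_no_banana(n):.6g}")
```

With $n = 50^6$ blocks the value is already close to $1/e$, and it keeps decreasing toward 0 as $n$ grows.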

Assume [a-z] represents a letter chosen at random from the 50 possible letters and symbols.

Then the sequence sequence_6_letters = ([a-z],[a-z],[a-z],[a-z],[a-z],[a-z]) represents a block of 6 letters.

What is the form of "any of the first n blocks of 6 letters" in this representation?

I guess 2 blocks of 6 letters could be (sequence_6_letters, sequence_6_letters) = (([a-z],[a-z],[a-z],[a-z],[a-z],[a-z]),([a-z],[a-z],[a-z],[a-z],[a-z],[a-z])), that is, 12 successive letters.

What I cannot picture is "any of the first n blocks of 6 letters". What does that mean?

Note: What I am highlighting is the "any of the first" part rather than the "n blocks" part 🙂

Can anyone give an example (in [a-z] representation or any other easy-to-understand representation) to illustrate it?

Best Answer

They divide the text into consecutive, non-overlapping blocks of $6$ letters and check whether each block is exactly 'banana'. This is to avoid correlations between overlapping positions. The first $6n$ letters produce the first $n$ blocks: letters $1$ to $6$ form the first block, letters $7$ to $12$ the second, and so on. "Not typing banana in any of the first $n$ blocks" means that none of these $n$ blocks is 'banana'. If the first letter typed is 'c', we already know the first block will not be 'banana'. We then ignore the next five characters and ask whether the following six characters are 'banana'. This will not count a string that starts with 'cbananazzzzz' as a success for typing 'banana', but that doesn't matter to the argument. $X_n$ is the probability that the first $6n$ characters typed do not contain an instance of 'banana' whose 'b' is in a position congruent to $1 \bmod 6$.
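As a concrete illustration of the block reading described above, here is a small Python sketch. The helper names `first_n_blocks` and `any_block_is` and the sample string are invented for this example; they only show how the first $6n$ letters are cut into $n$ blocks and why an off-grid 'banana' is not counted.

```python
def first_n_blocks(text: str, n: int, size: int = 6) -> list[str]:
    """Cut the first n * size characters of text into n non-overlapping blocks."""
    return [text[i * size:(i + 1) * size] for i in range(n)]


def any_block_is(text: str, n: int, word: str = "banana") -> bool:
    """True if at least one of the first n blocks is exactly `word`."""
    return any(block == word for block in first_n_blocks(text, n, len(word)))


if __name__ == "__main__":
    typed = "cbananazzzzzbanana"  # 18 characters, so 3 blocks of 6
    print(first_n_blocks(typed, 3))  # ['cbanan', 'azzzzz', 'banana']
    # 'banana' appears at letters 2 to 7, but it straddles blocks 1 and 2,
    # so the first two blocks do not count as a success.
    print(any_block_is(typed, 2))  # False
    # The third block (letters 13 to 18) is exactly 'banana'.
    print(any_block_is(typed, 3))  # True
```

The event whose probability is $X_n$ is the one where `any_block_is(typed, n)` comes out false.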