Infinite monkey theorem independent of number of monkeys

combinationscombinatoricsstatistics

I was just thinking about the infinite monkey theorem and arrived at a creepy conclussion. Let's begin with the probability that one monkey types a set of $N$ letters with a chosen order, which is given by $P = \left( \frac{1}{N_k} \right)^N$, being $N_k$ the number of keys available in a hypotetical typewriter (for example, the probability of a monkey typing "MAMA" in a 26-key typewriter would be $P = \left( \frac{1}{26} \right)^4 \simeq 2\times 10^{-6}$).

If we have $N_m$ monkeys typewriting, we could think that each monkey $i$ would be required to type only a fraction $N_i = N/N_m$ of the $N$ ordered letters, so that we could arrange later the $N_m$ fractions and create the $N$-letters original sequence. The total probability of every monkey achieving its task is given by the product of each monkey's probability, $P = \left( \left( \frac{1}{N_k} \right)^{N_i} \right) ^{N_m} = \left( \frac{1}{N_k} \right)^N$, which is independent of $N_m$.

UPDATE: Now I know I have done nothing wrong, but I would like to derive the relation between $P$ and $N_m$ in order to analytically see how the $P$ improves as $N_m$ gets bigger.

Best Answer

For a concrete example, let's use your example, the string MAMA, typed on a $26$=key keyboard.

As you say, the chance that the first four letters typed by one monkey will spell MAMA is $\left(\frac1{26}\right)^4 \approx 2.2\times 10^{-6}.$

The chance that the monkey does not type MAMA is $1 - \left(\frac1{26}\right)^4 \approx 0.9999978.$

But now let's put a million monkeys in front of typewriters and examine the first four letters that each one of them types.

When we look at the first four letters typed by any single monkey, there is still only a $2.2\times 10^{-6}$ chance that those four letters spell MAMA. But the chance that at least one of the million monkeys typed MAMA is one minus the probability that ever single monkey typed something else.

The probability that any single monkey typed something else is still $1 - \left(\frac1{26}\right)^4 \approx 0.9999978,$ but for all one million of them to type things other than MAMA, the probability is

$$ \left(1 - \left(\frac1{26}\right)^4\right)^{1000000} \approx 0.112$$

and therefore the probability that at least one will type MAMA is approximately $0.888.$

If you increase the number of monkeys to ten million, the probability they all type something other than MAMA is $$ \left(1 - \left(\frac1{26}\right)^4\right)^{10000000} \approx 3\times10^{-10}$$ and therefore the probability of at least one MAMA is approximately $0.9999999997.$

Often the problem is formulated so that you not only have an extraordinary number of monkeys (if not infinite) but they keep on typing forever and you only need to find MAMA somewhere in one of the monkey's output, not necessarily in the first four letters. This probability converges to $1$ very quickly.

If you choose to look for a much longer string, such as the entire works of Shakespeare, the probability of observing it from a finite number of monkeys will be less for the same number of monkeys and the same amount of time, but it too eventually converges to $1.$


In any event, the typical formulation has nothing to do with assembling the string from the output of multiple monkeys. You have to find the string in the output of just one monkey. The trick is that you get to throw away and ignore the output of all the other monkeys who typed something else.

Related Question