Approximation to the monkey typewriter probability

probability

Say that we are interested in the problem of a monkey randomly typing out a specific string of length m on a keyboard with k keys, in n trials, at least once:

$$
1 – \left(1 – \frac{1}{k^{m}}\right)^n
$$

I would like to get some intuition about how this expression varies for m and n (with k around 26).

  1. Is there an approximation we can use in this instance?
  2. I would like to understand how this probability varies as we add one more character – by how much must we increase $n$ to keep the probability the same $?$.
  3. Related to $1.$: How do I compute the probability for the case where $k = 26, m = 50, n = 1,000,000\ ?$. I couldn't do it in Python.

Thank you

Best Answer

A experiment with real monkeys suggested that the keys pressed are not independent - in that case they repeated letters a lot

Back to your question, it might be sensible to subtract $k-1$ from $n$ as you cannot get your desired string of $k$ characters with in the first $k-1$ attempts.

Ignoring that point, an approximation with large $n$ is $$1 - \left(1 - \frac{1}{k^m}\right)^n \approx 1- e^{-n/k^m} $$

If you want the right hand side to stay constant as $k$ increases by $1$ then you want $\frac{n_1}{(k+1)^m} \approx \frac{n_0}{k^m}$ so you want $\frac{n_1}{n_0} \approx \left(1+\frac1k\right)^m$.

Whether $e^{m/k}$ is a good approximation to that ratio depends on the particular values of $m$ and $k$

Related Question