[Math] Asymptotics for the expected length of the longest streak of heads.

asymptotics, generating-functions, probability, recurrence-relations

As Introduction to Algorithms (CLRS) describes it, the problem is:

Suppose you flip a fair coin $n$ times. What is the longest streak of consecutive heads that you expect to see?

The book claims that the expected value is $\Theta(\log{}n)$, and proves that it is both $O(\log{}n)$ and $\Omega(\log{}n)$. I want to generalize the problem and look more closely at the $\Theta(\log{}n)$ behavior.

For example, suppose we flip a biased coin with probability $p$ of heads and $q$ of tails, where $p+q=1$. Letting $X$ be the length of the longest streak, we would like to write $EX = A\log{}n + O(1)$, or more precisely $EX = A\log{}n + B + o(1)$, or something similar. How can we determine the asymptotics of $EX$?
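A quick Monte Carlo sketch of the quantity in question (the names `longest_head_run` and `estimate_EX` are mine, introduced only for this illustration): for a biased coin, the empirical mean of the longest head streak should track $\log_{1/p} n = \ln n / \ln(p^{-1})$ up to an additive constant.

```python
import math
import random

def longest_head_run(n, p, rng):
    """Flip a p-biased coin n times; return the longest run of heads."""
    best = cur = 0
    for _ in range(n):
        if rng.random() < p:  # heads
            cur += 1
            best = max(best, cur)
        else:                 # tails
            cur = 0
    return best

def estimate_EX(n, p, trials=400, seed=0):
    """Monte Carlo estimate of E[X], the expected longest head streak in n flips."""
    rng = random.Random(seed)
    return sum(longest_head_run(n, p, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    p = 0.5
    for n in (100, 1000, 10000):
        # compare the estimate to log_{1/p}(n); the gap should stay bounded in n
        print(n, estimate_EX(n, p), math.log(n) / math.log(1 / p))
```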

Here is a trivial observation. Let $P_{n,m}$ be the probability that $X<m$. Conditioning on the position of the first tail gives $P_{n,m}=qP_{n-1,m}+pqP_{n-2,m}+\cdots+p^{m-1}qP_{n-m,m}$ for $n \ge m$, and $P_{n,m}=1$ for $n<m$; thus $P_{n,m}$ can in principle be solved for (it is a linear difference equation), but there may be no useful formula for the roots of $1-z-\cdots-z^m=0$ where $m>0$.
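The recurrence is easy to check numerically. Below is a sketch (function names are mine) that iterates it to compute $P_{n,m}$ exactly, then uses $EX = \sum_{m \ge 1} \Pr[X \ge m] = \sum_{m=1}^{n} (1 - P_{n,m})$ to compare the exact expectation against $\log_2 n$ for a fair coin:

```python
import math

def prob_less(n, m, p):
    """P_{n,m} = Pr[longest head run in n flips < m], via the recurrence
    P_{n,m} = q P_{n-1,m} + p q P_{n-2,m} + ... + p^(m-1) q P_{n-m,m}."""
    q = 1 - p
    P = [1.0] * m               # P_{i,m} = 1 for i < m
    for i in range(m, n + 1):
        P.append(sum(p**k * q * P[i - 1 - k] for k in range(m)))
    return P[n]

def expected_longest(n, p):
    """E[X] = sum over m >= 1 of Pr[X >= m] = sum of (1 - P_{n,m})."""
    return sum(1.0 - prob_less(n, m, p) for m in range(1, n + 1))

if __name__ == "__main__":
    for n in (10, 50, 100):
        # exact E[X] for a fair coin versus log_2(n)
        print(n, expected_longest(n, 0.5), math.log2(n))
```

Sanity checks: for one fair flip $EX = \tfrac12$, and for two fair flips $EX = \frac{2+1+1+0}{4} = 1$.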

Any help? Thanks!

Best Answer

Suppose $p \in (0, 1)$ is the probability that a single flip lands heads. Let $L_n$ be the length of the longest streak of heads in the first $n$ flips, $T_n$ the number of flips up to and including the $n$-th tail, and $X_i$ the length of the streak of heads immediately before the $i$-th tail. It is not hard to see that the $X_i$ are i.i.d. geometrically distributed with parameter $1-p$, that $T_n$ is almost surely monotonically increasing, and that $T_n = n + \sum_{i = 1}^n X_i \geq n$. Moreover, $L_{T_n} = \max_{1 \le i \le n} X_i$, and the expected maximum of $n$ i.i.d. geometric variables is known to satisfy $E[L_{T_n}] = \frac{\ln(n)}{\ln(p^{-1})}+O(1)$. Thus we have:
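The coupling above can be seen numerically: drawing the $X_i$ directly as geometric variables (number of heads before each tail), the sample mean of $\max_i X_i$ stays within an additive constant of $\ln n / \ln(p^{-1})$. A sketch; `sample_max_geometric` is my name for the helper, not standard notation.

```python
import math
import random

def sample_max_geometric(n, p, rng):
    """Draw X_1, ..., X_n i.i.d. geometric(1-p) (heads before each tail)
    and return max_i X_i, i.e. the longest head run seen by the n-th tail."""
    best = 0
    for _ in range(n):
        x = 0
        while rng.random() < p:  # count heads until the next tail
            x += 1
        best = max(best, x)
    return best

if __name__ == "__main__":
    rng = random.Random(1)
    p, trials = 0.5, 200
    for n in (100, 1000, 10000):
        avg = sum(sample_max_geometric(n, p, rng) for _ in range(trials)) / trials
        # the gap between avg and ln(n)/ln(1/p) should stay bounded
        print(n, avg, math.log(n) / math.log(1 / p))
```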

$$
\begin{align*}
\lim_{n \to \infty} \frac{E[L_n]}{\ln(n)} &= \lim_{n \to \infty} E\left[\frac{L_n}{\ln(n)}\right]\\
&= \lim_{n \to \infty} E\left[\frac{L_{T_n}}{\ln(T_n)}\right]\\
&= \lim_{n \to \infty} E\left[\frac{L_{T_n}}{\ln\bigl(n + \sum_{i = 1}^n X_i\bigr)}\right]\\
&= \lim_{n \to \infty} E\left[\frac{L_{T_n}}{\ln(n) + \ln\bigl(1 + \frac{1}{n}\sum_{i = 1}^n X_i\bigr)}\right]\\
&= \lim_{n \to \infty} E\left[\frac{L_{T_n}}{\ln(n) + \ln(1 + E[X_1])}\right]\\
&= \lim_{n \to \infty} \frac{E[L_{T_n}]}{\ln(n) + \ln\bigl(\frac{1}{1-p}\bigr)}\\
&= \lim_{n \to \infty} \frac{\frac{\ln(n)}{\ln(p^{-1})}+O(1)}{\ln(n) + \ln\bigl(\frac{1}{1-p}\bigr)}\\
&= \frac{1}{\ln(p^{-1})}.
\end{align*}
$$

(The fifth equality uses the strong law of large numbers: $\frac{1}{n}\sum_{i=1}^n X_i \to E[X_1] = \frac{p}{1-p}$ almost surely, so $1 + E[X_1] = \frac{1}{1-p}$.)

From this we can conclude that for every $p \in (0, 1)$, $E[L_n]$ is asymptotically equivalent to $\frac{\ln(n)}{\ln(p^{-1})}$, which seems to be the result you wanted.