[Math] Variance of number of runs of consecutive heads in $n$ biased coin flips

probability

Suppose we have a sequence of $n$ coin flips of a biased coin with probability $p$ of being heads. Let $X_n$ be the random variable that records the number of runs of consecutive heads, e.g. THHTHTHH has 3 runs of heads. By linearity of expectation, $E(X_n)$ is the sum of $n$ probabilities of getting a run of heads starting at coin flip $j$ for $1 \leq j \leq n$. For $j = 1$ this probability is $p$, and for $j > 1$ we must have flip $j-1$ is tails and flip $j$ is heads, which has probability $p(1-p)$. So $E(X_n) = p + (n-1)p(1-p)$. What about Var$(X_n)$? Is there any way to compute a formula for it? Or if not, at least get asymptotics for it?

Best Answer

Following the idea of your derivation: let $t_i$ be $1$ if a run of heads starts at coin number $i$ ($i=1 \cdots n$), $0$ otherwise. Then $X_n = \sum_{i=1}^n t_i$ and

$$ P(t_i=1)=E(t_i)= \begin{cases} p & \text{if $i=1$} \\ p(1-p) & \text{otherwise} \\ \end{cases} $$

The vars $t_i$ are not independent, but $E(X_n)=\sum E(t_i) = p + (n-1)p (1-p)$ as you already found. Now let's compute $$E(X^2_n)= E\left[\left(\sum t_i \right)^2\right]=\sum_{i=1}^n\sum_{j=1}^n E(t_i t_j)$$

The terms of the sum can be grouped in:

A) $i=j$: (with $n$ terms) $E(t_i t_j)=E(t_i^2)=E(t_i)$. The sum over these terms is $p + (n-1)p (1-p)$.

B) $|i-j|=1$: (with $2 \times (n-1)$ terms). Here one of the two must be zero,so $E(t_i t_j)=0$

In the remainder, the variables $t_i,t_j$ are independent. Hence

C) $|i-j|>1$ and $i=1$ or $j=1$ : (with $2\times(n-2)$ terms) $E(t_i t_j) = p^2(1-p)$

D) $|i-j|>1$ and $i>1$ and $j>1$ : (with $n^2-5n +6$ terms) $E(t_i t_j) = p^2(1-p)^2$

Finally

$$E(X_n^2)=p+\left( n-1\right) \,\left( p-1\right) \,p+ 2\,\left( n-2\right) \,\left( 1-p\right) \,{p}^{2}+\left( {n}^{2}-5\,n+6\right) \,{\left( 1-p\right) }^{2}\,{p}^{2}$$

$$Var(X_n) = E(X_n^2) - E(X_n)^2 = \left( 1-p\right) \,p\,\left( 3\,n\,{p}^{2}-5\,{p}^{2}- 3 \, n\,p+3 p+n\right)=\\ =\left( 1-p\right) \,p\,\left[ n\,\left( 3\,{p}^{2}-3\,p+1\right) -p\,\left( 5\,p-3\right) \right]$$

As pointed out by Did in a comment, this is valid for $n\ge 4$; if $n<4$ the group D vanishes; if $n\le 2$, only case A survives. Then

$$Var(X_1)=p(1-p) \hspace{9pt} Var(X_2)=p(1-p)^2(2-p)\hspace{9pt} Var(X_3)=p(1-p)^2(4p^2-6p+3)$$

Related Question