Suppose we compute the generating function of binary strings having at
most $q$ consecutive heads. There are four cases, according to whether
the string starts with heads or tails and ends with heads or tails.
We get
$$G_{HH}(z) =
z\frac{1-z^{q}}{1-z}
\sum_{k=0}^\infty
\left(\frac{z}{1-z}z\frac{1-z^{q}}{1-z}\right)^k.$$
Continuing we get
$$G_{HT}(z) = G_{HH}(z) \frac{z}{1-z}.$$
Furthermore
$$G_{TT}(z) =
\frac{z}{1-z}
\sum_{k=0}^\infty
\left(z\frac{1-z^{q}}{1-z}\frac{z}{1-z}\right)^k.$$
Finally we have
$$G_{TH}(z) = G_{TT}(z) z\frac{1-z^{q}}{1-z}.$$
The geometric sum, which is the same in all four cases, evaluates to
$$\frac{1}{1-z^2(1-z^q)/(1-z)^2}
= \frac{1-2z+z^2}{1-2z+z^2-z^2(1-z^q)}
= \frac{1-2z+z^2}{1-2z+z^{q+2}}.$$
The factor multiplying this sum, collecting all four cases, is
$$z\frac{1-z^{q}}{1-z} \left(1+\frac{z}{1-z}\right)
+ \frac{z}{1-z} \left(1+z\frac{1-z^{q}}{1-z}\right)$$
which is
$$z\frac{1-z^{q}}{(1-z)^2}
+ \frac{z}{(1-z)^2} (1-z^{q+1})
= \frac{2z-z^{q+1}-z^{q+2}}{(1-z)^2}.$$
Multiplying we obtain the generating function
$$G_q(z) = \frac{2z-z^{q+1}-z^{q+2}}{1-2z+z^{q+2}}.$$
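As a sanity check on $G_q(z)$, its series coefficients can be extracted from the rational form via the recurrence induced by the denominator and compared with brute-force enumeration. A sketch in Python rather than Maple (the helper names are mine):

```python
from itertools import product

def max_run(bits):
    """Length of the longest run of ones."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

def gq_coeffs(q, N):
    """[z^0..z^N] of G_q(z) = (2z - z^(q+1) - z^(q+2)) / (1 - 2z + z^(q+2))."""
    num = [0] * (N + 1)
    den = [0] * (N + 1)
    num[1] += 2
    if q + 1 <= N: num[q + 1] -= 1
    if q + 2 <= N: num[q + 2] -= 1
    den[0] = 1
    den[1] -= 2
    if q + 2 <= N: den[q + 2] += 1
    c = [0] * (N + 1)
    for n in range(N + 1):
        # num[n] = sum_k den[k] * c[n-k], solved for c[n] (den[0] = 1)
        c[n] = num[n] - sum(den[k] * c[n - k] for k in range(1, n + 1))
    return c

# [z^n] G_q counts length-n strings whose longest run of ones is at most q
N, q = 10, 3
coeffs = gq_coeffs(q, N)
for n in range(1, N + 1):
    brute = sum(1 for s in product((0, 1), repeat=n) if max_run(s) <= q)
    assert coeffs[n] == brute
print(coeffs)  # [0, 2, 4, 8, 15, 29, 56, 108, 208, 401, 773]
```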
It follows that $2^n$ times the expectation of the maximal run length is given by
$$ [z^n]
\left(0\times G_0(z) + \sum_{q=1}^n q (G_q(z)-G_{q-1}(z)) \right).$$
The sum simplifies to
$$\sum_{q=1}^n q G_q(z)
- \sum_{q=0}^{n-1} (q+1) G_q(z)
= \sum_{q=0}^n q G_q(z)
- \sum_{q=0}^{n-1} (q+1) G_q(z)
\\ = n G_n(z)
- \sum_{q=0}^{n-1} G_q(z).$$
Hence the expectation is
$$\frac{1}{2^n} [z^n]
\left( n G_n(z)
- \sum_{q=0}^{n-1} G_q(z) \right).$$
This gives the sequence
$$\frac{1}{2},\ 1,\ \frac{11}{8},\ \frac{27}{16},\ \frac{31}{16},\ \frac{69}{32},\ \frac{75}{32},\ \frac{643}{256},\ \frac{1363}{512},\ \frac{1433}{512},\ \ldots$$
Multiplying by $2^n$ we obtain
$$1, 4, 11, 27, 62, 138, 300, 643, 1363, 2866, \ldots$$
which is OEIS A119706, where the above computation is confirmed.
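The expectation formula is easy to cross-check numerically. The following Python sketch (helper names are mine) extracts $[z^n]G_q$ from the rational form and reproduces both of the sequences above:

```python
from fractions import Fraction

def gq_coeff(q, n):
    """[z^n] of G_q(z) = (2z - z^(q+1) - z^(q+2)) / (1 - 2z + z^(q+2))."""
    num = [0] * (n + 1)
    den = [0] * (n + 1)
    num[1] += 2
    if q + 1 <= n: num[q + 1] -= 1
    if q + 2 <= n: num[q + 2] -= 1
    den[0] = 1
    den[1] -= 2
    if q + 2 <= n: den[q + 2] += 1
    c = [0] * (n + 1)
    for j in range(n + 1):
        c[j] = num[j] - sum(den[k] * c[j - k] for k in range(1, j + 1))
    return c[n]

def expectation(n):
    """(1/2^n) [z^n] (n G_n - sum_{q=0}^{n-1} G_q)."""
    total = n * gq_coeff(n, n) - sum(gq_coeff(q, n) for q in range(n))
    return Fraction(total, 2 ** n)

expectations = [expectation(n) for n in range(1, 11)]
scaled = [int(e * 2 ** n) for n, e in enumerate(expectations, 1)]
print(expectations)  # values: 1/2, 1, 11/8, 27/16, 31/16, 69/32, ...
print(scaled)        # [1, 4, 11, 27, 62, 138, 300, 643, 1363, 2866]
```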
The following Maple code can be used to explore these generating
functions. The procedure v computes the generating function of the
maximal run length of a string of $n$ bits by total enumeration. The
procedure w computes it from the generating function $G_q(z).$
v :=
proc(n)
option remember;
local gf, k, d, mxrun, len, pos;
    gf := 0;
    # the low n binary digits of k, for k from 2^n to 2^(n+1)-1,
    # run over all 2^n strings (digit n+1 is a leading one)
    for k from 2^n to 2^(n+1)-1 do
        d := convert(k, base, 2);
        mxrun := 0;
        # scan d[1..n] for the longest run of ones
        for pos to n do
            if d[pos] = 1 then
                len := 1;
                pos := pos+1;
                while pos <= n do
                    if d[pos] = 1 then
                        len := len+1;
                        pos := pos+1;
                    else
                        break;
                    fi;
                od;
                if len > mxrun then
                    mxrun := len;
                fi;
            fi;
        od;
        gf := gf + z^mxrun;
    od;
    gf;
end;
G := q -> (2*z-z^(q+1)-z^(q+2))/(1-2*z+z^(q+2));
w :=
proc(n)
option remember;
local gf, mxrun;
    gf := 1;
    # [z^n] (G_q(z) - G_(q-1)(z)) counts strings whose maximal run is exactly q
    for mxrun to n do
        gf := gf +
            coeftayl(G(mxrun)-G(mxrun-1), z=0, n)*z^mxrun;
    od;
    gf;
end;
X := n -> coeftayl(n*G(n)-add(G(q), q=0..n-1), z=0, n)/2^n;
Here are two examples.
> v(4);
z^4 + 2*z^3 + 5*z^2 + 7*z + 1
> w(4);
z^4 + 2*z^3 + 5*z^2 + 7*z + 1
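For readers without Maple, the enumeration behind v(4) takes a few lines of Python (a sketch; the helper name is mine):

```python
from collections import Counter
from itertools import product

def max_run(bits):
    """Length of the longest run of ones."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

# distribution of the maximal run over all 2^4 strings, matching v(4) and w(4)
dist = Counter(max_run(bits) for bits in product((0, 1), repeat=4))
print(sorted(dist.items()))  # [(0, 1), (1, 7), (2, 5), (3, 2), (4, 1)]
```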
Addendum. Responding to the question of the OP, the maximum run length distribution for $n=50$ is
> w(50);
z^50 + 2*z^49 + 5*z^48 + 12*z^47 + 28*z^46 + 64*z^45 + 144*z^44 + 320*z^43
+ 704*z^42 + 1536*z^41 + 3328*z^40 + 7168*z^39 + 15360*z^38 + 32768*z^37
+ 69632*z^36 + 147456*z^35 + 311296*z^34 + 655360*z^33 + 1376256*z^32
+ 2883584*z^31 + 6029312*z^30 + 12582912*z^29 + 26214400*z^28
+ 54525952*z^27 + 113246208*z^26 + 234881024*z^25 + 486539259*z^24
+ 1006632909*z^23 + 2080374408*z^22 + 4294964912*z^21 + 8858356224*z^20
+ 18253535488*z^19 + 37580568576*z^18 + 77307408384*z^17
+ 158903894017*z^16 + 326369607799*z^15 + 669786836360*z^14
+ 1373319005440*z^13 + 2812533538048*z^12 + 5749650288420*z^11
+ 11716183298140*z^10 + 23723022576779*z^9 + 47402584528885*z^8
+ 92066138963408*z^7 + 168050756947888*z^6 + 267156803852044*z^5
+ 310228979841119*z^4 + 174887581402185*z^3 + 19394019617001*z^2
+ 32951280098*z + 1
I have asked for additional explanation. Meanwhile, I'll post another approach here.
Denote by $\tau_i^5$ the random variable that counts the number of flips required to get five consecutive heads, starting from $i$ consecutive heads.
What we want is exactly $E[\tau_0^5]$.
Now, you can evaluate $E[\tau_0^5]$ by conditioning on the first flip.
$$
E[\tau_0^5] = \frac{E[\tau_0^5]}{2} + \frac{E[\tau_1^5]}{2} +1
$$
Explaining the equation above: with probability $1/2$ the first flip is a tail, so the expected time to get five heads is unchanged, because you still have no heads. With the same probability you get a head, and then the expected number of additional flips needed is $E[\tau_1^5]$, because you now have one head. The $+1$ accounts for the flip just made.
Repeating the argument above, you get
$$
E[\tau_1^5] = \frac{E[\tau_0^5]}{2} + \frac{E[\tau_2^5]}{2} +1
$$
Proceeding this way, and remembering $E[\tau_5^5]=0$, you get
$$
E[\tau_0^5] = 62
$$
This may seem more complicated at first sight, but the idea of conditioning on what happens at the first step (or move) solves a wide variety of problems.
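The back-substitution can be automated. Writing $E[\tau_i^5]=a_i+b_i\,E[\tau_0^5]$ and stepping the recurrence until the condition $E[\tau_5^5]=0$ pins down the unknown, a short Python sketch (my own formulation) recovers the value $62$:

```python
from fractions import Fraction

# E_i = 1 + E_0/2 + E_{i+1}/2  for i < 5, with E_5 = 0.
# Rearranged: E_{i+1} = 2*E_i - E_0 - 2.  Track E_i = a + b*E_0.
a, b = Fraction(0), Fraction(1)   # E_0 itself: 0 + 1*E_0
for _ in range(5):
    a, b = 2 * a - 2, 2 * b - 1   # step from E_i to E_{i+1}
# E_5 = a + b*E_0 = 0  =>  E_0 = -a/b
print(-a / b)  # 62
```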
Best Answer
Full answer, but a little sloppy, using Markov chains.
Consider the Markov chain with states $0,1,\dots,m$, starting state $0$, in which a tails (probability $p$) resets to state $0$ and a heads (probability $1-p$) advances one state, with transition matrix:
$$T=\begin{pmatrix}p&1-p&0&\cdots&0\\ p&0&1-p&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\p&0&0&\cdots&1-p\\p&1-p&0&\cdots&0\end{pmatrix}$$ Then we want to know how often we are in state $m$, which we hit exactly when the number of heads in the current chain of consecutive heads reaches another multiple of $m$.
The expected number of visits to state $m$ during the first $n$ steps is $$\begin{pmatrix}1&0&0&\dots&0\end{pmatrix}\left(\sum_{i=1}^{n} T^i\right)\begin{pmatrix}0\\0\\\vdots\\0\\1\end{pmatrix},$$ which is the expected number of non-overlapping sequences of $m$ heads.
When $m=2$ Wolfram Alpha gives me:
$$\begin{pmatrix}1&0&0\end{pmatrix} T^i\begin{pmatrix}0\\0\\1\end{pmatrix}=\frac{(1-p)^2}{2-p}\left(1-(p-1)^{i-1}\right)$$
So the sum is $$n\frac{(1-p)^2}{2-p}-\frac{(1-p)^2}{(2-p)^2}\left(1-(p-1)^n\right)$$
The limit of $E(X_2)/n$ then is $\frac{(1-p)^2}{2-p}.$
When $m=3,$ I think the limit is:
$$\frac{(1-p)^3}{p^2-3p+3}$$
Conjecture
It is possible that the limit is:
$$\frac{(1-p)^m}{1+(1-p)+\cdots+(1-p)^{m-1}}=\frac{p(1-p)^m}{1-(1-p)^m}=\frac{p}{(1-p)^{-m}-1}$$
As $p\to 0$ we'd expect a limit of $\frac1m,$ and we do get that, for $f(x)=(1-x)^{-m},$ $$\lim_{p\to 0} \frac{p}{f(p)-f(0)}=\frac{1}{f'(0)}=\frac{1}{m}.$$
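The conjectured slope also survives a direct numerical test: iterate the chain and average the probability of being in state $m$. A Python sketch (the choices $m=2,3$, $p=0.3$, and $n=5000$ are mine):

```python
def visit_rate(m, p, n):
    """(1/n) * expected number of visits to state m in the first n steps."""
    # transition matrix: tails (prob p) resets to state 0; heads advances,
    # with state m wrapping around to state 1
    T = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i in range(m):
        T[i][0], T[i][i + 1] = p, 1 - p
    T[m][0], T[m][1] = p, 1 - p
    v = [1.0] + [0.0] * m          # start in state 0
    total = 0.0
    for _ in range(n):
        v = [sum(v[i] * T[i][j] for i in range(m + 1)) for j in range(m + 1)]
        total += v[m]
    return total / n

for m in (2, 3):
    p = 0.3
    conjectured = p * (1 - p) ** m / (1 - (1 - p) ** m)
    print(m, visit_rate(m, p, 5000), conjectured)
```

For $m=2$ the empirical rate also agrees with the exact value $\frac{(1-p)^2}{2-p}$ derived above.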
I think the characteristic polynomial of $T$ is:
$$\lambda(\lambda-1)\frac{\lambda^{m}-(1-p)^m}{\lambda -(1-p)},$$ just from looking at $m=1,2,3,4.$
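This guess can be tested exactly over the rationals by evaluating both sides at sample values of $\lambda$; the Python sketch below (the determinant helper is mine) confirms it for $m=2,3,4$:

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    M = [row[:] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return d

def charpoly_at(m, p, lam):
    """det(lam*I - T) for the (m+1)-state chain above."""
    q = 1 - p
    T = [[Fraction(0)] * (m + 1) for _ in range(m + 1)]
    for i in range(m):
        T[i][0], T[i][i + 1] = p, q
    T[m][0], T[m][1] = p, q
    A = [[(lam if i == j else 0) - T[i][j] for j in range(m + 1)]
         for i in range(m + 1)]
    return det(A)

p, q = Fraction(1, 3), Fraction(2, 3)
for m in (2, 3, 4):
    for lam in map(Fraction, (2, 3, 5, -1)):
        # claimed factorization, with (lam^m - q^m)/(lam - q) expanded as a sum
        guess = lam * (lam - 1) * sum(lam**k * q**(m - 1 - k) for k in range(m))
        assert charpoly_at(m, p, lam) == guess
print("characteristic polynomial matches for m = 2, 3, 4")
```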
Attempt to understand $T$
Write $T=pA_0+(1-p)B$ with $$A_0=\begin{pmatrix}1\\1\\\vdots\\1\end{pmatrix}\begin{pmatrix}1&0&0&\cdots&0\end{pmatrix}\\B=\begin{pmatrix}0&1&0&\cdots&0\\0&0&1&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&0&\cdots&1\\ 0&1&0&\cdots&0 \end{pmatrix}$$
Then $$A_0B=A_1=\begin{pmatrix}1\\1\\\vdots\\1\end{pmatrix}\begin{pmatrix}0&1&0&\cdots&0\end{pmatrix}$$
For $i=0,\dots,m-1,$ $A_iB=A_{i+1}$ and $A_mB=A_1.$
Also $BA_i=A_i$ and $A_iA_j=A_j.$
So $$T^k=(pA_0+(1-p)B)^k=(1-p)^kB^k+\sum_{i=0}^{m} c(k,i)A_i$$ for some real $c(k,i).$ You get:
$$\begin{align} T^{k+1}&=(pA_0+(1-p)B)T^k\\&=p(1-p)^{k}A_0B^{k}+p\sum_i c(k,i)A_i+(1-p)^{k+1}B^{k+1}+(1-p)\sum_{i}c(k,i)A_i\\&=(1-p)^{k+1}B^{k+1}+p(1-p)^kA_0B^k+\sum_i c(k,i)A_i \end{align}$$
So since $A_0B^k=A_i$ where $i=1,2,\dots,m$ has $i\equiv k\pmod m,$ you get:
$$c(k+1,i)=\begin{cases}c(k,i)+p(1-p)^k&i\neq0, i\equiv k\pmod m\\c(k,i)&\text{otherwise}\end{cases}$$
Then $$(1,0,0,\dots,0)T^k(0,0,\dots,1)^T=(1-p)^{k}(1,0,\dots,0)B^k(0,0,\dots,1)^T+c(k,m)$$
You start with $c(1,0)=p$ and $c(1,i)=0$ for $i\neq 0.$ And $$(1,0,\dots,0)B^k(0,0,\dots,1)^T=\begin{cases}1&m\mid k\\0&\text{otherwise}\end{cases}$$
So the expected value is:
$$\sum_{j=1}^{\lfloor n/m\rfloor}(1-p)^{mj}+m\left(\sum_{j=0}^{\lfloor n/m\rfloor-1}c(mj+1,m)\right)+(n\bmod m)c(n,m)$$
And $$c(mj+1,m)=\sum_{i=1}^j p(1-p)^{mi}=\frac{p(1-p)^{m}\left(1-(1-p)^{mj}\right)}{1-(1-p)^m}$$
You get $$X_{m,n}=n\frac{p(1-p)^m}{1-(1-p)^m}+O(1).\tag1$$
From this we get my conjecture, with $q=1-p$ the probability of heads:
$$\frac{X_{m,n}}{n}\to\frac{1-q}{q^{-m}-1}$$
I think there is a constant $C,$ depending on $p$ and $m,$ such that more precisely:
$$X_{m,n}=n\frac{p(1-p)^m}{1-(1-p)^m}+C+o(1).$$
My brain is exhausted, so I'm sure I'd get it wrong if I computed $C.$ You can probably replace the $o(1)$ with the more specific $O\left((1-p)^n\right).$