[Math] Counting binary strings that have atmost k consecutive 0’s

binarybinomial-coefficientscombinatorics

I know how to count how many binary strings with length n and having exactly k 0's are there but i am not able to find a way to count the number of binary strings that have exactly x 0's and y 1's and uses at most k1 consecutive 0's and k2 consecutive 1's.

Here is how I approach I can enumurate through all possible binary string that uses exactly x 0's and y 1's and subtract from those that have more than k1 consecutive 0's and k2 consecutive 1's but how do I find binary strings that have more than k1 consecutive 0's and k2 consecutive 1's.

Any other approach to solve the same problem.

Best Answer

Note: You could model the situation with generating functions. We start with some preliminary considerations, introducing a symbolic method for combinatorial structures from which we can easily derive corresponding generating functions.

Problem: Let's consider binary strings containing $0s$ and $1s$. We are looking for the number of strings with length $n$, having at most $k_1$ consecutive $0$s and $k_2$ consecutive $1$s. We want to count the strings with $x$ $0$s and $y$ $1$s.

We build them starting with simpler patterns. Let $$\text{SEQ}(1)=\{\varepsilon,1,11,111,\ldots\}$$ denote all strings containing only $1$s with length $\geq0$. The empty string is denoted with $\varepsilon$. The corresponding generating function is $$w^0+w^1+w^2+\ldots=\frac{1}{1-w},$$ with the exponent of $w^n$ marking the length $n$ of a string of $1s$ and the coefficient of $w^n$ marking the number of strings of length $n$. Similarly, we define $\text{SEQ}(0)=\{\varepsilon,0,00,000,\ldots\}$.

Let $\text{SEQ}^{\leq k}(1)$ be the set of all strings of $1$s with length $\leq k$. The generating function for $\text{SEQ}^{\leq k}(1)$ is $$1+w+w^2+\ldots+w^{k}=\frac{1-w^{k+1}}{1-w}$$

Simplified problem: First we consider the simplified variant: $$k_1=k_2=1,$$ namely strings having runs of $0$s and $1$s of maximum length $1$.

How could we describe all these strings? We may have a leading $1$. So, we start with zero or one $1$s. Formally: $$\text{SEQ}^{\leq 1}(1)=\{\varepsilon,1\}$$ with generating function $$1+w$$

Then we may have zero or more groups, each group containing a $0$ followed by a $1$. Formally:

$$\text{SEQ}(01)=\{\varepsilon,01,0101,010101,\ldots\}$$

with generating function $$1+uw+(uw)^2\ldots=\frac{1}{1-uw}$$ Observe, that since we want to count $x$ $0$s and $y$ $1$s separately, we encode them using a bivariate generating function with $u$ indicating the amount of $0$s and $w$ indicating the amount of $1$s. The strings may finish with zero or one $0$, formally $\text{SEQ}^{\leq 1}(0)$ with generating function $1+u$.

We conclude: The binary strings having runs of $0$s and $1$s of maximum length $1$ are modelled by

\begin{align*} \text{SEQ}^{\leq 1}(1)\text{SEQ}(01)\text{SEQ}^{\leq 1}(0)\tag{1} \end{align*}

with generating function

\begin{align*} \frac{(1+u)(1+w)}{1-uw}\tag{2} \end{align*}


General Problem: Now we consider the symbolic expression for the general problem. Since we have to consider runs of at most $k_1$ $0$s and runs of at most $k_2$ $1s$ we have to substitute the outer occurrences $0$ and $1$ by up to $k_1$ $0$s resp. $k_2$ $1$s and the inner $0$s and $1$s enlarge by up to $k-1$ $0$s resp. $1$s. So, we get

\begin{align*} \text{SEQ}^{\leq k_2}(1)\text{SEQ}\left(0\text{SEQ}^{\leq k_1-1}(0)1\text{SEQ}^{\leq k_2-1}(1)\right)\text{SEQ}^{\leq k_1}(0)\tag{3} \end{align*}

with the corresponding bivariate generating function $G(u,w)$

\begin{align*} G(u,w)=\frac{\left(\frac{1-u^{k_1+1}}{1-u}\right)\left(\frac{1-w^{k_2+1}}{1-w}\right)} {1-\left(u\frac{1-u^{k_1}}{1-u}\right)\left(w\frac{1-w^{k_2}}{1-w}\right)}\tag{4} \end{align*}

In order to simplify calculations we introduce:

\begin{align*} H_k(z)=\frac{1-z^k}{1-z} \end{align*}

We use the coefficient of operator $[z^n]$ to denote the coefficient of $z^n$ of the corresponding generating function and so we can conclude:

Result: The number of $x$ $0$s and $y$ $1$s of strings of length $n=x+y$ containing runs of at most $k_1$ $0$s and at most $k_2$ $1$s is the coefficient of $[u^xw^y]$ in $G(u,w)$ with

\begin{align*} [u^xw^y]G(u,w)&=[u^xw^y]\frac{H_{k_1+1}(u)H_{k_2+1}(w)}{1-uwH_{k_1}(u)H_{k_2}(w)}\tag{5}\\ &=[u^xw^y]{H_{k_1+1}(u)H_{k_2+1}(w)}\\ &\qquad\qquad\cdot\sum_{n\geq 0}(uw)^n\left(H_{k_1}(u)\right)^n\left(H_{k_2}(w)\right)^n \end{align*}

The coefficients $[u^x]$ and $[w^y]$ can be calculated separately due to the (relatively) nice structure of $G(u,w)$. We find

\begin{align*} [u^x]u^n\left(H_{k_1}(u)\right)^n&=[u^x]u^n\left(\frac{1-u^{k_1}}{1-u}\right)^n\\ &=[u^{x-n}]\sum_{j=0}^{n}\binom{n}{j}(-u)^{k_1j}\sum_{l\geq 0}\binom{-n}{l}(-u)^l\\ &=[u^{x-n}]\sum_{j=0}^{n}\binom{n}{j}(-u)^{k_1j}\sum_{l\geq 0}\binom{n+l-1}{l}u^l\\ &=\sum_{l\geq 0}\binom{n+l-1}{l}[u^{x-n-l}]\sum_{j=0}^{n}\binom{n}{j}(-u)^{k_1j}\\ &=\sum_{j=0}^{n}(-1)^{k_1j}\binom{n}{j}\binom{n+(x-n-k_1j)-1}{x-n-k_1j}\\ &=\sum_{j=0}^{n}(-1)^{k_1j}\binom{n}{j}\binom{x-k_1j-1}{n-1} \end{align*}

Similar calculations can be used to find the other parts of (5).


Simplification: If we consider the simplified problem of counting the number of runs of length at most $k$ in a binary alphabet, we can use a univariate generating function based upon (3) and (4).

  • The symbolic equation (3) simplifies to

\begin{align*} \text{SEQ}^{\leq k}(1)\text{SEQ}\left(0\text{SEQ}^{\leq k-1}(0)1\text{SEQ}^{\leq k-1}(1)\right)\text{SEQ}^{\leq k}(0) \end{align*}

  • and instead of $G(u,w)$ in (4) we can write:

\begin{align*} \widetilde{G}(u)&=\frac{\left(\frac{1-u^{k+1}}{1-u}\right)^2}{1-\left(u\frac{1-u^{k}}{1-u}\right)^2}\\ &=\frac{1-u^{k+1}}{1-2u+u^{k+1}}\\ \end{align*}

Reference: The generating function $\widetilde{G}(u)$ is Example 1.10 in Analytic Combinatorics the classic from P. Flajolet and R. Sedgewick. The symbolic method is described there in section I.1 and I.2.