[Math] Number of bit strings that contain 01

discrete mathematicsrecurrence-relations

I understand this question has been answered before, but I'm looking for feedback on why my approach is incorrect. Consider the following:

How many bit strings of length n can be generated that contain 01? Write a recursive definition.

I started with $a_2 = 1$ because a length 2 bit string only has 1 string that contains 01… that string is simply 01. Then I noticed you could either tag a 1/0 on the beginning of it or a 1/0 on the end of it to get strings of length 3 that contain 01. This leaves you with 4 new bit strings of length 3. Then, each of these 4 you can either append a 1/0 to the beginning or 1/0 to the end. Resulting in 4 more for each of the 4 previously generated. This led me to believe the recurrence relation was:

$a_n = 4_{a_n-1}$. Because for each new string you generate that contains 01, you can generate 4 more. However the solution is:

$a_n = a_{n-1} + 2^{n-1} – 1$

Could someone please help me see where my reasoning is flawed?

Best Answer

Your basic idea can be made to work if you add bits only at one end instead of adding them at both ends. Say that a bit string is good if it contains the substring $01$ and bad otherwise. Suppose that $\sigma$ is a string of length $n-1$.

  • If $\sigma$ is good, we can append either a $0$ or a $1$ to get a good string of length $n$; this accounts for $2a_{n-1}$ good strings of length $n$.

  • If $\sigma$ is bad, the only way to get a good string by appending one more bit is for $\sigma$ to end in $0$ and to append a $1$; this gives us one good string for every bad string of length $n-1$ that ends in $0$. There are $2^{n-1}$ strings of length $n-1$, and $a_{n-1}$ of them are good, so $2^{n-1}-a_{n-1}$ of them are bad. Only one bad string of length $n-1$ ends in $1$, the string $$\underbrace{111\ldots111}_{n-1\text{ bits}}\;,$$ so there are $2^{n-1}-a_{n-1}-1$ bad strings of length $n-1$ that can be extended to a good string of length $n$.

The total number of good strings of length $n$ is therefore

$$a_n=2a_{n-1}+\left(2^{n-1}-a_{n-1}-1\right)=a_{n-1}+2^{n-1}-1\;,\tag{1}$$

exactly the recurrence in the solution.

You can actually check this rather easily by working out a closed form for $a_n$. The bad strings of length $n$ are precisely the ones in which all of the zeroes, if any, are after all of the ones, if any. If there are $0$ ones, we get the string of $n$ zeroes. If there is $1$ one, we get a $1$ followed by $n-1$ zeroes. And so on: if there are $k$ ones, we get a block of $k$ ones followed by a block of $n-k$ zeroes. There is just one such string for each $k$ from $0$ through $n$, so there are $n+1$ bad strings of length $n$. It follows that

$$a_n=2^n-(n+1)=2^n-n-1\;,\tag{2}$$

and it’s not hard to show by induction that $(2)$ satisfies the recurrence $(1)$:

$$\left(2^{n-1}-(n-1)-1\right)+2^{n-1}-1=2\cdot2^{n-1}-n-1=2^n-n-1\;.$$