Combinatorics – Find a Generating Function for the Number of Strings

combinatoricsgenerating-functions

The string $AAABBAAABB$ is a string of ten letters, each of which is $A$ or $B$, that does include the consecutive letters $ABBA$. Determine, with justification, the total number of strings of ten letters, each of which is $A$ or $B$, that do not include the consecutive letters $ABBA$.

The conventional way of using casework is too simply and gives the right answer indeed but I am interested in getting the generating function.

The generating function is

$$F(x) = \left(1 – 2x + \frac{x^4}{1 + x^3}\right)^{-1}$$

which the coefficient of $x^{10}$ does give the right answer, but I am not sure how to get there.

How do I start?

Best Answer

This answer is based upon the Goulden-Jackson Cluster Method which is a convenient method to derive a generating function for problems of this kind.

We consider the set of words in $ \mathcal{V}^{\star}$ of length $n\geq 0$ built from an alphabet $$\mathcal{V}=\{A,B\}$$ and the set $\mathcal{B}=\{ABBA\}$ of bad words, which are not allowed to be part of the words we are looking for.

We derive a function $F(x)$ with the coefficient of $x^n$ being the number of wanted words of length $n$ from the alphabet $\mathcal{V}$. According to the paper (p.7) the generating function $F(x)$ is \begin{align*} F(x)=\frac{1}{1-dx-\text{weight}(\mathcal{C})} \end{align*} with $d=|\mathcal{V}|=2$, the size of the alphabet and with the weight-numerator $\mathcal{C}$ with \begin{align*} \text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[ABBA]) \end{align*}

We calculate according to the paper \begin{align*} \text{weight}(\mathcal{C}[ABBA])&=-x^4-\text{weight}(\mathcal{C}[ABBA])x^3 \end{align*} and get \begin{align*} \text{weight}(\mathcal{C}[ABBA])=-\frac{x^4}{1+x^3} \end{align*}

It follows \begin{align*} F(x)&=\frac{1}{1-dx-\text{weight}(\mathcal{C})}\\ &=\left(1-2x+\frac{x^4}{1+x^3}\right)^{-1}\\ &=\frac{1+x^3}{1-2x+x^3-x^4}\\ &=1+2x+4x^2+8x^3+15x^4+28x^5+52x^6\\ &\qquad+97x^7+181x^8+338x^9+\color{blue}{631}x^{10}+O(x^{11}) \end{align*}

$$ $$

Addon 2016-05-22: Due to a comment of OP some details regarding the calculation of \begin{align*} \text{weight}(\mathcal{C}[ABBA])=-\frac{x^4}{1+x^3}\tag{1} \end{align*}

The main theme is to properly cope with overlaps of bad words. In order to do so we introduce some terminology. Let $w=w_1w_2\ldots w_n$ be a word consisting of $n$ characters. We define \begin{align*} weight(w)&:=x^{length(w)}=x^n\\ HEAD(w)&:=\{w_1,w_1w_2,\ldots,w_1w_2\ldots w_{n-1}\}\\ TAIL(w)&:=\{w_n,w_{n-1}w_n,\ldots,w_nw_{n-1}\ldots w_2\}\\ OVERLAP(u,v)&:=HEAD(u)\cap TAIL(v) \end{align*}

Example: $w=ABBA$

Since we have to consider only one bad word $ABBA$ we look at an example with $w=u=v=ABBA$.

We obtain \begin{align*} weight(ABBA)&=x^{length(ABBA)}=x^4\\ HEAD(ABBA)&=\{A,AB,ABB\}\\ TAIL(ABBA)&=\{A,BA,BBA\}\\ OVERLAP(ABBA,ABBA)&:=HEAD(ABBA)\cap TAIL(ABBA)=\{A\}\tag{2} \end{align*}

If $x\in HEAD(v)$ we write \begin{align*} v&=xx^\prime\\ v/x&:=x^\prime \end{align*} and we define for convenience $u:v$ as sum over all overlaps of two words $u$ and $v$. We sum up the weighted left-over parts $v/x$ when chopping overlaps. \begin{align*} u:v:=\sum_{x\in OVERLAP(u,v)}weight(v/x) \end{align*}

Note that according to (2) $$OVERLAP(ABBA,ABBA)=\{A\}$$ So, the calculation of $u:v$ with $u=v=ABBA$ results in \begin{align*} ABBA:ABBA&=\sum_{x\in \{A\}}weight(ABBA/x)\\ &=weight(ABBA/A)\\ &=weight(BBA)\\ &=x^{length(BBA)}\\ &=x^3 \end{align*}

Finally we find at the beginning of page 9 of the paper an identity for $weight(\mathcal{C}[v])$ with $Comp(v)$ defined as the set of bad words $u\in \mathcal{B}$ for which $OVERLAP(u,v)$ is non-empty. \begin{align*} weight(\mathcal{C}[v])=-weight(v)-\sum_{u\in Comp(v)}(u:v)\cdot weight\left(\mathcal{C}[u]\right) \end{align*}

We obtain with $v=ABBA$ \begin{align*} weight(\mathcal{C}[ABBA])&=-weight(ABBA)-\sum_{u\in Comp(ABBA)}(u:ABBA)\cdot weight\left(\mathcal{C}[u]\right)\\ &=-x^{length(ABBA)}-\sum_{u\in \{ABBA\}}(u:ABBA)\cdot weight\left(\mathcal{C}[u]\right)\\ &=-x^{length(ABBA)}-(ABBA:ABBA)\cdot weight\left(\mathcal{C}[ABBA]\right)\\ &=-x^4-x^3\cdot weight\left(\mathcal{C}[ABBA]\right)\tag{3}\\ \end{align*} and the claim (1) follows.

Note: Although the notation looks somewhat complicated. With some routine you can skip nearly all of the calculations. We can deduce (3) quickly by observing that there is just one bad word $ABBA$ resulting in a term $x^4$ and one overlap \begin{align*} A&BBA\\ ABBA&\\ \end{align*} resulting in a term $x^{length(BBA)}=x^3$.

Best Answer

Related Solutions

[Math] Generating Functions of Binary Strings

How many n strings are there of letters of english alphabet in which there are no consecutive z’s

Related Question