[Math] A CFG for the language $\{ O^i 1^j 2^k 3^l : i=k, j>1 , l>0 \}$: Is the approach correct

context-free-grammarformal-languages

I am trying to construct a CFG for the CFL $\{ O^i 1^j 2^k 3^l : i=k, j>1 , l>0 \}$.
When given $i=2$, $k=2$, $j=2$, and $l=1$, this yields the string $0011223$.

My approach:
$$\begin{align}
S &\Rightarrow A \\
A &\Rightarrow OAB \mid \varepsilon \\
B &\Rightarrow 1B1C \mid \varepsilon \\
C &\Rightarrow 2CD \mid \varepsilon \\
D &\Rightarrow 3D \mid \varepsilon
\end{align}$$

Please explain to me the thought-process associated when solving such problems, I keep on getting similar answers for all questions, and don't know how to properly tackle them. 🙁

Best Answer

Since the number of $\newcommand{\Ze}{\mathtt{0}}\newcommand{\On}{\mathtt{1}}\newcommand{\Tw}{\mathtt{2}}\newcommand{\Th}{\mathtt{3}}\Ze$s must equal the number of $\Tw$s, a good idea is to start basing your CFG around the usual CFG for the language $\{ \Ze^n \Tw^n : n \geq 0 \}$, namely $$S \rightarrow \Ze S \Tw \mid \varepsilon.$$ We see for the language in question that there are at least two $\On$s in between the $\Ze$s and $\Tw$s, and so we can alter the above rule as follows: $$\begin{align*} S &\rightarrow \Ze S \Tw \mid T \\ T &\rightarrow \On T \mid \On \On \end{align*}$$ Next we have to worry about the $\Th$s. Since we only need a positive number of them at the end of the string with no relationship necessary to any of the other symbols, we can introduce a new start symbol that will just split off into the part of the grammar we have introduced so far, and a part that will generate the $\Th$s: $$\begin{align} S_0 &\rightarrow S U \\ S &\rightarrow \Ze S \Tw \mid T \\ T &\rightarrow \On T \mid \On \On \\ U &\rightarrow \Th U \mid \Th \end{align}$$ And we're done.


Among other strings, your grammar will generate the following:

  1. $\varepsilon$ (generated by $S \Rightarrow A \Rightarrow \varepsilon$);
  2. $\Ze \Ze \Ze$ (generated by $S \Rightarrow A \Rightarrow \Ze A B \Rightarrow \Ze \Ze A B B \Rightarrow \Ze \Ze \Ze A B B B \Rightarrow \Ze \Ze \Ze \varepsilon \varepsilon \varepsilon \varepsilon = \Ze \Ze \Ze$);

which are not in the language. Upon inspection, I think your grammar generates the language $$\{ \Ze^m ( \On^n ( \On \Tw^* \Th^* )^n )^m : n , m \geq 0 \}$$

Related Question