[Math] Find context-free grammar from language expression

context-free-grammarregular expressionsregular-language

I'm not even sure where to begin with these questions. How would I go about these step-by-step, finding CFGs for the below languages?

$$\begin{align}a^ib^j &| i \neq j-1 \\
a^ib^jc^k &| i = j \text{ or } j \neq k \\
a^ib^jc^k &| k = i + j \\
a^iww^Rb^i &| i > 0, w \in \{a, b\}^* \\
a^ib^jc^k &| i+2j=k\end{align}$$

I know it is in the form of:

$$ S \to \text{…} \\
T \to \text{…} \\
Q \to \text{…} $$

Where $S$ is an initial symbol, and $T$ and $Q$ are potential productions, but not much else. Something like:

$$S \to aSa | \varepsilon$$

Means a string in the language contains an even number of a. Such as aa, aaaa, aaaaaa, etc? I'm not sure how to convert an expression to a CFG.

Best Answer

I'm not going to do them all for you, as I think these are valuable exercises. I will do the first and the third.

For the first one, I would split it into two cases: $i < j - 1$ or $i > j - 1$. So, my first two rules would be, \begin{align*} S &\to L \\ S &\to M \end{align*} standing for "Less" and "More" respectively. For the "Less" case, note that this is equivalent to $j \ge i + 2$. We therefore need there to be at least two $b$s. Then, we want a number of $a$s, which are simultaneously matched by additional $b$s, and finally, we can finish by adding as many extra $b$s as we want. The following rules capture this: \begin{align*} L &\to L'bb & \text{Add the mandatory two $b$s} \\ L' &\to aL'b & \text{Add as many $a$s as extra $b$s} \\ L' &\to T & \text{Once you've finished adding $a$s, move on}\\ T &\to Tb & \text{Add more $b$s as desired, but no more $a$s} \\ T &\to \varepsilon & \text{Terminate} \end{align*} Similarly, for the "More" case, this boils down to $i \ge j$. We get a similar setup: \begin{align*} M &\to aMb & \text{Add as many $a$s as $b$s} \\ M &\to V & \text{Once you've finished adding $a$s and $b$s, move on}\\ V &\to aV & \text{Add more $a$s as desired, but no more $b$s} \\ V &\to \varepsilon & \text{Terminate} \end{align*} In total, we can even optimise it a bit by removing the $L'$ state. See if you can see how to do that.


For the second one, we can start by adding as many $a$s as we want, so long as we match it with $c$s. Then we can add as many $b$s as we want, again so long as we match it with $c$s. The grammar for this looks like this: \begin{align*} S &\to aSc & \text{Add in $a$s matched by the same number of $c$s} \\ S &\to T & \text{Move on when ready} \\ T &\to bTc & \text{Add in $b$s matched by the same number of $c$s}\\ T &\to \varepsilon & \text{Terminate} \end{align*}

Related Question