[Math] context free grammar to deterministic finite-state automaton

automatacontext-free-grammar

I have context free grammar G=(V={S,A}, ∑={a,b}, R, S) with R:

$\qquad\begin{align}
S &\to aT \mid Sb \mid a \\
T &\to aS \mid Tb \\
\end{align}$

Which I tried to convert to a deterministic finite-state automaton:M=(Q={S,T}, ∑={a,b}, 𝛿, q=S, F={S}. So I came up with the following DFA:

Grammar --> DFA

With the idea that S --> aT in G is equal to 𝛿(S,a)=T in M. However my Automaton seems to be wrong, because I can make a parsing tree from aaabbb, but in my automaton it ends up in T, which means it is not accepted. So I guess the following rule is NOT correct: S --> Sb equal to 𝛿(S,b)=S.

How do I construct a correct DFA for the context free grammar G?

Edit 1: With the hint of @Brian M. Scott I got the following DFA:

Adjusted G --> DFA

So that it only accepts uneven a at the beginning and b>=0 at the end. Also no b between aa or a between bb is accepted (hence garbage node).

Edit 2: Added a, b transitions to the garbage State.

Best Answer

HINT: Show that $G$ generates the language $\{a^{2m+1}b^n:m,n\ge 0\}$, and design a corresponding automaton. You're almost there: you have the wrong acceptor state, and you need to modify the $b$-transition at $S$ to handle correctly inputs that start with $b$ and inputs that have an $a$ after an acceptable input.

Added: By request I’m adding an outline of the analysis showing that $G$ generates the language $L=\{a^{2m+1}b^n:m,n\ge 0\}$. The first thing to notice is that in any terminal derivation the last step must apply the production $S\to a$: that’s the only one that does not produce a non-terminal symbol. Next, the other productions do not change the number of non-terminal symbols; since we start with one, we always have exactly one until we apply that final $S\to a$. The third important observation is that every $a$ except the last one produced is produced to the left of the non-terminal, while every $b$ is produced to its right. The first and third observations together show that every word generated by $G$ has the form $a^mb^n$ for some $m\ge 1$ and $n\ge 0$.

We next observe that there is no restriction on $n$: the productions $S\to Sb$ and $T\to Tb$ allow us to pump out any number of $b$s. However, we can only apply that final $S\to a$ if we have the non-terminal $S$, which means that for each application of $S\to aT$ we must have an application of $T\to aS$ to get back to $S$. That is, all of the $a$s in the final word except the one generated by $S\to a$ come in pairs, so altogether we must end up with an odd number of $a$s. We can have any number of pairs, including $0$, so the possible values of $m$ are the odd integers, the numbers $2k+1$ for $k\ge 0$.

Related Question