[Math] Finding a grammar for given language

computer sciencecontext-free-grammardiscrete mathematicsformal-languages

So for this problem we are given a language and we have to find the grammar for that set. I am confused and what the constructors should be.

The language in this problem is: $\{bb, bab, baab, baaab,…\}$ = $\{ba^nb | n \in \mathbb{N} \}$

For some, I know that one of the constructors involves the empty string, like $S \rightarrow \Lambda$, but that is usually when the empty string is included in the language (at least in the examples the professor showed us). So like $\{ \Lambda, bb, bab,…\}$. I looked in the back of the book and some of the other problems didn't have an empty string as one of their constructors.

I was thinking it's something like:

$S \rightarrow \Lambda$ | $bSb$ | $aS$

So that: $S => bSb => b\Lambda b => bb$ or $S => bSb => baSb => baaSb => baa\Lambda b => baab$

This gives me the language I am looking for somewhat, but what if $S => \Lambda$ right from the start? Then that messes me up since $\Lambda$ is not included in the language. Any idea on how to fix this?

Best Answer

Your suggested grammar has an additional problem besides generating the empty string: it generates $bbbb,bbbbbb,bbabb$, and infinitely many other words that you don’t want.

Note that the language is regular: it corresponds to the regular expression $ba^*b$. Thus, we should be able to come up with a regular grammar for it. Clearly the first terminal symbol that this grammar generates should be a $b$, and after that we want to be able to generate an arbitrary string of $a$s. That suggests a production $S\to bA$, where the non-terminal $A$ will be capable of generating a string of $a$s followed by a single $b$. Clearly we need $A\to b$ in order to be able to end a string correctly. The grammar needs just one more production in order to generate the right language; can you see what it is?