[Math] Identify inherently ambiguous languages

automatacomputer sciencecontext-free-grammardiscrete mathematicsformal-languages

Which of the following languages is/are inherently ambiguous languages?

$L_1=\{a^nb^nc^m|m,n\geq0\}\cup\{a^nc^c|n\geq0\}$

$L_2=\{a^nb^nc^m|m,n\geq0\}\cup\{c^mb^na^n|m,n\geq0\}$


My attempt:

A language that only admits ambiguous grammars is called an inherently ambiguous language, and there are inherently ambiguous context-free languages.

Well known inherently ambiguous languages are :

$M_1=\{a^nb^nc^md^m|m,n\geq0\}\cup\{a^nb^mc^md^n|m,n\geq0\}$

$M_2=\{0^i1^j2^k|i=j\space or \space j=k\}$

$M_3= \{a^n b^m c^p: n\neq m\} \cup \{a^n b^m c^p: m\neq p\}$

$L_2$ and $M_3$ seems matching but second part is different.

Can you explain inherently ambiguous languages, please?

Best Answer

When you have two context-free languages and grammars generating them, there is an obvious way to create a grammar that generates their union. If the given grammars are non-ambiguous and the languages are disjoint, the new grammar will also be non-ambiguous.

All your languages are given as unions (the or-condition for $M_2$ is essentially the same as a union) of (more or less obviously) non-ambiguous languages that that are not disjoint. Let's look at their intersections.

In the case of the well known ambiguous languages all the intersections are languages that are not context-free, for example in the case of $M_1$ the intersection consists of the words of the form $a^nb^nc^nd^n$. Of course, that alone doesn't prove that the language is ambiguous, but it shows that it can't be easy to construct a non-ambiguous grammar from grammars for the two sublanguages.

That is not the case for your languages $L_1$ and $L_2$. (There must be a typo in the definition of $L_1$ when there is $c^c$ in the second part of the union, but it doesn't seem ambiguous to me however that is fixed. Anyway, let's better take $L_2$ as example.) The intersection of the sublanguages of $L_2$ contains all the words $c^m$. It is easy to change the definition of one of the sublanguages to no longer contain these words, for example, change the condition $n\geq0$ to $n\geq1$ in the first language. Now you have the same language as a disjoint union of non-abiguous languages, so it is non-ambiguous itself.

You might want to prove that it is really the same language, and that there really is a non-ambiguous grammar, but this is the essential idea.

I hope you can see now that a match in one part of the union is completely irrelevant. Important is how the sublanguages interact, in this case, how they intersect.