Find the equivalence classes of a formal language complement

equivalence-relationsformal-languagesregular expressions

Let $L=\{\sum^*-(\{\epsilon, a,b\}\cup \{bba^i|i\ge 0\})\}$ be a language over $\sum=\{a,b,c\}$. Find the equivalence classes of relation $R_L$ which is defined as follows: $xR_Ly \iff \forall z\in \sum^*:xz\in L \iff yz \in L$.

We learned a theorem that $R_L=R_{L'}$ where $L'$ is the complement of $L$. So to find $R_L$ of $\{\sum^*-(\{\epsilon, a,b\}\cup \{bba^i|i\ge 0\})\}$ is equivalent to find $\{\epsilon, a,b\}\cup \{bba^i|i\ge 0\}$.

$\{\epsilon, a,b\}\cup \{bba^i|i\ge 0\}=\{\epsilon, a,b,bba^*\}$. So there're $4$ equivalence classes:
$$
S_1=\epsilon\\
S_2=a\\S_2=b\\S=bba^*
$$
.

However the solution to this problem mentions the fifth class:
$$
c\Sigma^*+a\Sigma^++b(a+c)\Sigma^*+bb\Sigma^*(b+c)\Sigma^*
$$

I don't understand what this class represents. It seems that it actually represents the words in $L'$. But if so why are we adding it as these words are not in $L$?

Best Answer

It is easy to show that the Nerode equivalence classes are the same for a language $L$ and its complement $L'$. $$ xR_Ly \iff \left(\forall z\in \sum^*:xz\in L \iff yz \in L\right) \\ xR_{L'}y \iff \left(\forall z\in \sum^*:xz\in L' \iff yz \in L'\right) $$ Since the strings in $L'$ are exactly the strings not in $L$, then $$ xz\in L \iff xz\not\in L' \\ \left(xz\in L \iff yz \in L\right) \iff \left(xz\not\in L' \iff yz \not\in L'\right) \iff \left(xz\in L' \iff yz \in L'\right) $$ Therfore, $$ \left(\forall z\in \sum^*:xz\in L \iff yz \in L\right) \iff \left(\forall z\in \sum^*:xz\in L' \iff yz \in L'\right) \\ xR_Ly \iff xR_{L'}y $$


You seem to have forgotten the equivalences classes that contains all strings that are not accepted. However, I would strongly caution you not to try to find the equivalence classes the way that you have.

One reasons is, there may be multiple equivalence classes of strings that are not accepted (for example, this question). These will not always be obvious and you may not be able to guess them from the definition.

Another reason is, the accepting equivalence classes that you read off may not be separate equivalence classes. For instance, what if we changed your language just a bit: the language generated by $\{\epsilon,a,b,bba^*,aba^*\}$. You might similarly claim that the accepting equivalence classes are $$ S_1=\{\epsilon\} \\ S_2=\{a\} \\ S_3=\{b\} \\ S_4=\{bba^*\} \\ S_5=\{aba^*\} \\ $$ However, this is not true. The strings $a$ and $b$ must be in the same equivalence class, since they cannot be seperated by any $z\in\Sigma^*$.

For these reasons, I would highly recommend that you find the minimal DFA for a language to find the equivalence classes. Your way happened to work (although you missed the last equivalence class), but for many other languages it will give you a very wrong answer.

Related Question