[Math] Deciding equivalence of regular languages

automata-theorycomputability-theorycomputer scienceformal-languageslo.logic

Given two regular expressions $R$ and $S$ on an alphabet $\Sigma$ it is possible to decide their equivalence as follows:

  1. build two finite automata $M_R$ and $M_S$ such that $L(R) = L(M_R)$ and $L(S) = L(M_S)$
  2. build an automaton $M$ such that $L(M) = (L(M_R) – L(M_S)) \cup (L(M_S) – L(M_R))$
  3. test emptyness of $L(M)$ using a reachability algorithm on $M$

I was wondering if there is another way to decide equivalence. Suppose $M_R$ and $M_S$ are the minimal DFA (without epsilon-moves) such that $L(R) = L(M_R)$ and $L(S) = L(M_S)$. If they have a different number of states, then $R$ and $S$ are not equivalent. Otherwise let $m$ be the number of states of the two automata. Is it true that $L(M_R) = L(M_S)$ iff ${x \in L(M_R) : |x| \leq m +1 } = {x \in L(M_S) : |x| \leq m +1 }$?
How to prove that with the Myhill-Nerode theorem?

Best Answer

If $M_R$ and $M_S$ have $m,n$ states respectively, then one has that $L(M_R)=L(M_S)$ iff they have the same words of length at most $mn-1$. This is essentially the content of your algorithm. Suppose that they are different and let $w$ be a minimal length word accepted by one of the machines and not the other. If $w$ has length greater than $mn-1$, then when you run $w$ from the initial state of $M_R\times M_S$, you will get a loop. This loop will give you a factorization $w=xuy$ where $u$ reads a loop in both $M_R$ and $M_S$. So then $xy$ will be accepted in one of the machines and not the other and have smaller length.

I don't think your proposed bound would work but I have to think a little to get an example.

Added. I believe this is a counter example. Let m be an integer. Consider over a unary alphabet the languages $R=\lbrace a^n\mid n\not\equiv m-2 \pmod m\rbrace$ and $S=\lbrace 1,a,..,a^{m-3}\rbrace\cup \lbrace a^n\mid n\geq m-1\rbrace$. Then both of these are recognized by an $m$-state automaton (I believe both are minimal) and the shortest word in one, but not the other, is $a^{2m-2}$, which has length $2m-2$. I hope this works.

Related Question