Regular expression for all strings with even length and not containing “ab” and “ba”

computer scienceregular expressionsregular-language

I have a problem with this question:

we have a language with alphabet {a, b, c}, all strings in this language have even length and does not contain any substring "ab" and "ba" for example these strings acceptable: "accb", "aa", "bb", "bcac", and these strings not acceptable "ccab", "abca", "cabc" and so on.. actually I think this is not regular language but I can't prove it, anyone can help me by give a regular expression for this language or prove that this language is not regular? or any help that I can figure out how I must think about it.

i tryed (aa)(cc)(bb)+(bb)(cc)(aa)+(ac+ca)+(bc+cb) and ((a+c)(a+c))+((b+c)(b+c))+(ac+bc)(ca+cb)+(ca+cb)(ac+bc) but this not work

Best Answer

We need $7$ non-terminals. $S$ is the initial state, $T,\ U,$ and $V$ all indicate that an odd number of characters have been read, and that the last character read was $a$, $b$, or $c$, respectively. Similarly, $X,\ Y$, and $Z$ indicate that an even number of characters have been read, and that the last character read was $a$, $b$, or $c$, respectively. I use $\lambda$ for the empty string. $S,\ X,\ Y$, and $Z$ are accepting states, so we have the productions $$S\to\lambda\\ X\to\lambda\\ Y\to\lambda\\ Z\to\lambda\\ $$ Also we, have $$S\to aT\\S\to bU\\S\to cV\\T\to aX\\T\to cZ\\U\to bY\\U\to cZ\\V\to aX\\V\to bY\\V\to cZ\\$$

I trust you see how I arrived at these. If we get a $b$ in state $T$ or an $a$ in state $U$ then we've seen a forbidden substring, so there are no productions corresponding to these possibilities.

It remains to determine the productions with $X,\ Y$ or $z$ on the left-hand side. I leave it to you to supply those.

EDIT Reading this over, I see that I've slipped into the language of automata at times. I hope it's obvious what I mean. If not, ping me.

EDIT

When I wrote this, I though you were mainly concerned with whether or not the language is regular, but it's apparent from your comments that you're mostly concerned with getting a regex. Remember that the complement of a regular language is regular. So we want the complement of the union of

-set of strings containing $ab$
-set of strings containing $ba$
-set of strings with an odd number of characters

So, if you can make regexes for these three languages, you can combine them to construct the regex you seek.

EDIT

The previous edit is wrong. The discussion prior to theorem $4.5$ in Hopcroft, Motwani and Ullman says that you can't do it that way. You have to convert the regex to a DFA, construct a DFA that recognizes the complement, and then convert the new DFA to a regex. I think any regex for this language will be exceedingly long and complicated.