[Math] Regular expression for matching strings without both bba and abb

formal-languagesregular expressions

What is a regular expression that will match all words without both bba and abb? (If a word has only one of them, it's fine.)

I tried (a+ba)*(bb+b+^) + b*. I'd like to know if this is the right answer.

Best Answer

This is not correct - for example, you do not allow any strings starting with bba, while there are many valid ones (eg bba itself).

One possible way of achieving this is to construct a DFA, then convert that to a regular expression, but that is quite a complicated process. In this case, there is a simpler method just by considering the options.

If the word doesn't have two consecutive bs, then it is obviously valid. This means all bs, except possibly the first, are preceded by an a. There are a few ways to write this, eg:

b(a+ab)* + (a+ab)*

If the word does have consecutive bs, then it cannot have an a both before and after. In other words, there can only be one 'long' string of bs, and it must come at the very start or the very end. So it is either:

  • all bs:
b*
  • a string of bs, followed by a, followed by anything with no consecutive bs
b*a(b(a+ab)* + (a+ab)*)
  • the reverse of the above
(b(a+ab)* + (a+ab)*)ab*

So one complete answer is:

b(a+ab)* + (a+ab)* + b* + b*a(b(a+ab)* + (a+ab)*) + (b(a+ab)* + (a+ab)*)ab*

There of course may be nicer and shorter answers!

Related Question