[Math] Are those regular expressions are correct

formal-languagesregular expressions

Let Σ = {a, b}. Construct a regular expression defining each of the following languages over Σ.

(a) The language of all words that do not begin with ba

Answer: (a + b + ab + aa) a* b*

(b) The language of all words in which the total number of b’s is divisible by 3 nomatter how they are distributed, such as bbabbaabab

Answer: (a* ba* ba* ba*)*

Best Answer

For the first problem, anything that starts with an a, or just b, or anything that starts with a b then not an a:

a(a + b)* + b + bb(a + b)*


In a systematic way:

If we expand all possible strings, making the first two symbols explicit, we have

a + b + aa(a + b)* + ab(a + b)* + ba(a + b)* + bb(a + b)*

From these, we discard those beginning with ba, leaving

a + b + aa(a + b)* + ab(a + b)* + bb(a + b)*

Optionally we can compress the three terms beginning in a by

a + aa(a + b)* + ab(a + b)* = a + a(a + b)(a + b)* = a(a + b)*

and a final grouping is possible

a(a + b)* + b + bb(a + b)* = (a + bb)(a + b)* + b

The solution in the OP isn't valid as it does allow ba (choice b in the parenthesis, then a* once).


For the second problem, let us consider all strings with multiples of three b's

(bbb)*

and let us insert abritrary strings of a wherever possible: we get

a* (a* b a* b a* b a*)* a*

This solution is valid. Anyway, it leaves multiples possibilities for the decomposition of the initial and final strings of a, because of sequences a*a*. These can be avoided by the mergings a*a* = a*, which give the unambiguous

a* (b a* b a* b a*)*

The solution in the OP isn't valid as is does not represent the strings without any b, though they should be accepted.

Related Question