[Math] Find a regular expression for binary strings to have odd non-empty blocks of 1s

regular expressionsregular-language

Write down a Regular Expression for the language $L$ consisting of all binary strings where every non-empty block of $1$s has odd length. (Notice that the empty string is in this language.)

My working out:

$$(111)^* 011$$

This way, there will always be an odd number of $1$s.

Best Answer

Let's look at your regular expression:

$$ (111)^* 011 \enspace. $$

It generates the strings $011$, $111011$, $111111011$, and so on. None of these words is such that every nonempty block of $1$s has odd length. Moreover, the total number of $1$s alternates between even and odd, contrary to your claim. Even if we had guaranteed blocks of $1$s of odd lengths, as in

$$ 1(11)^*01(11)^* \enspace, $$

our goal is not to describe some of the words in $L$, but all of them. This is not entirely trivial in this case; therefore let's take another, more systematic approach.

The strings in our language may start with any number of $0$s. So, we are going to use $0^*$ as a building block. A nonempty string of $1$s of odd length is given by $1(11)^*$. That's another building block for our regular expression. From $1(11)^*$ we get $1$, $111$, $11111$, and so on.

Finally, in between blocks of $1$s we can have any positive number of $0$s. That is, $00^*$. Let's now put these building blocks together:

$$ 0^* (1(11)^* 00^*)^* (1(11)^* + \epsilon) \enspace. $$

It's not the simplest regular expression out there, but we should be able to recognize the building blocks. We start with an optional block of $0$s. Then we have zero or more blocks of $1$s of odd lengths, each followed by one or more zeros. Finally, we have an optional block of $1$s (of odd length) not followed by $0$s to account for the fact that a word in $L$ may not have trailing $0$s.

This is not the only regular expression for $L$, as is often the case. If you have already seen Kleene's Theorem, you should know of the systematic procedure alluded to by John Hughes in his comment. For this $L$ it produces a regular expression of complexity comparable to the one above.