[Math] Understanding Regular Expression (0 U 1)*

automataregular expressionsregular-language

This question concerns regular expressions as they appear within Automata Theory.

I am experiencing confusion with some of the notation, relating to the union operation and the way it interacts with the Kleene closure.

More specifically I am confused about what it means when we say (0 U 1)*

Can this be interpreted as equivalent to (0* U 1*) ?

I believe both of these expressions can be expressed in English as "the set of all strings of any length consisting exclusively of 0s or 1s, including the empty string".

Or, in set notation {epsilon, 0, 00, 000, 1, 11, 111…}

I would greatly appreciate someone helping me clarify this!!

Best Answer

Welcome to MSE! Here $(0 \cup 1)^*$, often written as $(0 | 1)^*$, is actually the set of all bitstrings. Let's break it down. Inside of the parentheses is the expression $0 | 1$, meaning anything that that is a zero or a one is allowed (i.e. everything is allowed). The asterisk for Kleene closure tells you that you can repeat it as many times as you wish. So in a sentence, $(0 | 1)^*$ represents the set of strings where each bit is a zero or one, and you can have as many such bits as possible. Hopefully it's now clear why this is the set of all bitstrings.

On the other hand, $(0^* \cup 1^*) = (0^* | 1^*)$ is far from all bitstrings. This expression says that the bitstrings must be either of the form $0^*$ (which is any bitstring with only zeroes) or of the form $1^*$ (which is any bitstring with only ones). This expression does not accept bitstrings with both a zero and a one, and as such is not the same as the $(0 | 1)^*$.

Related Question