[Math] Can anyone explain why language concatenation works like this

regular expressions

For a language S, an empty string $\epsilon$, and an empty language {},

S $\cdot$ {$\epsilon$} = S = {$\epsilon$} $\cdot$ S

but

S $\cdot$ {} = {} = {} $\cdot$ S

(where $\cdot$ denotes concatenation)


I understand why concatenating a language with another language that has the empty string in it acts like an identity, but why is concatenating the empty language analogous to multiplying by zero?

Is it because the set of strings that result from concatenating something with nothing is nothing? (as opposed to concatenating with an empty string, which I suppose is an empty something?)

Best Answer

Your idea of concatenating something with nothing is on the right track. The reason $A\cdot\{\}=\{\}$ is because the process of language concatenation (as opposed to string concatenation) is defined in such a way as to require that any string of the resulting language is constructed from strings of both original languages. This question is much the same as yours.

The situation here is analogous to the Cartesian product of two sets, which is defined as:

$$A\times B=\{(a,b)\;|\;a\in A,\;b\in B\}$$

which effectively says: the set of every ordered pair that can be formed from a first element in $A$ and a second element in $B$. Now it should be obvious from this definition that a Cartesian product $A\times\emptyset$ (where $\emptyset$ is the empty set) will just be the empty set since there are no $b\in\emptyset$ to choose in order to form the $(a,b)$. It doesn't matter how many elements $A$ has, if there are no elements in $B$ then we cannot form any ordered pairs whose second elements are from $B$. This is further explained more mathematically here.

The relevance of this to your question is as follows: Language concatenation is defined in much the same way as the Cartesian product:

$$A\cdot B=\{ab\;|\;a\in A,\;b\in B\}$$

in which $ab$ is a string formed from concatenating a string from $A$ with a string from $B$. We can argue as follows: when trying to construct $A\cdot\{\}$ we have no strings $b\in\{\}$ to use in order to form $ab$, so we cannot form any such $ab$, and hence there will be no strings in the resulting language, and hence $A\cdot\{\}=\{\}$. If there are no strings to concatenate with, there will be no strings in the concatenated language.

This is a very different idea to concatenating $A\cdot\{\epsilon\}$, in which case it is easy to see that:

$$A\cdot\{\epsilon\}=\{ab\;|\;a\in A,\;b\in \{\epsilon\}\}=\{a\epsilon\;|\;a\in A\}=A$$

which is why concatenating with $\{\epsilon\}$ is an identity operation. In this case there is a string in the language we are concatenating with to choose (namely the empty string).

Related Question