Formal language Concatenation is a binary operation

formal-languages

In formal Language theory does the concatenation of two strings can be seen as a binary operation?
If so, what are the domain and codomain?

You might think that if given $\mathcal{A}$ alphabet then $W = \{x : x \text{ is a string} \}$ then $* : W \times W \longrightarrow W$ is the concatenation. The problem is that how do you build $W$?

From the book I'm reading $W = \bigcup_{i =1}^{\infty} W_i$ with $W_0 = \{\lambda\}$ and $W_i = \{ x : |x| = i \}$ with $|x|$ the length if $x$.

$W$ is built as the union of all the sets that has strings of any lenght.

Then can we see $|.|$ as a function? It is clearly that the codomain is $\mathbb{N}$. But what is the domain? $W$?!!

I guess the question is whether is the concatenation $*$ or the length $|.|$, are they maps, functions, morphisms or not?? what are the domain and codomain of both?

Best Answer

Given an alphabet $A$ you can construct the set of words of length $n$ as $W_n = A^n$, the cartesian product of $n$ copies of $A$ with itself. Then the set of (finite) words

$$W = \bigsqcup_{n \in \mathbb{N}} W_n = \bigsqcup_{n \in \mathbb{N}} A^n$$

can be constructed as the disjoint union of these (so here for us $\mathbb{N}$ includes zero). Concatenation is a binary operation $W \times W \to W$ on $W$ built out of taking the cartesian product, and length is a function $W \to \mathbb{N}$ defined by the property that it takes the value $n$ on $W_n$.

(This exhibits $W$ as the free monoid on $A$.)

I'm not quite sure what you're confused about but I think it's that the statements you've written down, although true, are circular as a definition of the set of words, since $W_n$ is defined in terms of the set of words already. The above is a non-circular definition in set theory.

Edit: You might also wonder whether we need to say "disjoint union" above or whether we can just say "union." The issue is that depending on your precise construction of the cartesian product you may need to check whether or not $A^n$ and $A^m$ for $n \neq m$ are disjoint as sets. Even if that happened to be true it would be a silly and irrelevant implementation detail, it's not "supposed" to happen. Using the disjoint union lets us ignore this; the result does not meaningfully depend on the precise construction of the cartesian product, in the sense that given any two such constructions the corresponding versions of $W$ are naturally in bijection with each other.