Category Theory – Correct Construction of Coproduct in Monoid

category-theorycoproductmonoid

The following question is taken from Arrows, Structures and Functors the categorical imperative by Arbib and Manes.

$\color{Green}{Background:}$

$\textbf{(1)}$ $\textbf{(Definition 1)}$ A $\textbf{monoid}$ is a set $X$ equipped with a function $\cdot:X\times X\to X$ (with monoid $\textbf{multiplication}$) and a distinguished element (the monoid $\textbf{identity}$) subject to the two laws:

$x\cdot (y\cdot z)=(x \cdot y)\cdot z$ for all $x,y,z$

$x\cdot e=x=e\cdot x$ for all $x.$

We abbreviate $x\cdot y$ to $xy.$

$\textbf{(1)}$ $\textbf{Definition 1:}$ If $X,Y$ are monoids, a function $f:X\to Y$ is a monoid $\textbf{homomorphism}$ if $f$ preserves the monoid structure of multiplication and identities, i.e., if $f(x\cdot x')=f(x)\cdot f(x')$ and $f(e)=e.$

The $\textbf{identity function}$ $\text{id}_X:X\to X$ is surely a monoid homomorphism and the usual $\textbf{composite}$ $g\cdot f: X\to X$ of two functions is a monoid homomorphism if $f:X\to Y$ and $g:Y\to Z.$

$\textbf{(2)}$ $\textbf{Example:}$ from the $\textit{theory of computation:}$

Consider a set $A$ (the 'alphabet') and let $X$ be the set of all $\textit{word on the alphabet A,}$ i.e., $X$ is the set of all strings $a_1,\ldots,a_n$ with each $a_i\in A.$ We do not exclude the case $n=0;$ there is one such string, call it $\wedge,$ the 'empty string'. Define $(a_1\ldots a_n)\cdot (b_1\ldots b_m)=(a_1\ldots a_n, b_1\ldots b_m)$ and define $e=\wedge.$ With this structure, $X$ becomes a monoid. $X$ is denoted by $A^{*},$ and is called $\textbf{free monoid on the set}$ $A$ $\textbf{of generators.}$

$\textbf{(3)}$ $\textbf{Definition 3:}$ $\textit{Coproducts in}$ $\textbf{Mon}$ or $\textbf{Grp}$

The one element monoid/group $\{e\}$ is clearly initial object-that is, the empty coproduct-in both categories.

Let now $\{X_i\mid i\in I\}$ be a family of monoids with $I$ non-empty. Let $A$ (for 'alphabet') be the disjoint union of the sets $X_i.$ Let $W$ be the set $A*$ of all words on the alphabet $A.$ A word $(a_1,\ldots,a_n)$ is $\textit{reduced}$ if for all $j<n, i_j\neq i_{j+1}$ and for all $j\leq, x_j\neq e_j$ (where $a_j=(x_j,i_j)$ and $e_j$ is the unit of $X_{i_j}).$ If a word is not reduced it has a $\textit{reduction}$ obtained by continuing to apply the following rules:

$\textbf{(a)}$ Multiply adjacent pairs with the same $i-$term:

$(a_1,\ldots,a_j=(x_j,i),a_{j+1}=(x_{j+1},i),\ldots,a_n)\mapsto (a_1,\ldots,\hat{a}_j=(x_jx_{j+1},i),a_{j+2},\ldots,a_n)$

$\textbf{(b)}$ Delete identity terms:

$(a_1,\ldots,a_j=(e_j,i_j),\ldots a_n)\mapsto (a_1,\ldots,a_{j-1},a_{j+1},\ldots,a_n)$

until the word is reduced. It is obvious that such a process has a most $n$ steps and results in a reduced word. The monoid axioms guarantee that the end result is independent of the way the rules are applied (why?). Let $C$ be the set of all reduced words in $W.$ Define a multiplication on $C$ by

$$(a_1,\ldots,a_n)(b_1,\ldots,b_m)=\text{reduction of }(a_1,\ldots,a_n,b_1,\ldots,b_m).$$

Then $C$ is a monoid with unit $\wedge.$ Define injection homomorphisms

$$\text{in}_i:X_i\to C \quad x_i\mapsto\text{reduction of }(x_i,i)$$

These $\textit{are}$ monoid homomorphism by the reduction rules. To check that $(\text{in}_i:X_i\to C\mid i\in I)$ is indeed a coproduct for the monoids $X_i$ we must check that whenever $f_i:X_i\to D$ are monoid homomorphisms, the diagram

enter image description here

commutes for a unique monoid homomorphism $f.$ But for $f$ to be a homomorphism it clearly must satisfy $f(\wedge)=e,$ while, for $n>0, f(a_1,\ldots,a_n)=f_{i_1}(x_1)\ldots f_{i_n}(x_n)$ where $a_j=(x_j,i_j).$ But this determines $f$ as a function. The proof that this $f$ is a monoid homomorphism is left as a routine exercise.

$\color{Red}{Questions:}$

I have some questions about how one goes about constructing the coproduct for monoid, specifically my questions have to do with:

$\color{blue}{1)}$ elements in the disjoint union of sets,

$\color{blue}{2)}$ how the word reduction rule works in particular instance,

$\color{blue}{3)}$ Misprint in the definition of injection homomorphism

$\color{blue}{4)}$ what the maps $\text{in}_i, f_i$ and $f$ look like for particular values of $i$ and $j$

For $\color{blue}{1)}$ In the beginning second paragraph of $\textbf{(3)}$ $\textbf{Definition 3:}$ where it says:

"Let now $\{X_i\mid i\in I\}$ be a family of monoids with $I$ non-empty. Let $A$ (for 'alphabet') be the disjoint union of the sets $X_i.$"

There is the multiplication rule after applying word reduction:

$(a_1,\ldots,a_n)(b_1,\ldots,b_m)=\text{reduction of }(a_1,\ldots,a_n,b_1,\ldots,b_m).$

Does that mean: $(a_1,\ldots,a_n)\in X_1$ and $(b_1,\ldots,b_m)\in X_2,$ $(a_1,\ldots,a_n)(b_1,\ldots,b_m)\in X_1\sqcup X_2,$ and $\text{reduction of }(a_1,\ldots,a_n,b_1,\ldots,b_m)\in C?$

For $\color{blue}{2)}$ with the rules for word reduction, with as an example, $I=\{1,2\},$ $i=1,$ $n=10$, and $j=5,$ for $\textbf{(3)}$$\textbf{(a)}$

$(a_1,\ldots,a_5=(x_5,1),a_6=(x_6,1),\ldots,a_{10})\mapsto (a_1,\ldots,\hat{a}_5=(x_5x_6,1),a_7,\ldots,a_{10})$

with $i=1,2,$ $n=10,$ and $j=8.$ for $\textbf{(3)}$$\textbf{(b)}$

$(a_1,\ldots,a_8=(e_8,1_8),\ldots a_{10})\mapsto (a_1,\ldots,a_7,a_9,\ldots,a_{10})$

for $\color{blue}{3)}$ In $\textbf{(3)},$ Where the injection homomorphisms is defined:

"$\text{in}_i:X_i\to C \quad x_i\mapsto\text{reduction of }(x_i,i)$"

$\color{purple}{I}$ $\color{purple}{think}$ $\color{purple}{there}$ $\color{purple}{might}$ $\color{purple}{be}$ $\color{purple}{a}$ $\color{purple}{misprint.}$ I think it should be

$$\text{in}_i:X_i\to C \quad x_j\mapsto\text{reduction of }(x_j,i_j),$$

because the $i$ in $\text{in}_i, (x_j,i_j)$ refers to which $X_i$ the $(a_1,a_2,\ldots,a_n)$ belongs to, and for the $j$s in $x_j, a_j=(x_j,i_j)$ refers to the specific $a_j, a_j=x_j-$th terms where $1\leq j \leq n.$

For $\color{blue}{4)}$ as a specific example with $i=1,2,$ $n,m=10,$ and as above for $\textbf{(3)}$ $\textbf{(a)}$ $j=5,$ for multiplying adjacent terms and $\textbf{(3)}$$\textbf{(b)}$ $j=8$ for deletion of terms:

$(a_1,\ldots,a_{10})(b_1,\ldots, b_{10})\mapsto (a_1,\ldots,\hat{a}_5=(x_5x_6,1),a_7,a_9,a_{10})(b_1,\ldots,\hat{b}_5=(x_5x_6,1),b_7,b_9,b_{10})=(a_1,\ldots,\hat{a}_5,a_7,a_9,a_{10},b_1,\ldots,\hat{b}_5,b_7,b_9,b_{10})$

For the injection maps, which is also a homomorphism.

$\text{in}_i:X_i\to C,$ $C=X_1\sqcup X_2,$

for $i=1,j=5$ for $\textbf{(3)},$ $\textbf{(a)},$ $\text{in}_1:X_1\to X_1\sqcup X_2:\text{in}_1(a_1,\ldots, a_{10})=(a_1,\ldots,\hat{a}_5,a_7,a_9,a_{10})=((x_1,1_1),\ldots,(x_5,1_5),(x_7,1_7),(x_9,1_9),(x_{10},1_{10})),$ $a_j,(x_j,1_j)$ are both elements of $X_1$

and for

$i=2,$ for $x'_j\in X_2,$ $j=8$ for $\textbf{(3)}$ $\textbf{(b)},$ $\text{in}_2:X_2\to X_1\sqcup X_2:\text{in}_2(b_1,\ldots, b_{10})=(b_1,\ldots,\hat{b}_5,b_7,b_9,b_{10})=((x'_1,2_1),\ldots,(x'_5,2_5),(x'_7,2_7),(x'_9,2_9),(x'_{10},2_{10})),$ $b_j,(x'_j,2_j)$ are both elements of $X_2$

Then the homomorphic maps $f,$ $f\circ \text{in}_i=f_{i_j}$ defined in terms of elements are as follows:

with $i\in I,$ $f(a_1,a_2,\ldots,a_n)=f(a_1)f(a_2)\cdots f(a_n)=f_{i_1}(x_1)f_{i_2}(x_2)\cdots f_{i_n}(x_n)$

and for $I=\{1,2\},$

$f((a_1,a_2,\ldots,a_n)(b_1,b_2,\ldots,b_m))=(f(a_1)f(a_2)\cdots f(a_n), f(b_1)f(b_2)\cdots f(b_m))=(f_{1_1}(a_1)f_{1_2}(a_2)\cdots f_{1_n}(a_n), f_{2_1}(b_1)f_{2_2}(b_2)\cdots f_{2_m}(b_m))$

and for $i=1,n=10,j=5$ for $\textbf{(3)},$ $\textbf{(a)},$ and $j=8$ for $\textbf{(3)}$ $\textbf{(b)},$

$(f\circ \text{in}_1)(a_1,\ldots, a_{10})=f(\text{in}_1(a_1,\ldots, a_{10}))=f_{1_1}(a_1)\ldots f_{1_5}(\hat{a}_5)f_{1_7}(a_7)f_{1_9}(a_9)f_{1_{10}}(a_{10})$

but, and so,

$(a_1,\ldots,\hat{a}_5,a_7,a_9,a_{10})=((x_1,1_1),\ldots,(x_5,1_5),(x_7,1_7),(x_9,1_9),(x_{10},1_{10})),$ $a_j=(x_j,i_j),$

$f(a_1)\ldots f(\hat{a}_5)f(a_7)f(a_9)f(a_{10})=f(a_1)\ldots f(\hat{a}_5)f(a_7)f(a_9)f(a_{10})=f_{1_1}(x_1)\ldots f_{1_5}(\hat{x}_5)f_{1_7}(x_7)f_{1_9}(x_9)f_{1_{10}}(x_{10}))$

Finally, for $I=\{1,2\},n,m=10,$ $j=5$ for $\textbf{(3)},$ $\textbf{(a)},$ and $j=8$ for $\textbf{(3)}$ $\textbf{(b)},$ we finally have:

$(f\circ \text{in}_i)((a_1,a_2,\cdots,a_{10})(b_1,b_2,\cdots,b_{10}))=(f\circ \text{in}_1(a_1,a_2,\cdots,a_{10})f\circ \text{in}_2(b_1,b_2,\cdots,b_{10}))=(f(a_1,\ldots,\hat{a}_5,a_7,a_9,a_{10}),f(b_1,\ldots,\hat{b}_5,b_7,b_9,b_{10})=(f_{1_1}(x_1)\ldots f_{1_5}(\hat{x}_5)f_{1_7}(x_7)f_{1_9}(x_9)f_{1_{10}}(x_{10})
,f_{2_1}(x'_1)\ldots f_{1_5}(\hat{x}'_5)f_{1_7}(x'_7)f_{1_9}(x'_9)f_{1_{10}}(x'_{10})
)$

Thank you in advance

Best Answer

Let's start by fixing some notation: I'll call $X \sqcup Y$ the disjoint union of sets (which is an instance of a categorical coproduct, in the category $\mathbf{Set}$ of sets and functions). I'll use $A + B$ for coproducts in a general category, e.g. the category $\mathbf{Mon}$ of monoids or the category $\mathbf{Grp}$ of groups. So, it'd make perfect sense to write something like

In the category $\mathbf{Set}$ of sets, the coproduct is given by the disjoint union: $X + Y := X \sqcup Y$.

Definition of coproducts of monoids

Now, assume you're given two monoids $\mathbf{A} = (A, \cdot, u)$ and $\mathbf{B} = (B, \times, e)$. We call $A$ the carrier of $\mathbf{A}$ (and similarly for $B$ and $\mathbf{B}$). Before going through the construction of their coproduct $\mathbf{A} + \mathbf{B}$, it might be instructive to see why its carrier cannot be $A \sqcup B$.

What goes wrong with plain disjoint unions?

For this to be the case, we'd need to define two things:

  • An operation $\bullet : (A \sqcup B) \times (A \sqcup B) \to (A \sqcup B)$
  • An element $1 \in A \sqcup B$

Let's start looking at the latter. What should we choose? There are two natural candidates: $(u, 1)$ the identity from $\mathbf{A}$ and $(e, 2)$ the identity from $\mathbf{B}$. Let's put a pin on this problem: we'll get back to it in a second.

We'd also need to define $x \bullet y$ for any two elements $x, y \in A \sqcup B$. What should it be? There are four possible cases:

  1. $x = (x_1, 1)$ and $y = (y_1, 1)$;
  2. $x = (x_2, 2)$ and $y = (y_2, 2)$;
  3. $x = (x_1, 1)$ and $y = (y_2, 2)$;
  4. $x = (x_2, 2)$ and $y = (y_1, 1)$.

It's clear what $x \bullet y$ should be in the first two cases: $(x_1 \cdot y_1, 1)$ and $(x_2 \times y_2, 2)$ respectively. It's also immediate to see that case 1 requires the identity element to be $(u, 1)$, while case 2 requires the identity to be $(e, 2)$. But the identity cannot be both at the same time!

Another problem arises for cases 3 and 4: what should we pick? Remember, we don't just want to chose any value $x \bullet y$ that makes $A \sqcup B$ a monoid: we're trying to define a coproduct, and coproducts satisfy a universal property! Intuitively, we'd want to make the most general definition, and there doesn't seem to be a reasonable choice.

What we'd want to do is to expand the set $A \sqcup B$, so that $x \bullet y$ is a new element whenever we're in case 3 or 4, but have it be what we described above in cases 1 and 2. And since we're at it, we'd want $(u, 1)$ and $(e, 2)$ to be the same element, which will then be the identity element.

The correct construction

We do this by following the recipe you describe: take $(A \sqcup B)^*$ to be the set of words whose letters are either elements of $A$ or of $B$, and then take $C \subset (A \sqcup B)^*$ to be the set of reduced such words. The claim is that we can define $\bullet$ and $1$, so that $(C, \bullet, 1)$ is a monoid, and moreover it satisfies the universal property of coproducts: $\mathbf{A} + \mathbf{B} = (C, \bullet, 1)$.

The first thing to notice is that now, niehter $[(u, 1)]$ nor $[(e, 2)]$ are reduced; the reduction of both is the empty word $[]$. So it would make sense to define $1 := []$, which is exactly what we do.

Then, we can also notice that $(A \sqcup B)^*$ is already a monoid, but it doesn't take into account the monoid structures of $\mathbf{A}$ or $\mathbf{B}$: we just take concatenation of words. Not only that: $1$ is the identity with respect to such an operation!

So we might guess that to define $x \bullet y$ for $x, y \in C$, we first want to concatenate them (obtaining an element $x y \in (A \sqcup B)^*$ which might fail to be reduced) and then reduce it to an element of $C$.

A concrete example

Here I'll drop the indices, since it will (hopefully!) be clear where each element comes from.

Before proving this is indeed a coproduct, let's look at a concrete example: let's pick $\mathbf{A} := (\mathbb{Z}, +, 0)$ the integers under addition, and $\mathbf{B} := (\{a, b\}^*, \cdot, \epsilon)$ the set of words on two letters, under concatenation (writing $\epsilon$ for the empty word). How do the reduction rules and operation on $\mathbf{A} + \mathbf{B}$ work?

Let's look an element which is not reduced: $x = [ab, 12, \epsilon, -7, ba]$. What is the corresponding element in $\mathbf{A} + \mathbf{B}$? First we remove the empty word, since it is the identity in $\mathbf{B}$, obtaining $x\prime = [ab, 12, -7, ba]$; then we notice that two adjacent elements both come from $\mathbf{A}$, so we perform the operation on them obtaining $x^{\prime \prime} = [ab, 5, ba]$. We can now see that what we got is reduced: it is indeed an element of $\mathbf{A} + \mathbf{B}$.

Now, let's look at $\bullet$. Fix two reduced elements, say $x = [ab, 5]$ and $y = [-5, ba, -2]$. To compute $x \bullet y$ we first concatenate them obtaining $xy = [ab, 5, -5, ba, -2]$ and then reduce it to $x \bullet y = [abba, -2]$ (by first adding 5 and -5 and obtaining 0, then removing it, then concatenating $ab$ and $ba$ obtaining $abba$).

This is indeed a coproduct!

Let's fix monoids $\mathbf{A}, \mathbf{B}$. To show that our construction is indeed a coproduct, what we need to show is that for any monoid $\mathbf{C}$, giving monoid homomorphisms $f : \mathbf{A} \to \mathbf{C}$, $g : \mathbf{B} \to \mathbf{C}$ is the same thing as giving a monoid homomorphism $h : \mathbf{A} + \mathbf{B} \to \mathbf{C}$. Let's do this:

  • $f, g$ determine $h$: given an element $x = [x_1, \dots, x_n]$ which is not necessarily reduced, we can define $h(x)$ as follows. First, since $h$ is to be a monoid homomorphism, we'd want $h(x) = h([x_1]) \cdot \dots \cdot h([x_n])$, so we only need to define $h(x_i)$. But we know that either $x_i = (a_i, 1)$ or $x_i = (b_i, 2)$: we can then define $h(x_i)$ to be $f(a_i)$ if we're in the former case, and $g(b_i)$ in the latter.

Notice that at this point, we still don't know that $h$ is a monoid homomorphism (and we haven't used the fact that $f, g$ are monoid homomorphisms). Formally we'd need to work with the reduction rules, but here I'll just sketch the relevant argument (and you can assemble it into a rigorous proof). Consider $x = [(x^\prime, 1)]$ and $y = [(y^\prime, 1)]$: is it the case that $h(x) \cdot h(y) = h(x \bullet y)$? Well, we can unwind our definition. $h(x) \cdot h(y) = f(x^\prime) \cdot f(y^\prime) = f(x^\prime \cdot y^\prime) = h([(x^\prime \cdot y^\prime, 1)]) = h(x \bullet y)$ (assuming $x^\prime \cdot y^\prime$ is not the identity, of course; what happens in that case?). The same applies for $g$.

  • $h$ determines $f, g$: for this we can just define inclusions $in_1 : \mathbf{A} \to \mathbf{A} + \mathbf{B}$ by setting $in_1(a) = [(a, 1)]$ if $a$ is not the identity, and $[]$ if it is. The fact this is a monoid homomorphism follows from the reduction rules (we can just compute $in_1(a_1) \bullet in_1(a_2) = [(a_1, 1)] \bullet [(a_2, 1)] = [(a_1 \cdot a_2, 1)] = in_1(a_1 \cdot a_2)$), and we can define $f = h \circ in_1$. A similar argument involving $in_2(b) := [(b, 2)]$ applies for $g$.

Questions

We're now ready to answer your questions.

  1. The answer to this one is no. That is, if $a_1, \dots, a_n \in A$ this doesn't mean $[a_1, \dots, a_n] \in A$. Their product is, but the sequence isn't! On the other hand, the last part of your question is correct: the reduction of $[a_1, \dots, a_n, b_1, \dots, b_m]$ is an element of carrier of $\mathbf{A} + \mathbf{B}$.
  2. The way you apply the reduction rules seems ok to me.
  3. I don't think there's a misprint, though the notation might be slightly confusing: a less confusing way to write it might be $in_i : X_i \to C$ sending $x \mapsto in_i(x) :=$ reduction of $[(x, i)]$. In other words, it takes one element $x \in X_i$ to the one-element, reduced sequence with just $x$ marked by the index of the monoid $X_i$ it comes from (unless $x$ is an identity, in which case it gets sent to the empty sequence). I'll stress one thing though: $X_i$'s elements are not sequences (as per my answer to question 1)!
  4. I'm not really sure what you're doing there, but I'll have to stress the fact that $C$ is not $X_1 \sqcup X_2$: it's a subset of the free monoid on the disjoint union: $C \subseteq (X_1 \sqcup X_2)^*$.

Some general remarks

You seem to confuse monoids with free monoids: elements of a monoid are not sequences in general, that's only the case if the monoid is free over some set (what you call the set of words over some alphabet).

In the case of free monoids, by the way, your intuition seems to be kind of correct: fix (disjoint) alphabets $A$ and $B$, the coproduct of the free monoids $A^* + B^*$ is the free monoid over the disjoint union $(A \sqcup B)^*$.

Another thing I'd like to remark is that in the mathematical literature, people will usually define the carrier of the coproduct of monoids in a slightly different way. Fix two monoids $\mathbf{A}$, $\mathbf{B}$; take the free monoid over the disjoint union of the carriers $(A \sqcup B)^*$ and then define an equivalence relation $\sim$ on it: two words are related iff they reduce to the same word. It's then relatively painless to show that this relation interacts nicely with the monoid structure (that is, it is a congruence), so we can take the quotient monoid, and then prove that it satisfies the universal property of coproducts.

What we're doing here is to fix a canonical representative for each equivalence class (the element which is already reduced) and presenting this quotient as a subset instead.

Related Question