Universal properties for kernels and cokernels

abstract-algebra category-theory

I'm trying to build some intuition about the universal properties for kernels and cokernels by considering such familiar categories as $\mathbf{Set}$ or $\mathbf{Ab}$ (all while going through Aluffi's "Algebra: Chapter 0", so I don't have much categorical intuition yet).

Let's fix $\varphi : M \rightarrow N$ and start with $\ker \varphi$. Injectivity of the inclusion $\iota : \ker \varphi \rightarrow M$ is an obvious requirement: otherwise we could've taken the kernel to be the whole $M$, so it guarantees that the kernel is "smallest possible" in some sense. But what about the uniqueness of $\sigma : K \rightarrow \ker \varphi$ for any $\alpha : K \rightarrow M$ such that $\varphi \alpha = 0$? What would be a good example of an "almost-kernel" in sets or abelian groups, one through which such $\alpha$s factor, but not uniquely?

And, more importantly, how could one come up with such a universal property at all, knowing only what a kernel is in, for instance, sets or abelian groups? I kinda understand that it describes how much freedom $\alpha$ has while still satisfying $\varphi \alpha = 0$, but I'm still not sure how to come up with the corresponding universal property even with this understanding.

Correspondingly, for $\text{coker} \varphi$ the surjectivity of the projection $\pi : N \rightarrow \text{coker} \varphi$ again ensures that the cokernel is "smallest possible" (otherwise we could've taken the whole $N$). And, again, the less surjective $\varphi$ is, the more freedom it leaves to $\alpha$ while still having $\alpha \varphi = 0$ (set-theoretically speaking, it's the whole of $N \setminus \text{im} \varphi $ that's up to $\alpha$), so the more of $N$'s structure needs to be kept in the cokernel to define $\alpha$. But, again, how does this imply the specific diagram and universal property for cokernels?

And please let me know if any sentences that don't end with a question mark don't make sense — I'd like to know if any intuition I've already built is correct.

Best Answer

I'd like to start with your first question, about an "almost kernel". The following construction carries over to the case of an "almost cokernel" without much modification.

Suppose $a:A\to M$ is an "almost kernel" and $k$ is the kernel map. By definition of an almost kernel, $\varphi \circ a=0$ and there exists a map (not necessarily unique) $i:\text{Ker }\varphi \to A$ making the following diagram commute: $$\require{AMScd} \begin{CD} @. \text{Ker } \varphi @>i>> A @.\\ @. @| @VVaV \\ 0 @>>> \text{Ker } \varphi @>k>> M @>\varphi>> N \end{CD}$$ On the other hand, by the universal property of the kernel there exists a unique map $r:A\to \text{Ker }\varphi$ making the following diagram commute: $$\require{AMScd} \begin{CD} A @>r>> \text{Ker } \varphi @.\\ @| @VVkV \\ A @>a>> M @>\varphi>> N \end{CD}$$ We obtain another commutative diagram: $$\require{AMScd} \begin{CD} @. \text{Ker } \varphi @>r\circ i>> \text{Ker }\varphi @.\\ @. @| @VVkV \\ 0 @>>> \text{Ker } \varphi @>k>> M @>\varphi>> N \end{CD}$$ Now, by the universal property of the kernel again, $r\circ i=\text{id}_{\text{Ker}\varphi}$, since the identity also makes the above diagram commute. You can check that the existence of such $r,i$ is, in fact, a necessary and sufficient condition for $a$ to be an almost kernel. One can think of $i$ as an inclusion; then $r$ is called a retraction of $A$ onto $\text{Ker }\varphi$.
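
Written out as a single chain, the last step is the following computation. It uses only the two relations read off from the first two diagrams, $a\circ i=k$ and $k\circ r=a$:

$$k\circ(r\circ i)=(k\circ r)\circ i=a\circ i=k=k\circ\text{id}_{\text{Ker }\varphi}.$$

So $r\circ i$ and $\text{id}_{\text{Ker }\varphi}$ are both factorizations of $k$ through $k$, and the uniqueness clause of the universal property forces them to be equal.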

Such examples come from direct sums, since the composition of the projection from $M\oplus N$ onto $M$ and the obvious inclusion $M\to M\oplus N$ is the identity on $M$. In this case, you can take $\varphi: \mathbb{Z}\to 0$, then the kernel is $\text{id}:\mathbb{Z}\to \mathbb{Z}$ and an almost kernel is $\text{pr}_1: \mathbb{Z}\oplus \mathbb{Z}\to \mathbb{Z}, (x,y)\mapsto x$. It is easy to see that $r=\text{pr}_1$ and a choice for $i:\mathbb{Z}\to \mathbb{Z}\oplus \mathbb{Z}$ is $x\mapsto (x,0)$. Another choice is $i':x\mapsto (x,x)$.
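
If it helps to see this example concretely, here is a small Python sketch that checks the relations on a finite sample of integers. The function names (`k`, `a`, `r`, `i`, `i_prime`, `check`) and the sample range are my own illustrative choices, and a finite check is of course only a sanity test, not a proof.

```python
from itertools import product

# Finite sample standing in for Z (an assumption made for illustration only).
SAMPLES = range(-5, 6)

def k(x):
    """Kernel map of phi: Z -> 0, i.e. the identity on Z."""
    return x

def a(p):
    """The "almost kernel" map pr_1: Z (+) Z -> Z, (x, y) -> x."""
    return p[0]

def r(p):
    """The unique map A -> Ker(phi) with k . r = a; here it is pr_1 again."""
    return a(p)

def i(x):
    """One factorization of k through a: x -> (x, 0)."""
    return (x, 0)

def i_prime(x):
    """A different factorization of k through a: x -> (x, x)."""
    return (x, x)

def check():
    for x in SAMPLES:
        # Both i and i_prime satisfy a . i = k ...
        assert a(i(x)) == k(x)
        assert a(i_prime(x)) == k(x)
        # ... and r composed with either one is the identity on Ker(phi).
        assert r(i(x)) == x
        assert r(i_prime(x)) == x
    for p in product(SAMPLES, repeat=2):
        # r really is a factorization of a through k.
        assert k(r(p)) == a(p)
    # The two factorizations differ somewhere, so uniqueness fails for a.
    return any(i(x) != i_prime(x) for x in SAMPLES)
```

Calling `check()` runs all the assertions and reports whether `i` and `i_prime` differ on the sample, which is exactly the failure of uniqueness that separates an almost kernel from a kernel.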

In the above argument, the use of the universal property is essential. The main purpose of a universal property is to obtain the uniqueness of the kernel (or cokernel). Suppose that there are two kernels $k:K\to M$ and $k':K'\to M$ of $\varphi:M\to N$; then an argument similar to the one presented above shows that there exist unique $i:K\to K'$ and $j:K'\to K$ such that $k'=k\circ j,k=k'\circ i$ and $i\circ j=\text{id}_{K'},j\circ i=\text{id}_K$. The last two equations say that $K$ and $K'$ are isomorphic, and the first two say that the isomorphisms are compatible with the maps to $M$, which is the sense in which they are natural.
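
For the record, that argument can be sketched in two chains, using only $k'=k\circ j$ and $k=k'\circ i$:

$$k\circ(j\circ i)=(k\circ j)\circ i=k'\circ i=k=k\circ\text{id}_K,\qquad k'\circ(i\circ j)=(k'\circ i)\circ j=k\circ j=k'=k'\circ\text{id}_{K'}.$$

Uniqueness of factorizations through $k$ gives $j\circ i=\text{id}_K$, and uniqueness of factorizations through $k'$ gives $i\circ j=\text{id}_{K'}$.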

The second advantage of a universal property, which may not be very clear at the moment, is that it provides morphisms. This is the case with the tensor product, and you will use it frequently when dealing with diagrams ("diagram chasing" is the name of this method). Further, the notions of kernel and cokernel are important because they are the building blocks of abelian categories. In an abelian category, we can do homological algebra, a ubiquitous computational tool of algebraists.

I think what I wrote above will give you some grasp of how universality works. From your question, I think you have the same problem I had when I first learned category theory. The definition of a kernel suggests that there are many kernels, but in every application of the universal property they are treated as if there were only one. From the categorical viewpoint, two objects defined by the same universal property are isomorphic via a unique isomorphism, so they can be treated as identical. When there is a kernel, its universal property allows one to care only about the unique factorization. This is quite troublesome at first, but it is indeed an advantage of category theory.