[Math] the difference between Categories and Relations

category-theory, relations

For a common basis, I'll state basic definitions of a category and of the kind of relation I have in mind. They're here for quick clarity, not precision, so feel free to revise them in an answer.

Category:
A collection of objects, a collection of arrows each with a source and target among those objects, an identity arrow on each object, and a composition operator on arrows satisfying an associative law and having the identity arrows as units.

Reflexive relation on a set $A$:
A collection of objects (the elements of $A$), a labelled multiset of pairs of those objects (i.e., of elements of $A \times A$) satisfying the reflexive law, and the standard relation-composition operation (which is associative).
Note that labelled multiset relations are commonly expressed and used in discrete math as multigraphs. I collapse graphs and relations here for a unified axiomatic treatment, so that, e.g., paths over graphs are not treated as something different from relation composition.
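For concreteness, here is a minimal sketch of the composition I mean, using plain sets of pairs rather than labelled multisets (the helper names `compose` and `is_reflexive` are just illustrative):

```python
# A minimal sketch of relation composition on a small set, assuming the usual
# definition: (a, c) is in compose(R, S) iff there is some b with (a, b) in R
# and (b, c) in S.  Labels/multiplicities are ignored here for simplicity.

def compose(R, S):
    """Compose two relations given as sets of pairs."""
    return {(a, c) for (a, b1) in R for (b2, c) in S if b1 == b2}

def is_reflexive(R, A):
    """Check that every element of A is related to itself."""
    return all((a, a) in R for a in A)

A = {1, 2, 3}
R = {(a, a) for a in A} | {(1, 2)}      # reflexive, plus one extra pair
S = {(a, a) for a in A} | {(2, 3)}
T = {(a, a) for a in A} | {(3, 1)}

assert is_reflexive(R, A)
# Relation composition satisfies the associative law:
assert compose(compose(R, S), T) == compose(R, compose(S, T))
```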

What, exactly, separates these two? They appear essentially identical to me in definition and meaning. If anything, relations appear to be the more general notion, since categories seem to correspond only to a restricted class of relations.
I note some notational differences, such as how categories name each arrow, but I don't see how that would change mathematical power; similarly with the explicit treatment of composition. While such devices can come in handy for certain proofs or explorations, I don't see how they justify treating the two as separate branches rather than as syntactic shims over identical concepts.

[EDITS: fixed associativity statement, extended relations to multi-set representation with graph analogy]

Best Answer

Qiaochu is right: categories are much more general. Let me provide a concrete example which, while in a sense trivial, may illustrate just how much more you could get from a category than you can from a reflexive and transitive relation.

Consider the positive integers $\mathbb P$. The usual total order $\leqslant$ is a perfectly good reflexive, transitive relation on this set. We can think of it as a category consisting of the identity map $\mathrm{id}_n$ on each integer $n \in \mathbb P$, together with all composites of the successor maps $\sigma_n: n \to n+1$. The result is essentially an almost acyclic directed graph (the only circuits are stationary walks at a single vertex).
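As a rough sketch (the helper names `arrow` and `compose` below are just illustrative), we can model this category by taking the arrow $n \to m$ to be nothing more than the pair $(n, m)$ with $n \leqslant m$:

```python
# A small sketch of the category generated by <= on the positive integers:
# there is exactly one arrow n -> m whenever n <= m, and every such arrow
# factors as a composite of successor arrows sigma_n : n -> n+1.

def arrow(n, m):
    """The unique arrow n -> m, if it exists."""
    assert n <= m, "no arrow from n to m unless n <= m"
    return (n, m)

def compose(g, f):
    """Compose f : a -> b with g : b -> c to get the arrow a -> c."""
    (a, b1), (b2, c) = f, g
    assert b1 == b2, "arrows are not composable"
    return (a, c)

# The arrow 2 -> 5 is the composite of successor arrows 2 -> 3 -> 4 -> 5:
f = arrow(2, 5)
h = arrow(2, 3)
for k in range(3, 5):
    h = compose(arrow(k, k + 1), h)
assert h == f

# The arrows n -> n act as identities for composition:
assert compose(arrow(5, 5), f) == f and compose(f, arrow(2, 2)) == f
```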

Now let me describe a different category on the positive integers: the category of injective linear maps between finite-dimensional real vector spaces. That is, the category whose objects are the positive integers, and whose arrows $n \to m$ are the $m \times n$ matrices over $\mathbb R$ of rank $n$ (so that $m \geqslant n$).

An aside. "What!?" you may exclaim; "How can you talk about matrices having domains and codomains consisting of positive integers?" Well, the shape of each matrix defines what sorts of other matrices it can be composed with, and any action of a matrix M on a vector v ∈ $\mathbb R^n$ could be viewed as composing the m × n matrix M with the n × 1 matrix v to obtain an m × 1 matrix Mv. The integers stand in for the dimension of the vector space $\mathbb R^n$ or $\mathbb R^m$ which are the "real" domain and codomain of the matrix; and because we can substitute the linear action of operators on $\mathbb R^n$ with the effect of composing two linear operators (one of which happens to have a domain of $\mathbb R^1$), we may ignore the structure inside of the vector spaces and refer to them just by their dimensions.
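Here is a minimal numerical sketch of that aside, with matrices chosen purely for illustration: composing an arrow $1 \to 2$ (a $2 \times 1$ column) with an arrow $2 \to 3$ (a $3 \times 2$ matrix) is exactly the usual action of the matrix on the vector.

```python
import numpy as np

# Treating a matrix's shape as its source and target, the action of a matrix
# on a vector is just composition of arrows.

M = np.array([[1., 0.],
              [0., 1.],
              [2., 3.]])   # a 3 x 2 matrix of rank 2: an arrow 2 -> 3
v = np.array([[5.],
              [7.]])       # a 2 x 1 matrix: an arrow 1 -> 2

Mv = M @ v                 # the composite arrow 1 -> 2 -> 3: a 3 x 1 matrix
assert Mv.shape == (3, 1)

# Composition is only defined when source and target match up:
# v @ M would be a (2 x 1) times (3 x 2) product, which numpy rejects.
```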

As with the directed graph generated by the total order $\leqslant$, there exist arrows $n \to m$ whenever $m \geqslant n$ (by definition). All of the maps $n \to m$ for $m > n+1$ can be obtained by composing maps $n \to n+1$ and $n+1 \to m$; and furthermore, all maps $n \to n+1$ can be obtained by composing a single designated map $\sigma_n: n \to n+1$ with suitable maps $n \to n$ and $n+1 \to n+1$. So this category has a number of things in common with the one defined by the total order $\leqslant$. But there are important differences, owing in part to the fact that there are infinitely many maps $n \to n$ for each $n$. Because of this, this second category has much more structure than a binary relation.
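To make the generation claim concrete, here is a small numerical sketch, assuming (purely for illustration) that each designated map $\sigma_n$ is the standard inclusion that pads a vector with a zero coordinate:

```python
import numpy as np

def sigma(n):
    """An (n+1) x n matrix of rank n: the arrow n -> n+1 appending a zero coordinate."""
    return np.vstack([np.eye(n), np.zeros((1, n))])

# Composing sigma_3 after sigma_2 gives an injective map 2 -> 4:
F = sigma(3) @ sigma(2)
assert F.shape == (4, 2) and np.linalg.matrix_rank(F) == 2

# Unlike the order category, there are infinitely many arrows n -> n:
# every invertible 2 x 2 matrix is one, e.g. any rotation.
for theta in (0.1, 0.2, 0.3):
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    assert np.linalg.matrix_rank(R) == 2
```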

For instance, it becomes meaningful and interesting to talk about coproducts.

Definition. Let $A, B$ be two objects. A coproduct of $A$ and $B$ is an object $(A \amalg B)$, together with two maps $i_A : A \to (A \amalg B)$ and $i_B : B \to (A \amalg B)$, such that for any other object $X$ and maps $f : A \to X$ and $g : B \to X$, there exists a unique map $(f \mid g) : (A \amalg B) \to X$ such that $f = (f \mid g) \circ i_A$ and $g = (f \mid g) \circ i_B$. That is, in the following diagram, any two directed walks between a common pair of points represent the same map:

$\begin{matrix} A & \xrightarrow{\;i_A\;} & (A \amalg B) & \xleftarrow{\;i_B\;} & B \\ \mathrm{id}_A \Big\updownarrow \;\; & & \Big\downarrow (f \mid g) & & \;\; \Big\updownarrow \mathrm{id}_B \\ A & \xrightarrow[\;f\;]{} & X & \xleftarrow[\;g\;]{} & B \end{matrix}$

  1. In the case of the total order $\leqslant$, the coproduct of A and B is just $\max\{A, B\}$, which you should verify for yourself.
  2. In the category of matrices we described above, the coproduct of A and B is not the maximum of A and B, but is instead the integer A+B, corresponding to splitting a vector of dimension A+B into a block of size A and a block of size B.

    • The "inclusion" map $i_A$ would correspond to mapping a vector in v ∈ $\mathbb R^A$ into the first A coefficients of a vector in $\mathbb R^{A+B}$, with the others set to zero;
    • The "inclusion" map $i_B$ would be similar, mapping vectors in $\mathbb R^B$ into the final B coefficients.
    • For maps $f : A \to X$ and $g : B \to X$, the map $(f \mid g)$ corresponds to applying $f$ to the first block and $g$ to the second block of the vector, and adding the two results in $\mathbb R^X$.

    Of course, there's more than one way to decompose $\mathbb R^{A+B}$ into blocks; we could have chosen a different definition for the maps $i_A$ and $i_B$, by having $i_B$ map vectors in $\mathbb R^B$ into the first B coefficients, and $i_A$ map vectors into the final A coefficients. This way of embedding $\mathbb R^A$ and $\mathbb R^B$ into the same space at the same time is equivalent to the one described above; in fact we can show that there is a unique isomorphism between these two ways of constructing $(A \amalg B)$. So, the coproduct $(A \amalg B)$ does depend on the choice of maps $i_A$ and $i_B$, but not in any way which is really important. (A small numerical check of these block-matrix identities follows below.)
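Here is that check, a minimal numpy sketch with the dimensions and the maps $f$, $g$ chosen purely for illustration; nothing here is specific beyond the block conventions described above.

```python
import numpy as np

# With A = 2 and B = 3, the injections into A + B = 5 are block matrices,
# and (f | g) is the block matrix [F  G].

A, B, X = 2, 3, 6
iA = np.vstack([np.eye(A), np.zeros((B, A))])   # 5 x 2: into the first A coordinates
iB = np.vstack([np.zeros((A, B)), np.eye(B)])   # 5 x 3: into the last B coordinates

rng = np.random.default_rng(0)
F = rng.standard_normal((X, A))                 # an arrow A -> X
G = rng.standard_normal((X, B))                 # an arrow B -> X
fg = np.hstack([F, G])                          # (f | g) : A + B -> X

# The coproduct equations f = (f|g) . iA and g = (f|g) . iB hold:
assert np.allclose(fg @ iA, F)
assert np.allclose(fg @ iB, G)

# The "other" choice of injections differs from this one by an invertible
# permutation of the blocks, i.e. an isomorphism between the two coproducts:
jA = np.vstack([np.zeros((B, A)), np.eye(A)])   # A into the last A coordinates
jB = np.vstack([np.eye(B), np.zeros((A, B))])   # B into the first B coordinates
P = np.vstack([np.hstack([np.zeros((B, A)), np.eye(B)]),
               np.hstack([np.eye(A), np.zeros((A, B))])])
assert np.allclose(P @ iA, jA) and np.allclose(P @ iB, jB)
```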

Even though the first category, generated by $\leqslant$, is what you get when you forget the matrix structure of the second (replacing each nonempty collection of parallel arrows with a single arrow between the same objects), that second category of injective linear operators is substantially richer than the simple one described by $\leqslant$.

This is just scratching the surface of one fingernail of category theory; but hopefully it convinces you that categories and reflexive, transitive binary relations are not trivially equivalent.
