Category Theory – Equivalence of Morphisms Under Substitution

category-theory

Let me introduce my original example and then we can generalize.

In a category with products, we have the diagonal morphism (WP, nLab) dup and the pairing operator pair(-, -) satisfying the point-free equivalence:

$$
pair(id,id) \cong dup : X \to X \times X
$$

Given any particular morphism $f : X \to Y$, we can also consider:

$$
pair(f,f) \cong dup \circ f : X \to Y \times Y
$$

But now let's add free monoids (WP, nLab) ("lists") and consider the nil and cons morphisms:

$$
nil : 1 \to [X]
$$

$$
cons : X \times [X] \to [X]
$$

This seems fine, but they can be arranged like so:

$$
cons \circ pair(nil, nil) : 1 \to [[X]]
$$

The inner pair can't be replaced with dup because the types don't line up; the two nil can't be unified. Indeed, cons ∘ dup is never well-typed, because cons takes inputs of different types but dup returns outputs of identical types. What went wrong here?

More generally, suppose we have an equivalence of polymorphic families of morphisms: For some objects $X, Y, \dots$, we have a diagram over those objects. Each commuting pair of morphisms in that diagram should be equivalent, even if we paste it into a larger diagram. But not all equivalences type-check under substitution. Why not?

Best Answer

What is happening is that the example is substituting notation, but using notation that does not correspond uniquely to the actual objects that are subject to substitution. So, an equation is proved for one meaning of the notation, then used with other meanings.

The rule is:

$$\newcommand{\pair}{\mathsf{pair}} \newcommand{\dup}{\mathsf{dup}} \newcommand{\nil}{\mathsf{nil}} \newcommand{\cons}{\mathsf{cons}} ∀(f : X → Y). \pair(f,f) = \dup \circ f$$

Type theoretically, to use this with $\nil : Π_X. 1 → [X]$, you must instantiate it before you can use the rule. So every occurrence of $f$ is instantiated the same way, but in the $\cons$ example that is not the case. Just because instantiation is implicit doesn't mean it is unnecessary to keep track of it. Arguably it is difficult to do so by hand. That's where computers can help.

One could also try to prove rules for polymorphic functions. In that case, I think there are multiple possibilities, and none of them are principal (all possibilities are instantiations of a principal one) without adding more features. But we could do (adding a type ascription to make things more clear):

$$∀(f : Π_X. Y → X). \pair(f,f) = \dup \circ f : Π_X. Y → X×X$$

Here the rule is still instantiating $f$ to the same $X$ every time, because it (presumably) must to get everything to work out. You can still only apply this in situations where $f$ is instantiated the same way in both $\pair$ occurrences, which is not the case for the $\cons$ example.

We could also potentially apply the first rule with $\nil : 1 → Π_X. [X]$ to get:

$$\pair(\nil,\nil) = \dup \circ \nil : 1 → (Π_X. [X]) × (Π_X. [X])$$

And maybe this version could work if the system were to recognize that $$\cons : (Π_X. [X])×(Π_X. [[X]]) → Π_X.[[X]]$$ were a valid subtyping of $\cons$. In fact, the above polymorphic rule could be made to work with enough effort, because:

$$Π_X. 1 → [X] × [[X]] \cong \\ 1 → Π_X. [X] × [[X]] \cong \\ 1 → (Π_X. [X]) × (Π_X. [[X]])$$

However, handling situations like this automatically can get very complicated/impossible, and it's not always possible to commute things like this ($Π$ and products behave nicely together).

The answer is roughly the same if you replace 'polymorphism' with 'naturality' for category theory, and use a notation that conflates natural transformations with individual arrows making them up. Equations might only hold up to the implicit details, and that means they cannot be used to substitute at the level of notation if the equations are derived only for specific meanings of that notation.

Related Question