[Math] Why worry about the axiom of choice

axiom-of-choice lo.logic set-theory

As I understand it, it has been proven that the axiom of choice is independent of the other axioms of set theory. Yet I still see people fuss about whether or not theorem X depends on it, and I don't see the point. Yes, one can prove some pretty disturbing things, but I just don't feel like losing any sleep over it if none of these disturbing things are in conflict with the rest of mathematics. The discussion seems even more moot in light of the fact that virtually none of the weird phenomena can occur in the presence of even mild regularity assumptions, such as "measurable" or "finitely generated".

So let me turn to two specific questions:

If I am working on a problem which is not directly related to logic or set theory, can important mathematical insight be gained by understanding its dependence on the axiom of choice?

If I am working on a problem and I find a two-page proof which uses the fact that every commutative ring has a maximal ideal, but I can envision a ten-page proof which circumvents the axiom of choice, is there any sense in which my two-page proof is "worse" or less useful?

The only answer to these questions that I can think of is that an object whose existence genuinely depends on the axiom of choice does not admit an explicit construction, and this might be worth knowing. But even this is largely unsatisfying, because often these results take the form "for every topological space there exists X…" and an X associated to a specific topological space is generally no more pathological than the topological space you started with.

Thanks in advance!

Best Answer

The best answer I've ever heard --- and I think I heard it here on MathOverflow from Mike Shulman, which suggests that this question is roughly duplicated somewhere else --- is that you should care about constructions "internal" to other categories:

  1. For many, many applications, one wants "topological" objects: topological vector spaces, topological rings, topological groups, etc. In general, for any algebraic gadget, there's a corresponding topological gadget, by writing the original definition (a la Bourbaki) entirely in terms of sets and functions, and then replacing every set by a topological space and requiring that every function be continuous.
  2. A closely related example is that you might want "Lie" objects: sets are replaced by smooth manifolds and functions by smooth maps.
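  3. Another closely related example is to work entirely within the "algebraic" category: sets are replaced by algebraic objects (say, varieties or schemes) and functions by algebraic maps.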

In all of these cases, the "axiom of choice" fails. In fact, from the internal-category perspective, the axiom of choice is the following simple statement: every surjection ("epimorphism") splits, i.e. if $f: X\to Y$ is a surjection, then there exists $g: Y \to X$ so that $f\circ g = {\rm id}_Y$. But this is simply false in the topological, Lie, and algebraic categories.
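To make the failure concrete, here is one standard instance in the topological category (the choice of example is mine, not something taken from the discussion referred to above): the covering map of the circle is a continuous surjection with no continuous section.

$$
p:\mathbb R \to S^1,\qquad p(t) = e^{2\pi i t}.
$$

If $s: S^1\to\mathbb R$ were continuous with $p\circ s = {\rm id}_{S^1}$, then $s$ would be a continuous injection of the circle into the line. Its image would be a compact connected subset of $\mathbb R$ with more than one point, i.e. a closed interval $[a,b]$ with $a<b$, and $s$ would be a homeomorphism onto it. But removing an interior point disconnects $[a,b]$ and does not disconnect $S^1$, a contradiction. The same map is smooth, so it also fails to split in the Lie category. As a map of bare sets, of course, $p$ does split: send each point of $S^1$ to its unique preimage in $[0,1)$.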

This leads to all sorts of extra rich structure if you do algebra internal to these categories. You have to start thinking about bundles rather than products, there can be "anomalies", etc.

Update:

In the comments, there was a request for a totally explicit example where the Axiom of Choice is commonly used but not necessary. Here's one that I needed recently. Let $\mathcal C$ be an abelian tensor category, by which I mean that it is abelian, has a monoidal structure $\otimes$ that is biadditive on hom-sets, and has a distinguished natural isomorphism $\text{flip}: X\otimes Y \overset\sim\to Y\otimes X$ which is a "symmetry" in the sense that $\text{flip}^2 = \text{id}$. Then in $\mathcal C$ it makes sense to talk about "Lie algebra objects" and "associative algebra objects", and given an associative algebra $A$ you can define a Lie algebra by "$[x,y] = xy - yx$", where this is shorthand for $[,] = (\cdot) - (\cdot \circ \text{flip})$; here $x,y$ should not be read as elements, but as some sort of generalization. So we can make sense of the categories $\text{LIE}_{\mathcal C} = $ "Lie algebras in $\mathcal C$" and $\text{ASSOC}_{\mathcal C} = $ "associative algebras in $\mathcal C$", and we have a forgetful functor $\text{Forget}: \text{ASSOC}_{\mathcal C} \to \text{LIE}_{\mathcal C}$.
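For concreteness, one common element-free formulation of the Lie algebra axioms is sketched here. It assumes that associativity isomorphisms are suppressed (i.e. $\otimes$ is treated as strict), and the names $b$, $m$, and $\sigma$ are my own notation: $b: X\otimes X\to X$ is the bracket on a Lie algebra object $X$, and $m: A\otimes A\to A$ is the multiplication on an associative algebra object $A$.

$$
\begin{aligned}
\sigma &= (\text{id}_X\otimes\text{flip})\circ(\text{flip}\otimes\text{id}_X) : X^{\otimes 3}\to X^{\otimes 3} &&\text{(the cyclic permutation ``}x\otimes y\otimes z\mapsto y\otimes z\otimes x\text{'')}\\
b\circ(\text{id}+\text{flip}) &= 0 &&\text{(antisymmetry)}\\
b\circ(b\otimes\text{id}_X)\circ(\text{id}+\sigma+\sigma^2) &= 0 &&\text{(Jacobi identity)}
\end{aligned}
$$

For the commutator bracket $b = m - m\circ\text{flip}$ on $A$, antisymmetry is a one-line check that uses only $\text{flip}^2=\text{id}$:

$$
(m - m\circ\text{flip})\circ(\text{id}+\text{flip}) = m + m\circ\text{flip} - m\circ\text{flip} - m\circ\text{flip}^2 = 0 .
$$

The Jacobi identity follows in the same spirit from associativity of $m$ and naturality of $\text{flip}$, with no reference to elements.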

Then one can ask whether $\text{Forget}$ has a left adjoint $U: \text{LIE}_{\mathcal C} \to \text{ASSOC}_{\mathcal C}$. If $\mathcal C$ admits arbitrary countable direct sums, then the answer is yes: the tensor algebra is then well-defined, so just form the quotient as you normally would, being careful to write everything in terms of objects and morphisms rather than elements. In particular, if $\mathfrak g \in \text{LIE}_{\mathcal C}$, then $U\mathfrak g \in \text{ASSOC}_{\mathcal C}$ comes with a Lie algebra homomorphism $\mathfrak g \to U\mathfrak g$, and it is universal among associative algebras equipped with such a homomorphism.
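Written without elements, the construction might look as follows. This is only a sketch: it assumes in addition that $\otimes$ commutes with the relevant direct sums and is right exact in each variable (so that $T\mathfrak g$ is an algebra and the cokernel inherits a multiplication), and the names $\iota_n$, $r$, and $\mu$ are my own notation.

$$
T\mathfrak g = \bigoplus_{n\ge 0}\mathfrak g^{\otimes n},\qquad \iota_n:\mathfrak g^{\otimes n}\to T\mathfrak g \ \text{ the inclusion of the $n$th summand},
$$

$$
r = \iota_2 - \iota_2\circ\text{flip} - \iota_1\circ[,] \;:\; \mathfrak g\otimes\mathfrak g\to T\mathfrak g \qquad\text{(``}x\otimes y - y\otimes x - [x,y]\text{'')},
$$

$$
U\mathfrak g = \operatorname{coker}\Bigl(T\mathfrak g\otimes(\mathfrak g\otimes\mathfrak g)\otimes T\mathfrak g \xrightarrow{\ \mu\circ(\text{id}\otimes r\otimes\text{id})\ } T\mathfrak g\Bigr),
$$

where $\mu: T\mathfrak g\otimes T\mathfrak g\otimes T\mathfrak g\to T\mathfrak g$ is the threefold multiplication. The adjunction then reads

$$
\text{Hom}_{\text{ASSOC}_{\mathcal C}}(U\mathfrak g, A)\;\cong\;\text{Hom}_{\text{LIE}_{\mathcal C}}(\mathfrak g, \text{Forget}(A)),
$$

and the canonical map $\mathfrak g\to U\mathfrak g$ is the composite of $\iota_1$ with the projection $T\mathfrak g\to U\mathfrak g$.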

Let's say that $\mathfrak g$ is representable if the map $\mathfrak g \to U\mathfrak g$ is a monomorphism in $\text{LIE}_{\mathcal C}$. By universality, if there is any associative algebra $A$ and a monomorphism $\mathfrak g \to A$, then that monomorphism factors through $\mathfrak g \to U\mathfrak g$, which is therefore mono as well. So this really is the condition that $\mathfrak g$ has some faithful representation. The statement that "Every Lie algebra is representable" is normally known as the Poincaré-Birkhoff-Witt theorem.

The important point is that the usual proof (the one that Birkhoff and Witt gave) requires the Axiom of Choice, because it requires picking a vector-space basis. So it works only when $\mathcal C$ is the category of $\mathbb K$-vector spaces for $\mathbb K$ a field, or more generally when $\mathcal C$ is the category of $R$-modules for $R$ a commutative ring and $\mathfrak g$ is a free $R$-module; in fact, the proof can be made to work for arbitrary Dedekind domains $R$. But in many abelian categories of interest this approach is untenable: not every abelian category is semisimple, and even in those that are, you often don't have access to bases. So you need other proofs. Provided that $\mathcal C$ is "over $\mathbb Q$" (hom sets are $\mathbb Q$-vector spaces, etc.), a proof that works constructively with no other restrictions on $\mathcal C$ is available in

  • Deligne, Pierre; Morgan, John W. Notes on supersymmetry (following Joseph Bernstein). Quantum fields and strings: a course for mathematicians, Vol. 1, 2 (Princeton, NJ, 1996/1997), 41--97, Amer. Math. Soc., Providence, RI, 1999. MR1701597.

They give a reference to

  • Corwin, L.; Ne'eman, Y.; Sternberg, S. Graded Lie algebras in mathematics and physics (Bose-Fermi symmetry). Rev. Modern Phys. 47 (1975), 573--603. MR0438925.

in which the proof is given when $\mathcal C$ is the category of modules of a (super)commutative ring $R$, with $\otimes = \otimes_R$, and, importantly, $2$ and $3$ are both invertible in $R$. [Edit: I left a comment July 28, 2011, below, but should have included explicitly, that Corwin--Ne'eman--Sternberg require more conditions on $\mathcal C$ than just that $2$ and $3$ are invertible. Certainly as stated "PBW holds when $6$ is invertible" is inconsistent with the examples of Cohn below.]

Finally, with $R$ an arbitrary commutative ring and $\mathcal C$ the category of $R$-modules, if $\mathfrak g$ is torsion-free as a $\mathbb Z$-module then it is representable. This is proved in:

  • Cohn, P. M. A remark on the Birkhoff-Witt theorem. J. London Math. Soc. 38 (1963), 197--203. MR0148717.

So it seems that almost all Lie algebras are representable. But notably Cohn gives examples in characteristic $p$ for which PBW fails. His example is as follows. Let $\mathbb K$ be some field of characteristic $p\neq 0$; then in the free associative algebra $\mathbb K\langle x,y\rangle$ on two generators, $\Lambda_p(x,y) = (x+y)^p - x^p - y^p$ is some non-zero Lie series. Let $R = \mathbb K[\alpha,\beta,\gamma] / (\alpha^p,\beta^p,\gamma^p)$, which is a commutative ring, and define $\mathfrak g$ to be the Lie algebra over $R$ generated by $x,y,z$ with the single defining relation $\alpha x = \beta y + \gamma z$. Then $\mathfrak g$ is not representable in the category of $R$-modules: $\Lambda_p(\beta y,\gamma z)\neq 0$ in $\mathfrak g$, but $\Lambda_p(\beta y,\gamma z)= 0$ in $U\mathfrak g$.
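To see what $\Lambda_p$ looks like and why it dies in $U\mathfrak g$, here is a short worked computation. The small cases are my own expansion, not quoted from Cohn's paper. In characteristic $2$ and $3$ respectively, expanding $(x+y)^p$ in $\mathbb K\langle x,y\rangle$ gives

$$
\Lambda_2(x,y) = xy + yx = [x,y],\qquad
\Lambda_3(x,y) = (x^2y + xyx + yx^2) + (xy^2 + yxy + y^2x) = [x,[x,y]] + [y,[y,x]],
$$

using $-1 = 1$ in characteristic $2$ and $-2 = 1$ in characteristic $3$. The vanishing in $U\mathfrak g$ then only needs the defining relation and the fact that scalars in $R$ are central. In $U\mathfrak g$ we have

$$
(\alpha x)^p = (\beta y + \gamma z)^p = (\beta y)^p + (\gamma z)^p + \Lambda_p(\beta y,\gamma z),
$$

and since $\alpha^p = \beta^p = \gamma^p = 0$ in $R$,

$$
(\alpha x)^p = \alpha^p x^p = 0,\qquad (\beta y)^p = \beta^p y^p = 0,\qquad (\gamma z)^p = \gamma^p z^p = 0,
$$

so $\Lambda_p(\beta y,\gamma z) = 0$ in $U\mathfrak g$. The harder half, that $\Lambda_p(\beta y,\gamma z)$ is non-zero in $\mathfrak g$ itself, is the part that requires Cohn's argument.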