Why are we confident in the ability of ZFC to formalise mathematics if very few proofs are actually converted into ZFC

elementary-set-theoryfoundationslogicphilosophy

$\mathsf{ZFC}$ is often introduced in logic textbooks as a first-order theory with equality and a single non-logical symbol $\in$. However, even stating the axioms of $\mathsf{ZFC}$ in this language is a cumbersome task: for instance, while the axiom of choice is often written as
$$
\forall X \left( \varnothing \notin X \implies \exists f \colon X \rightarrow \bigcup X \quad \forall a \in X \, ( f(a) \in a ) \right)
$$

this uses the symbols $\varnothing$, $\bigcup$, and function notation, and the "official" statement of the axiom of choice is a much longer string of symbols.

It seems that even statements which feel utterly trivial take many lines to be proven; for instance, in this post, there is a 25 line natural deduction proof that the empty set is unique. As far as I know, very few proofs in ordinary mathematics have actually been converted into formal proofs in $\mathsf{ZFC}$. For instance, I have never seen a formal proof of the consistency of $\mathsf{PA}$, even though it is universally agreed by logicians and set theorists that it is a consequence of the $\mathsf{ZFC}$ axioms.

So my question is: why are we confident that "ordinary proofs" can be converted into formal proofs in $\mathsf{ZFC}$ when so few actually are? In other words, why are we confident that the only thing stopping us from doing this conversion in practice is the sheer tediousness of it all?

Best Answer

As Noah Schweber points out, results like the conservativity of expansions-by-definitions give us some confidence. But I think there's more to it than that. On the other hand, as always with informal-to-formal matters (e.g., Church's Thesis), we cannot expect to prove that formalization is always possible.

First a couple of analogies. In topology, one often runs into claims about continuity based on geometric intuition (e.g., the "house with two rooms", Hatcher p.4). No one doubts that these could be reformulated as $\epsilon$-$\delta$ proofs, but nobody would want to carry out the rewriting. In computability theory, one can tell that certain functions are computable without actually writing programs, let alone constructing Turing machines.

In these and similar cases, the process of formalization is routine without being algorithmic. Once you've gained experience, you know what tools to deploy, and what pitfalls to worry about. To quote from an MO answer to the question of formalizing Wiles' proof in Peano Arithmetic:

Experts can "smell" the intrusion of a strong axiom from a long way off; for example, the primary reason that the Robertson-Seymour graph minor theorem requires more than PA is that it invokes a highly sophisticated argument by induction at a crucial point. The proof of FLT doesn't give off this sort of odor anywhere. Of course, a smell test is not a proof, but the smart money is that if we run into difficulties trying to prove FLT in PA, it won't be because we need some strong arithmetic axiom.

We should not forget that both the predicate calculus and the ZFC axioms arose from research programs of formalizing mathematical reasoning. For the predicate calculus, the Completeness Theorem gives us assurance that nothing was "left out". For ZFC, we have no comparable formal result, but the intuition behind the Cumulative Hierarchy is very suggestive.

Drilling down a bit, what makes this formalization process seem routine? I think it's because it is itself a "drilling down". Let's look at your example, the proof in ZFC of Con(PA). If one had to formalize it, one would start by breaking it into parts. At the top-level, the proof consists of two parts:

  1. The Soundness Theorem: if a theory $T$ has a model whose domain is a set, then it is consistent.
  2. PA has such a model in ZF, namely the finite ordinals.

What are key issues in formally proving the Soundness Theorem? They split into syntax and semantics. For syntax, all the pieces are "finite combinatorial structures": proofs are sequences of formulas, formulas sequences of symbols, etc. We have a great deal of confidence that ZFC can handle such things without even breathing hard.

Semantics seems prima facie like a bigger deal. But here again we have a roadmap: Tarski's inductive definition of truth (aka satisfaction), a piece of ordinary mathematics, not particulary convoluted. General results of ZFC handle the rather vanilla inductions and recursive definitions. I had occasion to write up a more detailed treatment; in it, I made free use of the ability of ZF to handle trees formally. If that were challenged, one would drill down into the details with resignation, perhaps, but without trepidation.

In Kaye's Models of Peano Arithmetic, he dealt with the more difficult problem of formalizing certain notions of truth inside PA. He starts with the sentence, "This is the chapter that no one wanted to have to write..." Nonetheless, working out the details at no point required real inspiration, just persistence. (The one non-obvious trick, representing sequences of arbitrary length in PA, was provided by Gödel long ago.)

Ultimately, drilling down should reach bedrock: something formalizable in ZFC. Of course you can't prove that it will, short of doing it. But it would be quite surprising if it didn't.

All that said, surprises can occur. For example, a theory I call ZFC$\neg\infty$ -- that's ZFC with the axiom of infinity replaced with its negation -- is, according to folklore, "equivalent" to PA. But when Kaye and Wong looked into the details, they found that for a strong notion of equivalence, ZFC$\neg\infty$ must be supplemented with another axiom. (See here for more details. The Kaye-Wong paper is "On Interpretations of Arithmetic and Set Theory", Notre Dame Journal of Formal Logic, 2007.)

More examples. Much of modern category theory uses Grothendieck Universes; these are usually justified by appealing to a large cardinal axiom. Finally, as I understand it, it is still not settled if Wiles' proof can be formalized in PA. The answer to this question remarks, "Wiles' original proof of FLT uses set-theoretical assumptions unprovable in Zermelo-Fraenkel set theory with axiom of choice", but goes on to quote a paper saying "certainly much less than ZFC is used in principle, probably nothing beyond PA, and perhaps much less than that."

Related Question