Peano Arithmetic – Stacked Quantifiers and Ordinals up to $\epsilon_0$

lo.logicordinal-analysispeano-arithmetic

I am trying to understand why induction up to exactly $\epsilon_0$ is necessary to prove the cut-elimination theorem for first-order Peano Arithmetic; or, as I understand, equivalently, why the length of a PA-proof with all cuts eliminated grows (in the worst case) as fast as $f_{\epsilon_0}$ in the fast-growing hierarchy.

I can understand why use of an induction axiom corresponds to ordinal multiplication by $\omega$. As induction axioms are written in the sequent calculus, as a premise we have a proof of $\phi(x)\vdash\phi(Sx)$ for a free variable $x$, and as a conclusion we get $\phi(0)\vdash\phi(y)$ for any term $y$.

Then if the proof of $\phi(x)\vdash\phi(Sx)$ has the ordinal $\omega^\alpha$, the proof with the conclusion $\phi(0)\vdash\phi(y)$ is assigned the ordinal $\omega^{\alpha+1}$. E.g. if the proof of the induction step had ordinal $\omega^0 = 1$, then the conclusion $\phi(y)$ has the ordinal $\omega^1 = \omega$.

This part makes perfect sense to me, because if I wanted to eliminate the use of an induction rule (CJ sequent) I might just start with $\phi(0)$ and repeatedly go through the part of the proof that I used to prove $\phi(x)\vdash\phi(Sx)$. Say, I'd repeat the sub-proof of the induction step 17 times, e.g. with $\phi(14)\vdash\phi(15)$ and so on, until I got to $\phi(17)$, or whichever number I wanted to prove $\phi$ about.

I don't quite understand how this corresponds to eliminating cuts, per se (it seems to create a host of new cuts, in fact). But it is still very intuitive to me that Peano Arithmetic's power to invoke the induction axiom would correspond to multiplication by $\omega$. If we take a proof in PA that ends with a single use of the induction axiom to get $\phi(y)$, and we want to translate it into a proof in an arithmetic theory that doesn't have induction, then we might have to multiply the PA-proof-length by any finite number (e.g. multiply it by 17) to get the proof length in the inductionless theory.

Repeated multiplication by $\omega$ only takes us up to $\omega^\omega$ as a proof-theoretic ordinal, though.

According to Gentzen, when we use cuts on formulas involving quantifiers, e.g. $\forall p:\exists q:\psi(p,q)$, this corresponds to ordinal exponentiation. If the proof above the cut has ordinal $\alpha$ and we cut a formula with one quantifier, then the proof below the cut has ordinal $\omega^\alpha$. If we cut on two quantifiers, the proof below the cut has ordinal $\omega^{\omega^\alpha}$. This fits with other things I've heard about how PA using formulas with only N quantifiers can be proven consistent by PA using only N+1 quantifiers. (As a side issue I'd be interested in knowing how you use N quantifiers to prove wellfoundedness of an ordinal notation for a stack of $\omega$s N layers high.)

What I don't understand is why the ability to use quantifiers corresponds to ordinal exponentiation. I can guess in a vague sense that if we have a proof using $\forall p:\phi(p)\vdash\phi(17)$, and we want to eliminate the use of $\forall p:\phi(p)$, then we need to repeat that part of the proof each time we want to prove $\phi(17)$, $\phi(q)$, and so on. But this would again just imply that we needed to repeat that part of a proof a finite number of times, and iterating this takes us up to only $\omega^\omega$, not the desired $\epsilon_0$, so I must be missing something.

I asked a friend to help with this and she's read through Gentzen's relevant papers, and she's shown me the relevant parts, and I'd previously checked several standard texts on proof theory and Googled around, and she's also shopped the question around the logic department of a major university, and we still don't know any answer to this except "because Gentzen says to use ordinal exponentiation".

We also can't find any examples of cut-elimination being carried out on a proof with cut on a quantified formula, which shows how the size of the resulting cut-free proof could grow faster than $f_{\omega^\omega}$. An example like this for some particular proof would be very helpful, even if, of necessity, the repeated steps for eliminating the cut are sketched more than shown. I can understand why the length of a Kirby-Paris hydra game grows at the same rate as the Goodstein sequence, and visualize both processes insofar as a human being possibly can. I cannot visualize why the length of a PA-proof heading for cut-freeness would grow at that same rate as cuts were repeatedly eliminated. (Mapping the process onto a Kirby-Paris hydra game of starting height at least 3 would answer the question!)

Best Answer

8/14: Substantially edited in response to comments: added to 1st part, added new 2nd and 4th parts

There's also some discussion underneath, and a link to a partial write-up of a case of cut-elimination involving the Ackermann function

If you haven't found the Gentzen-style proof illustrative, I recommend trying the infinitary proof. (Not everyone likes it, but it's at least an alternate perspective on it.) Pohlers "Proof Theory: An Introduction" gives a nice presentation of the proof, and it includes the proof in PA that ordinals below $\epsilon_0$ are well-founded. (This includes that fact that you use $N$-quantifier induction to proof well-foundedness up to $N+1$ exponents, and provides some insight into way additional quantifiers make it have the effect they do.)

One nice feature of the infinitary proof is that the ordinals are outright bounds on the heights of proofs, so if you're comfortable visualizing infinitary proofs, the source of the increase in bounds is quite explicit.

It's worth noting that "infinitary" is a bit of a misnomer. The infinitary proof takes the perspective that a proof of $\forall x\phi(x)$ should be a computable function $f$ so that, for each $n$, $f(n)$ is a proof of $\phi(n)$. Since the functions are all computable, there's nothing genuinely infinitary about it. People usually ignore the "computable" part for expository purposes, since it makes sense without that requirement, at which point it does look infinitary, but this is just because many of the ideas are clearer without constantly rechecking that the operations we're describing really are computable.

$ $

A follow-up point about what ordinals mean in proof theory. As pointed out in the comments, $\omega$ doesn't really mean "infinite" it means "an unspecified fixed integer". Similarly, $\omega+3$ really means "not only is this integer unspecified, but it will take three steps to figure out which integer it is". So in the infinitary proof of cut-elimination, when we say that a proof has height $\omega$, we mean that it is a proof of, say, $\forall x\phi(x)$, where the height of proof of $\phi(n)$ is $g(n)$ with $g(n)\rightarrow\infty$.

A proof of height $\omega+3$ is a proof with three additional steps below a proof of height $\omega$. More interestingly, a proof of height $\omega+\omega$ might be a proof of $\forall x\phi(x)\vee\forall y\psi(y)$ where for each $n$, we have a proof of $\phi(n)\vee\forall y\psi(y)$ of height $\omega+n$. To find a quantifier-free instantiation of this proof, we'd have to first plug in an $n$ for $x$ and then, at the appropriate level, an $m$ for $y$ to get a proof of $\phi(n)\vee\psi(m)$ of height $g(n)+h(n,m)$.

$ $

Speaking very loosely, the intuitions you describe above are perfectly sensible, but they basically only correspond to what's happening with fairly simple (mostly one quantifier) formulas. Cut-elimination corresponds to extracting computable information, and when you cut formulas with multiple quantifiers together, information has to flow back and forth between the two sides.

Informally, I think of it like this: suppose I want to cut a proof of $$(\forall x\exists y\phi)\vee\psi$$ with a proof of $$(\exists x\forall y\neg\phi)\vee\psi'$$ (cut-elimination is usually easier to think of in a one-sided form). The second proof produces a first proposal for a value of $x$ witnessing $\forall y\neg\phi$. The first proof might refute it by exhibiting a $y$. The second proof can now use that value of $y$ to compute a second guess at what $x$ is, the first proof refutes it again, and so on back and forth. This could go back and forth a large number of times (on the order of the ordinal of the second proof---that is, not just a fixed number of times, but, if the ordinal is something like $\omega+1$, a number of times based on the value of the first $y$, or if it's $\omega+\omega+1$, the first $y$ tells us how many steps we go before a value of $y$ which tells us how many steps we go, etc.)

This description is literally true when viewed through the lens of the functional interpretation. When viewed in terms of infinitary cut-elimination, these "back and forths" correspond to interleaved cuts: we place many copies of the first proof (more precisely, proofs of $\exists y\phi(n,y)\vee\psi$ for various values of $n$) throughout the second proof. Each of these creates new cuts, which correspond to taking sections of the second proof and placing them inside the many copies of the first proof which we've just created.

$ $

I think your indexing on the relationship between the fast growing hierarchy and proof-theoretic ordinals is off by an exponential. The proof theoretic ordinal of $I\Sigma_1$ is $\omega^\omega$, but it only proves the primitive recursive functions total (i.e. $f_n$ with $n<\omega$ in the fast-growing hierarchy). I'm not finding a reference on the exact relationship, but I would guess that the gap is consistent---that in the way two quantifier induction is like $\omega^{\omega^n}$, the Conway chain notation is like $\omega^{\omega^2}$. So, while two quantifier induction suffices, dealing with Conway chain notation will look like a substantial use of two-quantifier induction (indeed, I think it involves three levels of nested induction, two of which are over two-quantifier formulas, which would line up well with an ordinal of $\omega^{\omega^2}$).

Having said that, your question is still perfectly sensible: how does the back and forth I described above get things that are very fast growing. Suppose the first proof, of $(\forall x\exists y\phi)\vee\psi$, has a sensible rate of growth, say, exponential. The second proof is iterating the rate of growth of the first proof, and can iterate it an ordinal number of times based on the height of the second proof. Since the second proof could easily (given perhaps some additional cuts over smaller formulas) have, say, height $\omega^2$. That is, if we take $f_0(n)=n^n$, $f_{\alpha+1}(n)=f_\alpha(f_\alpha(n))$, and $f_\lambda(n)=f_{\lambda[n]}(n)$, we should hit around level $f_{\omega^2}(n)$ in this hierarchy.

Related Question