[Math] Succinctly naming big numbers: ZFC versus Busy-Beaver

computability-theory, lo.logic, set-theory

Years ago, I wrote an essay called Who Can Name the Bigger Number?, which posed the following challenge:

    You have fifteen seconds. Using standard math notation, English words, or both, name a single whole number—not an infinity—on a blank index card. Be precise enough for any reasonable modern mathematician to determine exactly what number you’ve named, by consulting only your card and, if necessary, the published literature.

The essay went on to discuss systems for naming increasingly huge numbers concisely—including the Ackermann function, the Busy Beaver function, and super-recursive generalizations of Busy Beaver.

Recently (via Eliezer Yudkowsky), the claim has come to my attention that there are ways to concisely define vastly bigger numbers than even the super-recursive Busy Beaver numbers, using set theory. (See for example this page by Agustín Rayo, which proposes a definition based on second-order set theory.) However, whether these definitions work or not seems to hinge on some very delicate issues about definability in set theory.

So, I have a specific question about fast-growing integer sequences that are "well-defined," as I understand the term. But first, let me be clear about some ground rules: I'm certainly fine with integer sequences whose values are unprovable from (say) the axioms of ZFC, as sufficiently large Busy Beaver numbers are. Crucially, though, the values of the sequence must not depend on any controversial beliefs about transfinite sets. So for example, the "definition"

    n := 1 if CH is true, 2 if CH is false

makes sense in the language of ZFC, but it wouldn't be acceptable for my purposes. Even a formalist—someone who sees CH, AC, large-cardinal axioms, etc. as having no definite truth-values—should be able to agree that we've picked out a specific positive integer.


Let me now describe the biggest numbers I know how to name, consistent with the above rules, and then maybe you can blow my numbers out of the water.

Given a Turing machine M, let S(M) be the number of steps M makes on an initially blank tape before halting, or 0 if M runs forever. Then recall that BB(n), the nth Busy Beaver number, is defined as the maximum of S(M) over all n-state Turing machines M. BB(n) is easily seen to grow faster than any computable function. But for our purposes, BB(n) is puny! So let's define $BB_1(n)$ to be the analogue of BB(n) for Turing machines equipped with an oracle that computes $BB_0(n):=BB(n)$. Likewise, for all positive integers k, let $BB_k$ be the Busy Beaver function for Turing machines that are equipped with an oracle for $BB_{k-1}$. It's not hard to see that $BB_k$ grows faster than any function computable with a $BB_{k-1}$ oracle.
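To make S(M) concrete, here is a minimal sketch of a Turing-machine simulator. The function names and the transition-table encoding are my own choices, not anything from the post; and since halting is undecidable, a step cap stands in for the "or 0 if M runs forever" clause. The example machine is the known 2-state, 2-symbol Busy Beaver champion, which halts after 6 steps leaving 4 ones.

```python
def run_tm(delta, max_steps=10_000):
    """Simulate a Turing machine on a blank two-way-infinite tape.

    delta maps (state, symbol) -> (write, move, new_state), with move
    in {-1, +1} and 'H' as the halting state.  Returns (steps, ones)
    if the machine halts within max_steps, else None -- a practical
    stand-in for the uncomputable 'S(M) = 0 if M runs forever' clause.
    """
    tape, pos, state, steps = {}, 0, 'A', 0
    while state != 'H' and steps < max_steps:
        sym = tape.get(pos, 0)
        write, move, state = delta[(state, sym)]
        tape[pos] = write
        pos += move
        steps += 1
    if state != 'H':
        return None
    return steps, sum(tape.values())

# The 2-state, 2-symbol Busy Beaver champion: S = 6 steps, 4 ones written.
champion2 = {
    ('A', 0): (1, +1, 'B'), ('A', 1): (1, -1, 'B'),
    ('B', 0): (1, -1, 'A'), ('B', 1): (1, +1, 'H'),
}
```

Computing BB(n) itself would mean running every n-state machine this way and taking the maximum over those that halt, which is exactly where the uncomputability bites.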

But we can go further: let $BB_{\omega}$ be the Busy Beaver function for Turing machines equipped with an oracle that computes $BB_k(n)$ given (k,n) as input. Then let $BB_{\omega+1}$ be the Busy Beaver function for Turing machines with an oracle for $BB_{\omega}$, and so on. It's clear that we can continue in this way through all the computable ordinals — i.e. those countable ordinals $\alpha$ for which there exists a way to describe any $\beta < \alpha$ using a finite number of symbols, together with a Turing machine that decides whether $\beta < \beta'$ given the descriptions of each.
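The structure of the oracle hierarchy, as opposed to its actual values, can be pictured with plain Python callables. Everything below is a toy: the real $BB_k$ are uncomputable, so a computable stand-in (one exponentiation per level) fakes each "jump," purely to show how the $\omega$-level oracle packages all the finite levels behind a single (k, n) interface.

```python
def make_level(prev_oracle):
    # Toy analogue of passing from BB_{k-1} to BB_k: the real jump is
    # "Busy Beaver relative to an oracle for the previous level"; here
    # one exponentiation stands in for it.
    return lambda n: 2 ** prev_oracle(n)

def bb_tower(levels):
    """Return toy analogues of BB_0, BB_1, ..., BB_{levels-1}."""
    fns = [lambda n: n + 1]          # toy BB_0; the real one is BB itself
    for _ in range(levels - 1):
        fns.append(make_level(fns[-1]))
    return fns

def bb_omega(k, n):
    """Toy BB_omega oracle: answer a (k, n) query by dispatching to level k."""
    return bb_tower(k + 1)[k](n)
```

The point of the sketch is only the dispatch pattern: an oracle for `bb_omega` dominates every finite level at once, which is why $BB_{\omega+1}$, defined relative to it, starts a genuinely new layer of the hierarchy.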

Next, let $\alpha(n)$ be the largest computable ordinal that can be defined (in the sense above) by a Turing machine with at most n states. Then we can define

$f(n) := BB_{\alpha(n)}(n),$

which grows faster than $BB_{\alpha}$ for any fixed computable ordinal $\alpha$.


A different way to define huge numbers is the following. Given a predicate $\phi$ in the language of ZFC, with one free variable, say that $\phi$ "defines" a positive integer m if m is the unique positive integer that satisfies $\phi$, and the value of $m$ is the same in all models of ZFC.

Then let z(n) be the largest number defined by any predicate with n symbols or fewer.

One question that immediately arises is the relationship between f(n) and z(n). I don't think it's hard to show that there exists a constant c such that $f(n) < z(n+c)$ for all n (please correct me if I'm wrong!). But what about the other direction? Does z(n) grow faster than any function definable in terms of Turing machines, or can we find a function similar to f(n) that dominates z(n)? And are there other ways of specifying big numbers that dominate them both?


Update (8/5): After reading the first few comments, it occurred to me that the motivation for this question might not make sense to you, if you don't recognize a distinction between those mathematical questions that are "ultimately about finite processes" (for example: whether a given Turing machine halts or doesn't halt; the values of the super-recursive Busy Beaver numbers; most other mathematical questions), and those that aren't (for example: CH, AC, the existence of large cardinals). The former I regard as having a definite answer, independently of the answer's provability in any formal system such as ZFC. (If you doubt that there's a fact of the matter about whether a given Turing machine halts or runs forever, then you might as well also doubt that there's a fact of the matter about whether a given statement is or isn't provable in ZFC!) For questions like CH and AC, by contrast, one can debate whether it even means anything to discuss their truth independently of their provability in some formal system.

In this question, I'm asking about integer sequences that are "ultimately definable in terms of finite processes," and which one can therefore regard as taking definite values, independently of one's beliefs about set-theoretic questions. Of course, "ultimately definable in terms of finite processes" is a vague term. But one can list many statements that certainly satisfy the criterion (for example: anything expressible in terms of Turing machines and whether they halt), and others that certainly don't (for example: CH and AC). A large part of what I'm asking here is just how far the boundaries of the "definable in terms of finite processes" extend!

Yes, it's possible that my question could degenerate into philosophical disagreements. But a priori, it's also possible that someone can give a sequence that everyone agrees is "definable in terms of finite processes," and that blows my f(n) and z(n) out of the water. The latter would constitute a completely satisfying answer to the question.


Update (8/6): It's now been demonstrated to my satisfaction that z (as I defined it) is blown out of the water by f. The reason is that z is defined by quantifying over all models of ZFC. But by the Completeness Theorem, this means that z can also be defined "syntactically," in terms of provability in ZFC. In particular, we can compute z using an oracle for the $BB_1$ function (or possibly even the BB function?), by defining a Turing machine that enumerates all positive integers m as well as all ZFC-proofs that the predicate $\phi$ picks out m.
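The "syntactic" reading of z can be sketched as a dovetailed search. In the sketch below, `proves(phi, m)` is a hypothetical stand-in for the oracle question "does ZFC prove that $\phi$ picks out exactly m?"; knowing when to stop searching for a proof is precisely where the $BB_1$-style halting oracle enters, which the toy `bound` parameter papers over. All names here are mine, for illustration only.

```python
def z_syntactic(predicates, proves, bound=10_000):
    """Toy sketch of the syntactic z: for each predicate phi, search over
    candidate integers m, asking a provability oracle whether ZFC proves
    that phi picks out exactly m; return the largest integer so defined.

    `proves(phi, m)` is a hypothetical stand-in for ZFC-provability
    (undecidable in reality); `bound` stands in for the halting oracle
    that would tell us when a search can safely be abandoned.
    """
    best = 0
    for phi in predicates:
        for m in range(1, bound):
            if proves(phi, m):
                best = max(best, m)
                break
    return best
```

With a trivially decidable toy "proof system" (say, `proves(phi, m)` holds iff `phi` is the string `"x=m"`), the search terminates and returns the largest defined value; the real z differs only in that both the enumeration of proofs and the decision to give up require oracle power.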

So thanks — I didn't want to prejudice things, but this is actually the answer I was hoping for! If it wasn't clear already, I'm interested in big numbers not so much for their own sake, but as a concrete way of comparing the expressive power of different notational systems. And I have a strong intuition that Turing machines are a "maximally expressive" notational system, at least for those numbers that meet my criterion of being "ultimately defined in terms of finite processes" (so in particular, independent of the truth or falsehood of statements like CH). If one could use ZFC to define integer sequences that blew my sequence f(n) out of the water (and that did so in a model-independent way), that would be a serious challenge to my intuition.

So let me refocus the question: is my intuition correct, or is there some more clever way to use ZFC to define an integer sequence that blows f(n) out of the water?

Actually, a proposal for using ZFC to at least match the growth rate of f now occurs to me. Recall that we defined the sequence z by maximizing over all models M of ZFC. However, this definition ran into problems, related to the "self-hating models" that contain nonstandard integers encoding proofs of Not(Con(ZFC)). So instead, given a model M of ZFC and a positive integer k, let's call M "k-true" if every $\Pi_k$ arithmetical sentence S is true in M if and only if S is semantically true (i.e., true for the standard integers). (Here a $\Pi_k$ arithmetical sentence means a sentence with k alternating quantifiers, all of which range only over integers.)

Now, let's define the function $z_k(n)$ exactly the same way as $z(n)$, except that now we only take the maximum over those models $M$ of ZFC that are $k$-true.

This remains to be proved, but my guess is that $z_k(n)$ should grow more-or-less like $BB_{k+c}(n)$, for some constant c. Then, to get faster-growing sequences, one could strengthen the k-truth requirement, to require the models of ZFC being maximized over to agree with what's semantically true, even for sentences about integers that are defined using various computable ordinals. But by these sorts of devices, it seems clear that one can match f but not blow it out of the water—and indeed, it seems simpler just to forget ZFC and talk directly about Turing machines.

Best Answer

I think your question is not as precise as you portray it.

First, let me point out that you have not actually defined a function $z$, in the sense of giving a first-order definition of it in set theory, and you provably cannot do so, because of Tarski's theorem on the non-definability of truth. We simply have no way to express the relation "x is definable" in the usual first-order language of set theory. More specifically:

Theorem. If ZFC is consistent, then there are models of ZFC in which the collection of definable natural numbers is not a set or even a class.

Proof. If V is a model of ZFC, then let $M$ be an internal ultrapower of $V$ by a nonprincipal ultrafilter on $\omega$. Thus, the natural numbers of $M$ are nonstandard relative to $V$. The definable elements of $M$ are all contained within the range of the ultrapower map, whose intersection with the natural numbers is a bounded part of the natural numbers of $M$. Thus, $M$ cannot have this collection of objects as a set or class, since it would reveal to $M$ that its natural numbers are ill-founded, contradicting that $M$ satisfies ZFC. QED

In such a model, your definition of $z$ is not first order. It could make sense to treat your function $z$, however, in a model as an externally defined function, defined outside the model (as via second-order logic). In this case, $z(n)$ only involves standard or meta-theoretic definitions, and other problems arise.

Theorem. If ZFC is consistent, then there is a model of ZFC in which $z(n)$ is bounded by a constant function.

Proof. If ZFC is consistent, then so is $ZFC+\neg Con(ZFC)$. Let $V$ be any model of this theory, so that there are no models of ZFC there, and the second part of the definition of $z$ becomes vacuous, so it reduces to its definable-in-$V$ first part. Let $M$ be an internal ultrapower of $V$ by a nonprincipal ultrafilter on $\omega$. Thus, $M$ is nonstandard relative to $V$. But the function $z$, defined externally, uses only standard definitions, and the definable elements of $M$ all lie in the range of the ultrapower map. If $N$ is any $V$-nonstandard element of $M$, then every definable element of $M$ is below $N$, and so $z(n)\lt N$ for every $n$ in $M$. QED

Theorem. If ZFC is consistent, then there is a model of ZFC in which $f(n)\lt z(10000)$ for every natural number n in the meta-theory.

Proof. If ZFC is consistent, then so is $ZFC+\neg Con(ZFC)+GCH$. Let $V$ be a countable model of $ZFC+\neg Con(ZFC)+GCH$. Since $V$ has no models of ZFC, again the second part of your definition is vacuous, and it reduces just to the definability-in-$V$ part. Let $M$ again be an internal ultrapower of $V$ by an ultrafilter on $\omega$, and let $N$ be a $V$-nonstandard natural number of $M$. Every definable element of $M$ is in the range of the ultrapower map, and therefore below $N$. In particular, for every meta-theoretic natural number $n$, we have $f(n)\lt N$ in $M$, since $f(n)$ is definable. Now, let $M[G]$ be a forcing extension in which the continuum has size $\aleph_N^M$. Thus, $N$ is definable in $M[G]$ by a relatively short formula; let's say 10000 symbols (but I didn't count). Since the forcing does not affect the existence of ZFC models or Turing computations between $M$ and $M[G]$, it follows that $f(n)\lt z(10000)$ in $M[G]$ for any natural number of $V$. QED

Theorem. If ZFC is consistent, then there is a model of ZFC with a natural number constant $c$ in which $z(n)\lt f(c)$ for all meta-theoretic natural numbers $n$.

Proof. Use the model $M$ (or $M[G]$) as above. This time, let $c$ be any $V$-nonstandard natural number of $M$. Since the definable elements of $M$ all lie in the range of the ultrapower map, it follows that every $z(n)$, for meta-theoretic $n$, lies among the $V$-standard elements of $M$, which are all less than $c$. But $M$ easily has $c\leq f(c)$, and so $z(n)\lt f(c)$ for all these $n$. QED