Why does the Cantor-Bendixson cupcake theorem need transfinite induction

descriptive-set-theorygeneral-topologyreverse-math

Recall the Cantor-Bendixson theorem:

Let $X$ be a Polish space. For every closed subset $K \subseteq X$, there is a unique disjoint sum decomposition $C \cup P = K$ where $C$ is countable and $P$ is perfect. Moreover, $P$ is exactly the condensation set of $K$.

I know two proofs of the above theorem. One is to define the Cantor-Bendixson derivative $K \mapsto K'$ by letting $K'$ be the accumulation set of $K$, and define $K^{(\alpha)}$ by transfinite recursion on the Cantor-Bendixson derivative, taking intersections at limit stages. One only loses countably many points at each stage, and this process must terminate at some ordinal $\alpha_K < \aleph_1$, the "Cantor-Bendixson rank" of $K$. So $P = K^{(\alpha_K)}$; then $C$ must be countable by the above, and one easily checks that $P$ is the condensation set of $K$.

Another apparently does not use transfinite recursion at all. Let $P$ be the condensation set of $K$. Since $K$ is second countable, it has a countable basis $U_n$, and $C = \bigcup_{n: |U_n| < \aleph_1} U_n$, hence $C$ is countable.

Famously, the Cantor-Bendixson theorem (where $X = \mathbb R$) cannot be proven without appeal to $\Pi_1^1$-comprehension. This is proven in reverse mathematics. In particular, $\Pi_1^1$-comprehension is stronger than arithmetical transfinite recursion ($ATR_0$). So one expects that "transfinite induction is avoidable"; most famous theorems that require $ATR_0$ or a stronger system make some sort of appeal to the transfinite.

Why is the Cantor-Bendixson theorem (for $X = \mathbb R$) so ridiculously strong? Is the notion of condensation set the problematic step, since it requires us to be able to say "$x \in P$ iff for every $n > 0$, the set $A(x, n) = \{y \in K: |x – y| < 1/n\}$ is not the image of any sequence"? (Since the elements of the "set" $A(x, n)$ are really sets of natural numbers, I can see this being problematic exactly as an artifact of the fact that we are using Goedel codes!) Or is the sneaky appeal to countable choice in the proof that $C$ is countable the reason why we need $\Pi_1^1$-comprehension? Neither seems much stronger than arguments one makes in soft analysis all the time, so is the problem really that those other theorems of soft analysis haven't been studied so much in reverse mathematics, and the Cantor-Bendixson theorem is just infamous because logicians have thought to analyse it?

Best Answer

This is a great question! While the reverse mathematics of simple combinatorial or algebraic principles feels rather concrete, once we look at stronger principles - especially those connected with "higher-order" notions - rather deep subtleties enter the picture (essentially the fragrant scent of descriptive set theory).

My answer has three parts. First, I'll address the more general part of your question. Then I'll turn to the analysis of the specific theorem mentioned, and along the way introduce a general heuristic. Finally, I'll discuss a result which at first glance appears to be a counterexample to that heuristic, but which I'll argue actually conforms to it when properly analyzed.


First off, let me address your question about "soft analysis" in reverse mathematics. Classical reverse mathematics works exclusively in the framework of second-order arithmetic (basically: reals and naturals only), and as such many natural results in analysis can't even be studied directly - and even ones which can be require careful phrasing (see below for the Cantor-Bendixson theorem in particular).

That said, more recently the framework of higher-order reverse mathematics has been investigated. The original paper (ignoring some pre-history by Friedman and Harnik) is by Kohlenbach, with more recently some interesting work in analysis being performed by Sanders and Normann (e.g. here), and I also contributed to the topic (see the second-to-last paragraph of this answer). Probably the Sanders-Normann work is most relevant here: they indeed show that a lot of analysis has surprisingly power from a reverse-mathematical standpoint.


Having addressed the softer aspect of your question, let me turn to the specific example you consider:

Why should we expect the Cantor-Bendixson theorem to be at the level of $\Pi^1_1$-$\mathsf{CA_0}$?

The first hint that the Cantor-Bendixson theorem might be surprisingly powerful actually comes from a quick analysis of the second (ordinals-free) proof. Specifically, the point is that determining whether a set is countable involves an existential query over maps from that set to $\omega$. This looks "naively second-order," and so $\Pi^1_1$-$\mathsf{CA_0}$ is a decent naive guess. And this stands up to scrutiny when - per the above - we more carefully formulate the theorem within the right language as follows. First, we pin down the relevant concepts:

  • "Closed set" = subtree of $\omega^{<\omega}$ (with points thought of as paths through the tree).

  • Countable sets are represented by $\omega\times\omega$-arrays of natural numbers, with each element of the former being a "row" in the latter.

We can now precisely state the theorem in our more restrictive framework:

  • Cantor-Bendixson theorem ($\mathsf{CBT}$): For every tree $K\subseteq \omega^{<\omega}$, there is a tree $P\subseteq\omega^{<\omega}$ and a countable set $C\in \omega^{\omega\times\omega}$ such that

    • $P\subseteq K$,

    • $P$ is perfect (= $P$ has no dead ends and each node on $P$ has a splitting extension),

    • Every "row" of $C$ is a path through $K$, and

    • Every path through $K$ is either a subset of $P$ or a "row" of $C$.

The first through third conditions are arithmetic. The fourth condition, however, is $\Pi^1_1$, and so the whole construction $K\mapsto (P,C)$ "looks $\Pi^1_1$." As such we should expect it to require $\Pi^1_1$-comprehension. A quick look at the proof shows that $\Pi^1_1$-$\mathsf{CA_0}$ does indeed work as an upper bound.

Now why should we expect that natural upper bound to be sharp?

From a reverse-mathematical standpoint, this is where ordinals come in: often transfinite recursion can be used to improve (in terms of logical complexity, not length or clarity) an argument based on higher-type objects. More concretely, some results which have a naive upper bound of $\Pi^1_1$-$\mathsf{CA_0}$ can in fact be proved in $\mathsf{ATR_0}$ or similar. As such, what we'll do is "ordinalify" the argument above, and then observe that it still falls into the "$\Pi^1_1$-like" category, thus providing further evidence in favor of sharpness (and ultimately a useful hint towards the proof).

Of course this takes for granted that $\mathsf{ATR_0}$ is in fact weaker than $\Pi^1_1$-$\mathsf{CA_0}$. This is definitely nontrivial. In particular, coming from computability one might naively expect an argument that $\mathcal{O}$ is a member of every $\omega$-model of $\Pi^1_1$-$CA_0$ but not every $\omega$-model of $\mathsf{ATR_0}$; however, while the latter claim is true the former claim can be disproved via Gandy's basis theorem. A correct proof can be found in Simpson's book if you're interested.

The key observation is that there is a precise relationship between $\Pi^1_1$ sets and transfinite recursion. Since well-foundedness is $\Pi^1_1$-complete, a $\Pi^1_1$ (description of a) set of reals $A$ corresponds to a continuous map sending a real $r$ to a tree $T_r$ such that $T_r$ is well-founded iff $r\in A$. We have a notion of rank for trees, with the rank of a tree being a countable ordinal iff that tree is well-founded (and $\infty$ otherwise). This lets us build $A$ in "layers" as $$A=\bigcup_{\alpha<\omega_1}\{r: rk(T_r)<\alpha\}.$$ Computing the $\alpha$th "layer" of $A$ essentially only requires "recursion along $\alpha$."

This leads to an informal trichotomy between $\Pi^1_1$ constructions:

  • Long recursion. At each countable ordinal we add new things to $A$.

  • Short recursion. There is some countable ordinal $\eta<\omega_1$ by which the enumeration of $A$ stops, and moreover $\eta$ is "concretely given" by (the description of) $A$ itself.

  • Medium recursion. While the enumeration of $A$ does stop before $\omega_1$, there's no obvious way to whip up a bound on the stopping time just from the description of $A$ alone.

For example, the Cantor-Bendixson-derivative argument is clearly not a long recursion, since the whole point is that it stops long before $\omega_1$. However, the proof of this fact is rather nonconstructive, and so after some thought we should categorize it as a medium recursion.

By contrast, consider the principle $\mathsf{CWO}\equiv$ "Any two well-orderings are comparable." The "naive" proof of this is again $\Pi^1_1$-flavored since we quantify over possible embeddings. However, when we look more closely at it we see that we can in fact construct the desired comparison map between well-orderings $\alpha$ and $\beta$ via a recursion of length more-or-less $\min\{\alpha,\beta\}$ . So this is an example of a short recursion, and indeed it turns out that $\mathsf{CWO}$ is equivalent to $\mathsf{ATR_0}$.

And it turns out that this is a pretty decent rule of thumb:

"Length-to-strength" thesis: short recursions generally work in $\mathsf{ATR_0}$, while medium recursions generally require $\Pi^1_1$-$\mathsf{CA_0}$.

(Long recursions are a whole other deal, and best ignored for now.)

So basically the intuition behind "$\mathsf{CBT}\equiv \Pi^1_1$-$\mathsf{CA_0}$" is the following:

  • The naive argument requires $\Pi^1_1$-$\mathsf{CA_0}$.

  • When we dive into the ordinal hierarchy underlying the relevant $\Pi^1_1$ set (that is, when we look at the Cantor-Bendixson argument), we seem to get only a medium recursion.

  • So $\Pi^1_1$-$\mathsf{CA_0}$ is probably sharp.


There's an important near-counterexample to the "length-to-strength" thesis above; since it's instructive and also dear to my heart (it drove my very first paper), I'll end this answer by saying a bit about it.

The near counterexample is the equivalence between clopen and open determinacy. As with $\mathsf{CWO}$, the "naive" higher-type proof of clopen determinacy ("color each game state blue iff player $1$ has a winning strategy from that point on and red otherwise, and look at the color of the root") is refined into a short recursion (basically, bounded by the length of the Kleene-Brouwer order of the game tree). As such, it's unsurprisingly equivalent to $\mathsf{ATR_0}$.

At a glance, however, the similar refinement of the proof of open determinacy seems to be a medium recursion since some game states never get "ranked." So we would expect open determinacy to be equivalent to $\Pi^1_1$-$\mathsf{CA_0}$, or at least strictly more complicated than $\mathsf{ATR_0}$. This is further bolstered by the fact that every computable clopen game has a hyperarithmetic winning strategy but there are computable open games without hyperarithmetic winning strategies.

However, it turns out that in fact clopen and open determinacy are equivalent. The usual proof of this is rather technical and goes through the technical principle $\Sigma^1_1$-$\mathsf{Sep}$; here's a sketch of a simpler proof (which I think is original):

Suppose $G$ is an open game with no winning strategy for player Closed. Let $T$ be the tree of surviving strategies for Closed - that is, a node of length $n$ on $T$ consists of a set $\sigma$ of responses for Closed to the lexicographically-first-$n$-many sequences of possible moves for open in $G$ such that $\sigma$ doesn't yet lose. Since $G$ has no winning strategy for Closed we have that $T$ is well-founded. Let $L$ be some ordinal "sufficiently greater" than the rank of $T$, and consider the variant $G_{L}$ of $G$ where in addition Open has to play a decreasing sequence of points in $L$. Since $T$ is well-founded, $G_{L}$ happens to be clopen: eventually Open runs out of elements of $L$ to play.

Now apply clopen determinacy to $G_{L}$ to get a winning strategy $\Sigma$ for one player or the other. A winning strategy for Open in $G_{L}$ yields a fortiori a winning strategy for Open in $G$, so we may WLOG assume that $\Sigma$ is a Closed-winning strategy. But $\Sigma$ yields a fortiori, not quite a winning strategy for Closed in $G$, but a subtree of $T$ of rank ... in the vicinity of $L$, which exceeds the rank of $T$. Whoops.

So what? Well, note that the key "length parameter" $L$ was constructed pretty explicitly from the itself-explicitly-constructed tree $T$. This has the effect that we can frame the entire argument as a short recursion! So the short=$\mathsf{ATR_0}$/medium=$\Pi^1_1$-$\mathsf{CA_0}$ idea isn't really going away, it's just more subtle here.

In fact, with some tweaking the above argument can be refined into uniform reduction. There are computable procedures $\mathbb{C}$ and $\mathbb{A}$ with the following property: the procedure $\mathbb{C}$ takes an open game $G$ and produces an open game $F$ such that for every $\Sigma$ which is either a witness to the non-clopen-ness of $F$ or a winning strategy for $F$, the procedure $\mathbb{A}$ turns $G\oplus \Sigma$ into a winning strategy in $G$. This is thematically related to results of Kihara/Marcone/Pauly, but as far as I know is not a direct consequence of them.

A seemingly-mundane point which is surprisingly crucial here is that the set of possible game states is no larger than the length of the game. When we drop this - e.g. by looking at length-$\omega$ games played on $\mathbb{R}$ - the equivalence between clopen and open determinacy breaks down! See me, Hachtman, or Sato. Interestingly, these three papers use fundamentally different techniques: forcing, fine structure theory, and proof theory, respectively. A natural further variant is to look at "long" games, but it turns out that this gets very thorny: for example, determinacy of games on $\{0,1\}$ of length $\omega_1$ is outright inconsistent with ZF, and more generally even going a bit beyond length $\omega$ leads to deep set-theoretic topics. So from the point of view of reverse mathematics, this seems a much less natural shift.

Related Question