[Math] Why doesn’t this definition of natural numbers hold up in axiomatic set theory

elementary-set-theory, foundations

I was reading about older definitions of the natural numbers on Wikipedia here (in retrospect, not the best place to learn mathematics) and came across the following definition for the natural numbers: (paraphrased)

Let $\sigma$ be a function such that for every set $A$, $\sigma(A) := \{ x \cup \{ y \} \mid x \in A \wedge y \notin x \} $. Then $\sigma(A)$ is the set obtained by adding any new element to all elements of $A$. Then define $0 := \{ \emptyset \}$, $1 := \sigma(0)$, $2 := \sigma(1)$ et cetera.

The way I understood this definition is that the natural number $n$ is "defined" as the set of all sets with exactly $n$ elements. This sounded straightforward to me, until I read the next paragraph:

This definition works in naive set theory, type theory, and in set theories that grew out of type theory, such as New Foundations and related systems. But it does not work in the axiomatic set theory ZFC and related systems, because in such systems the equivalence classes under equinumerosity are "too large" to be sets. For that matter, there is no universal set V in ZFC, under pain of the Russell paradox.

Why exactly doesn't this definition work in ZFC? I don't fully understand how the sets in this definition are "too large". Is part of the problem just that there is no "universal set" to pick the element $y$ from?

I tried to do some more reading to find my answer, but the material was way out of my depth. (I am only familiar with the basics of set theory, Russell's paradox, Cantor's diagonal argument, and not much more.) So I apologize in advance if this is a really simple question…

Best Answer

"Too large" is an informal description of what goes wrong, and it does not really explain why ZFC doesn't allow you to do this.

It would be more honest to describe the problem as:

There is simply no axiom in ZFC that will let you conclude that the notation $\{ x \cup \{y\} \mid x\in A, y\notin x\}$ describes any set that exists.

Remember that ZFC doesn't support free-wheeling use of the set builder notation, which assumes that $\{y\mid \phi(y) \}$ (where $\phi$ is some logical formula) always describes a set. Instead you have only Separation, which tells you that expressions of the form $\{y\in A\mid \phi(y)\}$ describe sets, and Replacement, which tells you that expressions of the form $\{F(y)\mid y\in A\}$ -- where $F$ is some function that you can define by a logical formula -- are sets.

However, $\{ x \cup \{y\} \mid x\in A, y\notin x\}$ has neither of these forms -- instead it would fit the scheme $\{ F(y) \mid \phi(y) \}$, and neither Separation nor Replacement promises to work for that situation.
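
To make the mismatch concrete, membership in the proposed set can be unfolded as follows (the variable $z$ is my own notation, not part of the original definition): $$ z\in\sigma(A) \iff \exists x.\exists y.(x\in A \land y\notin x \land z = x\cup\{y\}) $$ Here $x$ is confined to the set $A$, but $y$ ranges over everything outside $x$, and nothing confines it to a set already known to exist. Separation and Replacement both need such a bounding set.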

(If you haven't seen the axioms of ZFC written down, it would probably help your understanding to seek out an explanation of them. In particular, what I describe as, for example, "$\{y\in A\mid \phi(y)\}$ exists" is formally described by saying that for any formula $\phi(y)$ that doesn't contain $x$, the formula $$ \exists x.\forall y.(y\in x \iff y\in A \land \phi(y)) $$ is an axiom).
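
For comparison, the Replacement counterpart could be sketched like this, writing $\phi(y,z)$ for a formula expressing "$z = F(y)$" (this notation is my assumption, and textbooks vary in the exact formulation): for any formula $\phi(y,z)$ that doesn't contain $x$, the formula $$ \forall y.(y\in A \implies \exists! z.\phi(y,z)) \implies \exists x.\forall z.(z\in x \iff \exists y.(y\in A\land\phi(y,z))) $$ is an axiom. Note that here too the variable $y$ is tied to the given set $A$.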


The "too large to be a set" remark is at most a hint at an answer to a different question, namely:

  • Why can't we just have some more axioms that say we can do this?

The answer to this is that we can actually prove that $\{x\cup \{y\}\mid x\in A, y\notin x\}$ does not exist whenever $A$ is nonempty (which is different from merely not being able to prove that it does) -- so if we had an axiom that claimed that it did exist, the system would become inconsistent.

In more detail, the proof might go: Suppose that for some nonempty set $A$, $$\sigma(A) = \{x\cup \{y\}\mid x\in A, y\notin x\}$$ exists, and pick some $x_0\in A$. Then $\bigcup A \cup \bigcup\sigma(A)$ -- which must exist due to ZFC's explicit Axiom of Union -- would contain every set $z$ whatsoever: if $z\in x_0$ then $z\in\bigcup A$, and if $z\notin x_0$ then $x_0\cup\{z\}\in\sigma(A)$ and so $z\in\bigcup\sigma(A)$. In other words, this would be a set of all sets, and then Russell's paradox (via Separation) would lead us to a contradiction. (For $A=\emptyset$ nothing goes wrong, since $\sigma(\emptyset)$ is just $\emptyset$, but the definition needs $\sigma(0)$ with $0=\{\emptyset\}$.)
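
A finite toy model may help to see how the unions swallow everything. The sketch below is only an analogy under my own assumptions: the integers in `U` are stand-in atoms for "all sets", and the names `sigma`, `big_union`, `recovered` and `U` are hypothetical helpers, not anything from ZFC or from the Wikipedia article. The extra `universe` argument is exactly the bounding set that the ZFC version lacks.

```python
from itertools import chain

def sigma(A, universe):
    """Finite stand-in for sigma(A): each x in A gains one y from `universe` with y not in x."""
    return frozenset(x | {y} for x in A for y in universe if y not in x)

def big_union(S):
    """Union of all members of the family S."""
    return frozenset(chain.from_iterable(S))

def recovered(A, universe):
    """(union of A) together with (union of sigma(A)), as in the argument above."""
    return big_union(A) | big_union(sigma(A, universe))

U = frozenset(range(5))              # hypothetical finite universe of atoms
zero = frozenset({frozenset()})      # 0 := {∅}
one = sigma(zero, U)                 # the 1-element subsets of U
two = sigma(one, U)                  # the 2-element subsets of U
A = frozenset({frozenset({0, 1})})   # some nonempty A

print(len(one), len(two))            # these sizes depend on the choice of U
print(recovered(A, U) == U)          # the unions give back all of U
```

However large you make `U`, `recovered(A, U)` returns all of it. In ZFC there is no `U` to pass in, so the same construction would have to return a set of all sets.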

Presenting only this second argument, without explaining (or stressing) the first one, is a common failing of semi-popular descriptions of set theory. It can easily give a reader the impression that something must be allowed unless we can see it leads to a paradox, which is most definitely not how axiomatic set theory works. Axiomatic set theory works by saying from the beginning, "these are the things that are allowed" and then hoping no combination of those things leads to a paradox.

The only real value of the "if you could do this, it would lead to a paradox" argument is that once you see it you can stop trying to figure out a way that it is allowed.
