[Math] Why use ZF over NFU

alternative-set-theoriesset-theory

Forgive me if this question is quite naïve; I've studied axiomatic set theory in the context of ZF, but my knowledge of NF(U) goes little beyond its axioms, what it means for a formula to be stratified, and stuff that I've read on websites here and there.

What makes NF appeal to me (more so than ZF) is that it uses fewer axioms and it resolves Russell's paradox in a way which better matches my intuition of how I think a 'set' should behave; certainly it's more intuitive than Foundation+Separation+Replacement+… in ZF. The downside of NF is that it proves $\neg$AC, which is a shame. But by adding urelements, which I can almost force myself to accept, we get NFU, which is:

  • Consistent;
  • Consistent with the Axiom of Choice;
  • Consistent with the Axiom of Infinity;
  • More intuitive than ZF;

And according to this page, NFU can "safely be extended as far as you think ZFC can be extended".

Now ZF(C) has its advantages, constructing new sets from old ones and whatnot, but it still hasn't been proved to be consistent. Wikipedia [citation needed] says: "A common argument against the use of NFU as a foundation for mathematics is that our reasons for relying on it have to do with our intuition that ZFC is correct." $-$ is that all there is to it?

My question is:

Why is ZF(C) the paradigm under which 'mathematics' is done, rather than NFU(+Inf)(+AC), which we know to be consistent?

Best Answer

I've never studied NF or NFU in any great detail, but I have found some points rather subtle and potentially not worth the effort to work around.

  1. Stratification. NF cannot escape its roots in Russellian type theory. This means that it is difficult to directly compare sets of different "types". For example, consider the "maps" $X \to \mathscr{P}(X)$ used in Cantor's theorem. This is in some sense illegal in NF: you can only have maps between sets of the same type, and the type of $\mathscr{P}(X)$ is one higher than that of $X$. The standard NF circumlocution here is to talk about maps $\mathscr{P}_1 (X) \to \mathscr{P} (X)$ instead, where $\mathscr{P}_1 (X)$ denotes the set of singleton subsets of $X$. Cantor's proof carries through without a problem once this workaround is adopted... but one has to realise that this is a different theorem. In particular, if $V$ is the universal set, we can only prove that there is no surjection $\mathscr{P}_1 (V) \to \mathscr{P}(V)$ – a far cry from the indubitable fact that $\mathscr{P}(V) \subseteq V$! (In fact, if you have even one urelement, then $\mathscr{P}(V) \subsetneq V$.)

  2. Failure of cartesian-closedness. Consider the evaluation "map" $\textrm{ev} : Y^X \times X \to Y$. The graph of this "map" cannot be a set in NF for various reasons. [1, 2] Roughly speaking, if $X$ is a set of type $n$, then $Y$ must also be a set of type $n$, in which case $Y^X \subseteq \mathscr{P}(X \times Y)$ is a set of type $n + 1$; but then we are unable to write down a stratified definition of $Y^X \times X$. Thus, the category of NF sets is not cartesian closed!

  3. Local types. What I have said so far appears to contradict the (finite) axiomatisation given by Holmes [3], which says e.g. that you can always form the set $X \times Y$ or $Y^X$. But this is only an apparent contradiction. The fact of the matter is that sets in NF do not actually have a type. I like to think of NF as being "locally typed": sets only gain types when they interact with other sets. Thus, $Y^X \times X$ exists (in some sense), but we cannot make much use of it. Holmes writes,

    Another fact about stratification restrictions should be noted: a variable free in $\phi$ in a set definition $\{ x \mid \phi \}$ may appear with more than one type without preventing the set from existing, as long as this set definition is not itself embedded in a further set definition. The reason for this is that such a set definition can be made stratified by distinguishing all free variables: the resulting definition is supposed to work for all assignments of values to those free variables, including those in which some of the free variables (even ones of different type) are identified with one another. For example, the set $\{ x, \{ y \} \}$ has a stratified definition; the set $\{ x, \{ x \} \}$ has an unstratified definition, but the existence of $\{ x, \{ y \} \}$ for all values of $x$ and $y$ ensures the existence of $\{ x, \{ x \} \}$ for any $x$. But a term $\{ x, \{ x \} \}$ cannot appear in the definition of a further set in which $x$ is bound.

  4. Weaknesses. Holmes [4] points out that bare NFU has the same consistency strength as PA, and NFU + Infinity + AC has the same consistency strength as Mac Lane set theory (MAC) (i.e. Zermelo set theory but with only $\Delta_0$-separation). It is true that in many cases we only need $\Delta_0$-separation to form the sets we want – but when induction seeps into the picture we have to start worrying.

    Consider the vector space $V$ of all real polynomials. There is no doubt that we can form the dual space $V^* = \textrm{Hom}_\mathbb{R} (V, \mathbb{R})$ in MAC – but Mathias [5] says that MAC cannot prove "the $n$-th iterated dual space exists for all natural numbers $n$". This is because the cardinalities of the iterated dual space grow too fast – after all, $V_{\omega + \omega}$ is a model of MAC. This seems contrived, but it is enough for me to start worrying about whether important inductive constructions can be carried out in MAC. And if MAC is not strong enough to do this bit of mathematics, why should NFU + Infinity + Choice be? (The two theories are not biinterpretable, so it's not literally true that they prove the same theorems.)

Personally, I'd rather have unrestricted comprehension. But then I may as well ask for a pony...