If set theory only contains the notions of “set” and “is a member of” as primitives, how can an axiom of set theory refer to a “formula”

axiomslogicmeta-mathset-theory

It's said that the primitive concepts of set theory are those of "set" and "membership", then all axioms of set theory must begin with "Let $A$ be a set" or "Let $x\in A$", but they don't. For example, let us consider the subset axiom:

Subset Axiom. Let $\varphi(x)$ be a formula and let $A$ be a set. Then there exists a set $S$ such that for all sets $x$ we have that $x\in S$ if and only if $x\in A$ and $\varphi(x)$.

This axiom begins with "Let $\varphi(x)$ be a formula" but "formula" is not a primitive concept, then, for having sense, it must be a defined concept, but as far as I see we can't define "formula" in terms of "set" and "membership" if I am wrong, please tell me. Now, in the case where we can't define "formula", how is the subset axiom justified from a logical point of view?

In many books, when the subset axioms is introduced, the statement "Let $\varphi(x)$" is used informally, I will appreciate if you recommend me a book on set theory where the concept of "formula" is used formally, where there is a formal definition of what a formula is.

Thank you for your reading.

Best Answer

In formalizations of set theory (e.g. ZFC), this subset axiom you've described in usually called the axiom schema of specification. Note the use of the word "schema" in its name: this is where your confusion stems from. Indeed, formally speaking, this is not a single axiom, but a whole collection of axioms: one for every formula $\phi(x)$. For every formula $\phi(x)$, there is an axiom of set theory which goes "$\forall A \exists S \forall x (x \in S \Leftrightarrow (x \in A \wedge \phi(x)))$". Taken together, all these axioms make up what you call the subset axiom.

Edit: Axiomatic set theory is based on first-order logic, which is the system of language and inferences that are used to do this sort of math. It starts with symbols (including variables, connectives like $\neg$ and $\wedge$, quantifiers $\forall$ and $\exists$, the $=$ symbol, and predicate symbols) which are combined according to certain rules to make formulas. First-order logic also uses rules of inference to deduce some formulas from others. Set theory uses first-order logic with the predicate symbol $\in$; its axioms and theorems are formulas. I suggest reading some more (on Wikipedia or in your favorite textbook) if you're interested in learning more about how first-order logic works.

Edit 2 (re: comments): I think you misunderstand the distinction between the mathematical universe and the syntax used to describe it. The mathematical universe is only composed of sets; it does not contain symbols. Symbols are what make up formulas, and formulas are what we use to describe the mathematical universe and assert things about it.

"Set theory" is, in a way, a particular usage of FOL. The language of FOL is built out variables, connectives, and quantifiers (these are the logical symbols), as well as constants, functions, and predicates (these are the non-logical symbols). (See Wikipedia.) You can't change the logical symbols of the language, but you're free to select what symbols you want to use for constants, functions, and predicates. Set theory is a specific use case of this, where there are no constants or functions, and there is only one predicate: $\in$. If we are given axioms, then we can use them to prove theorems using the FOL deduction system. The things we can prove using the set theory axioms are "theorems of set theory".

Another important note is the domain of discourse. Informally, the domain of discourse is what we quantify over. When we write something like $\exists x (\forall y \neg (y \in x))$, what are $x$ and $y$? We say that they belong to the domain of discourse. In set theory, we say that the domain of discourse is the universe of sets. But, importantly, if we're just using FOL, the domain of discourse isn't something we ever "refer to". If you think of logic as just a game of manipulating strings of symbols, then the domain of discourse doesn't really exist. (It becomes useful when we start talking about model theory, but that's another story.)

The point is, "set theory" just refers to the particular case of FOL where we have one 2-ary predicate $\in$, and we assume the set theoretic axioms. The primitive concept of membership appears as the predicate $\in$; the primitive concept of set doesn't really appear at all: it's just the name we give to things that refer to the domain of discourse.

So, that's the mathematical universe. It's things we call sets, which we make statements about using a particular kind of FOL.

But, if we want to start talking about FOL itself, we're firmly outside of the mathematical universe. (You might call it "meta-mathematical".) The description of FOL involves variables and symbols and formulas and such, but we're no longer talking about sets; we are talking about the form of our language. By its nature, we can only talk about such things informally. Of course, you could use a formal logical language to talk about FOL (and in fact we do when we do model theory!), but from a foundations perspective, this just creates an infinite regress problem.

So, whenever you're building foundations for math, the logical system (FOL or another) has to be built "from scratch". Formulas and symbols are not defined in terms of "set" or "membership"; you can only give somewhat informal definitions.

To go back to your original question: the subset axiom as you have written it is not "an axiom". Here is perhaps a better way of phrasing it. Given any formula $\phi(x)$ (of FOL), we have the following axiom of set theory.

Subset Axiom for $\phi(x)$. Let $A$ be a set. Then there exists a set $S$ such that for all sets $x$, we have that $x \in S$ if and only if $x \in A$ and $\phi(x)$.

For any formula $\phi(x)$, say for instance $\forall y ¬(y \in x)$, there is a corresponding subset axiom. And, for any $\phi(x)$, we can formalize the subset axiom for $\phi(x)$ as a sentence in FOL: it is $\forall A \exists S \forall x (x \in S \Leftrightarrow (x \in A \wedge \phi(x)))$. However, you cannot formalize "for any formula $\phi(x)$, the subset axiom for $\phi(x)$ holds" as a sentence in FOL.

Fundamentally, in your statement of the subset axiom, "let $\phi(x)$ be a formula" and "let $A$ be a set" are very different, despite seeming very similar. The latter can be formalized into part of an FOL sentence ($\forall A$), while the former cannot. Perhaps this was the source of your confusion.

Related Question