Two-sorted first-order set theory

first-order-logiclogicmodel-theoryset-theory

I'm studying mathematical logic, and now I'm beginning to see Morse-Kelly set theory. The book I'm using first introduces this theory using the simple language $\mathcal{L}_\in$. In particular the first axiom is the Set-Existance axiom:
$$\exists x \exists y (x \in y)$$
Then it says that, and I quote

It is possible to formulate MK in a two-sorted language. Formally we have a binary relation symbol $\in$ and two disjoint lists of variables $x_0,x_1,x_2,\dots$ for sets, and $X_0,X_1,X_2,\dots$ for classes.

Then it says that the Set-Existance axiom follows logically from $$\exists x (x = x)$$ but on the other hand we'd need to add the "Set are Classes" axiom: $$\forall x \exists X (x = X)$$

Now, my problem with this is that nowhere in the book is formalized the concept of two-sorted first-order language, and, when used, it was always presented as an informal short-cut to keep things clean when dealing with structures whose domain is not homogeneous (like Vector Spaces), and to avoid the abuse of unary predicates meant to distinguish the various types of objects in the domain. And this is ok to me, but in this case it seems to me that the book is relying too heavly on this notation, almost suggesting that there is a sort of "interpretation" of variables as there is for constants.

The formalization of the above formulae, by introducing the appropriate unary predicates, would obliterate the usefulness of this language, so I don't quite get why making all this arguments based on a pile of sand. Am I missing something?

Best Answer

Two-sorted logic is not an "informal short-cut", it is actually a different way of setting up formal logic. In particular, the two different kinds of variables actually matter to the semantics.

So, if you set up Morse-Kelly set theory as a two-sorted theory, then a model is not just a set $M$ together with a relation $\in$ but rather a tuple $(M,S,C,\in)$ where $M$ is a set, $S$ and $C$ are subsets of $M$ (the sets and classes) whose union is $M$, and $\in$ is a binary relation on $M$. When we interpret formulas in such a structure, the set variables get interpreted as elements of $S$ and the class variables as elments of $C$. In other words, if $x$ is a set variable, then $M\models\forall x[\varphi(x)]$ is defined as $\forall a\in S, M\models \varphi(a)$, and similarly for class variables and existential quantifiers. Similarly, if you're studying proofs, then every variable has a sort and all the deduction rules need to respect these sorts, so for instance when you use universal instantiation you are only allowed to replace the quantified variable with another variable of the same sort. As a result, proofs will be sound with respect to this semantics.


Now a warning: the setup above is not quite the usual definition of multisorted logic. Usually in multisorted logic, the sets for different sorts are required to be disjoint (so $S$ and $C$ would be disjoint), and all the function and relation symbols must specify (as part of their "arity") the sorts of their inputs (and output, for functions). So for instance, to talk about something like modules over a ring, you would have a language with two sorts $R$ (the ring) and $M$ (the module), and for each of the different function symbols you would specify their input and output; for instance, there would be both an addition function $+_R$ which takes two inputs in $R$ and gives an output in $R$ and an addition function $+_M$ which takes two inputs in $M$ and gives an output in $M$, and there would be a scalar multiplication function $\cdot_M$ which takes an input in $R$ and an input in $M$ and gives an output in $M$. A model would then be a tuple $(X,R,M,+_R,+_M,\cdot_M,\dots)$ (there are a few more functions needed which I've omitted) where $R$ and $M$ are disjoint sets whose union is $X$, $+_R$ is a function $R\times R\to R$, $+_M$ is a function $M\times M\to M$, and so on. Moreover, in this language, not only does each variable have a type, but each term has a type: for instance, if $r$ and $s$ are $R$-variables, then $+_R(r,s)$ is an $R$-term, but $+_M(r,s)$ is not a well-formed term at all since we can't input $R$-variables into $+_M$.

This usual definition doesn't work so well for Morse-Kelly because we'd like to be able to talk about sets being equal to classes, and sets being elements of both sets and classes. To use disjoint types, in a model we would need to have, for each set, a "copy" of the set which is a class with the same elements but is not literally equal. We would also need to have two different element relations $\in_S$ and $\in_C$, where $\in_S$ is a relation between two sets and $\in_C$ is a relation between a set and a class. We couldn't actually talk about sets and classes being equal (since they never are!), and to say something like $\forall x \exists X (x = X)$ we would need to say that $x=X$ is actually just an abbreviation for $\forall y(y\in_S x\Leftrightarrow y\in_C X)$.