[Math] Difference between model and interpretation

logic, set-theory

In the book Mathematical Logic by J. Shoenfield, the author uses the concept of an interpretation of set theory to prove consistency results, while the other texts on set theory (e.g. Kunen and Jech) use models of set theory. What is the difference between these two concepts? I'm quite confused, because of Gödel's second incompleteness theorem and Tarski's undefinability of truth for class models.

Best Answer

I'm not a professional set theorist, nor have I worked extensively on the subject, so what I say here may be inexact. If you're looking for a reference, I'd recommend the section "Models of Set Theory" in Kunen's Set Theory, which seems to be a decent introduction to this type of discussion (cf. in particular p. 99); another good source is Hinman's Fundamentals of Mathematical Logic, in particular chapter 6. I'm going to assume you're familiar with the basic definition of an interpretation; if necessary, I can add that later (you may also check, besides Shoenfield himself, Hinman's book, section 2.6).

Let's start with the basic difference between a theory being relatively interpretable in $\mathsf{ZFC}$ and there being a model of the theory. So, first, what does it mean to say that a theory $T$ is relatively interpretable in $T'$? Very roughly (again, you can find precise definitions in the above sources), it means that you can define the basic vocabulary of $T$ in $T'$ and show that $T'$ thus extended proves, for each axiom $\phi$ of $T$, the relativization $\phi^{(I)}$ of $\phi$. It follows that, for every theorem $\phi$ of $T$, $\phi^{(I)}$ is also a theorem of $T'$ (this is what Shoenfield calls the Interpretation Theorem; cf. p. 62), whence if $T$ is inconsistent, so is $T'$ (by contraposition, if $T'$ is consistent, so is $T$). If you've read Shoenfield's book, this should be clear enough. Note that this notion of interpretation is purely syntactic and doesn't involve any idea of a model.
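To make the last step explicit, here is a short sketch of why an interpretation yields relative consistency. It assumes, as is standard (the precise clauses are in the sources above), that relativization commutes with the propositional connectives; the sentence $\psi$ is just an arbitrary sentence of $T$ that I'm using to write down a contradiction, and $T'$ may have to be read as its extension by definitions, which is conservative over $T'$.

$$
\begin{aligned}
T \vdash \psi \wedge \neg\psi
&\;\Longrightarrow\; T' \vdash (\psi \wedge \neg\psi)^{(I)} && \text{(theorems of $T$ translate to theorems of $T'$)}\\
&\;\Longrightarrow\; T' \vdash \psi^{(I)} \wedge \neg\,\psi^{(I)} && \text{(relativization commutes with $\wedge$ and $\neg$)}\\
&\;\Longrightarrow\; T' \text{ is inconsistent.}
\end{aligned}
$$

Every step here is a manipulation of proofs and formulas; no structure satisfying anything is ever mentioned.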

What about the idea of a model of a theory? Now, at the risk of being inaccurate, it seems to me that the basic picture is the following. In general, when one talks about a model of a given theory, one is talking about a given set which satisfies the theory, e.g. the set $\mathbb{N}$ (with suitable operations) may be taken to be a model of Peano Arithmetic. But what does it really mean to say that a set satisfies the theory? Roughly, it usually means that $\mathsf{ZFC}$ proves that there is a set satisfying the theory. In particular, in the case of Peano Arithmetic, it means that (a) $\mathsf{ZFC}$ proves that, for every formula of Peano Arithmetic, there is a corresponding set which can represent it (a kind of Gödel coding or Gödel set); (b) for each axiom of Peano Arithmetic, $\mathsf{ZFC}$ also proves that there is a corresponding set which represents it and that this set is an axiom of Peano Arithmetic; (c) $\mathsf{ZFC}$ proves that there is a set which satisfies the set of all axioms of Peano Arithmetic.

But, by the definition of interpretation, isn't this just to show that Peano Arithmetic is interpretable in $\mathsf{ZFC}$? It seems so, for, according to (a) and (b), the basic vocabulary of Peano Arithmetic can be defined in $\mathsf{ZFC}$ and, by (c), every axiom of Peano Arithmetic will also be true relative to a certain set in $\mathsf{ZFC}$, so of course their relativizations to this set will also be provable in $\mathsf{ZFC}$. However, here we should be aware of a crucial difference: an interpretation does not require anything as strong as clause (c). If you analyze clause (c) carefully, you'll notice that it basically requires the following. Let $\mathrm{Ax}(x)$ be a formula defining "is an axiom of Peano Arithmetic" inside $\mathsf{ZFC}$ and let $M(x, y)$ stand for the relation "$x$ is a model of $y$" (this is all definable inside $\mathsf{ZFC}$; for details, check out Hinman's book, section 6.6). Given this, what clause (c) requires is something like $\mathsf{ZFC} \vdash \exists y \forall x (\mathrm{Ax}(x) \rightarrow M(y, x))$, i.e. a single set $y$ that is a model of every axiom at once. Crucially, however, an interpretation does not require this; what it does require is simply that we can prove, in the meta-theory, that for each axiom $\phi$ of $T$, $T' \vdash \phi^{(I)}$ (thus, theoretically, the theory $T'$ does not even need to have the necessary resources to express the predicate "is an axiom of $T$"). This is a very important point: the difference between a theory being interpretable in $\mathsf{ZFC}$ and a theory having a model in $\mathsf{ZFC}$ is that the former is proven in the meta-theory, while the latter is proven inside the theory. That is, speaking figuratively, if a theory $T$ is interpretable in $\mathsf{ZFC}$, $\mathsf{ZFC}$ may not "know" that it proves every axiom of $T$ (it may not even know what an axiom of $T$ is), while if a theory has a model in $\mathsf{ZFC}$, then $\mathsf{ZFC}$ itself proves, as a single statement, that there is a set satisfying every axiom of $T$.
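The contrast can be displayed side by side. This is only a sketch in the notation of the paragraph above, and I'm reading clause (c) as asserting a single model of all the axioms at once.

$$
\begin{aligned}
\text{Interpretation (a schema, verified in the meta-theory):}\quad & \text{for each axiom } \phi \text{ of } T:\ \ T' \vdash \phi^{(I)} \\
\text{Model (a single sentence, proven inside } \mathsf{ZFC}\text{):}\quad & \mathsf{ZFC} \vdash \exists y\, \forall x\, \bigl(\mathrm{Ax}(x) \rightarrow M(y, x)\bigr)
\end{aligned}
$$

In the first line the quantifier over axioms sits outside the turnstile, so it yields one theorem of $T'$ per axiom; in the second it sits inside, so it is one theorem of $\mathsf{ZFC}$ that quantifies over all (codes of) axioms.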

We can now tackle the idea of relative consistency proofs. What does it mean to say, e.g., that $\mathsf{ZFC} + \mathsf{CH}$ is consistent relative to $\mathsf{ZFC}$? One way to interpret this is to say that there is a model which is both a model of $\mathsf{ZFC}$ and a model of $\mathsf{ZFC} + \mathsf{CH}$. But if we interpret "there is a model" as "$\mathsf{ZFC}$ proves that there is such a model", then, by the above, this seems to imply that $\mathsf{ZFC}$ proves that there is a model of both theories, whence, a fortiori, $\mathsf{ZFC}$ proves that there is a model of $\mathsf{ZFC}$, which runs afoul of Gödel's second incompleteness theorem (assuming, of course, the consistency of $\mathsf{ZFC}$). Therefore, this can't be the desired interpretation. In that case, what is the desired interpretation?
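Spelled out, the obstacle is roughly the following. The sketch assumes the usual arithmetized consistency statement $\mathrm{Con}(\mathsf{ZFC})$ and the fact that $\mathsf{ZFC}$ proves that a theory with a set model is consistent; $\mathrm{Ax}_{\mathsf{ZFC}}(x)$ is my name for a formula defining "is an axiom of $\mathsf{ZFC}$", and $M(y, x)$ is the relation "$y$ is a model of $x$" from above.

$$
\begin{aligned}
&\mathsf{ZFC} \vdash \exists y\, \forall x\, \bigl(\mathrm{Ax}_{\mathsf{ZFC}}(x) \rightarrow M(y, x)\bigr) && \text{(supposed)}\\
&\mathsf{ZFC} \vdash \exists y\, \forall x\, \bigl(\mathrm{Ax}_{\mathsf{ZFC}}(x) \rightarrow M(y, x)\bigr) \rightarrow \mathrm{Con}(\mathsf{ZFC}) && \text{(soundness, formalized)}\\
&\mathsf{ZFC} \vdash \mathrm{Con}(\mathsf{ZFC}) && \text{(modus ponens)}\\
&\mathsf{ZFC} \text{ is inconsistent} && \text{(Gödel's second incompleteness theorem)}
\end{aligned}
$$

So, if $\mathsf{ZFC}$ is consistent, the supposition in the first line must fail.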

One answer is sketched in Kunen's book (cf. p. 99). What's happening here is that the theory $\mathsf{ZFC} + \mathsf{CH}$ is relatively interpretable in $\mathsf{ZFC}$, so, for each theorem $\phi$ of $\mathsf{ZFC} + \mathsf{CH}$, $\mathsf{ZFC}$ proves $\phi^{(I)}$; thus, if $\mathsf{ZFC}$ is consistent, so is $\mathsf{ZFC} + \mathsf{CH}$. Again, notice that this doesn't mean that $\mathsf{ZFC}$ proves that, for every theorem $\phi$ of $\mathsf{ZFC} + \mathsf{CH}$, $\mathsf{ZFC}$ proves $\phi^{(I)}$! Anyway, for this particular case, one shows that there is a certain class definable in $\mathsf{ZFC}$ (call it $L$) such that, for each axiom $\phi$ of $\mathsf{ZFC} + \mathsf{CH}$, $\mathsf{ZFC}$ proves $\phi^{(L)}$ (i.e. the relativization of $\phi$ to $L$). In other words, one defines in $\mathsf{ZFC}$ a new predicate, $L$, and an interpretation of $\mathsf{ZFC} + \mathsf{CH}$ using this new predicate as our domain. This directly yields a proof of the consistency of $\mathsf{ZFC} + \mathsf{CH}$ relative to $\mathsf{ZFC}$. Notice that this is done purely syntactically: one is not really working with a "class" $L$, but with a predicate defined in a theory (incidentally, that this predicate defines a class is not necessarily problematic: the predicate "is an ordinal" is obviously definable in $\mathsf{ZFC}$, yet it defines a proper class). Similarly, as I mentioned above, the notion of an interpretation only mentions theorems and theories, so it's also syntactic. This bypasses the conundrum above of using models of $\mathsf{ZFC}$, which seems to be problematic. This is Shoenfield's approach, as you can see just from glancing at his section on constructible sets (he proceeds by defining a series of function symbols in $\mathsf{ZFC}$, thus enlarging it to a new theory, and then introducing the $L$ predicate by using an existential quantification over one of the new function symbols).
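For concreteness, relativization to a class predicate can be sketched by recursion on formulas. These are the standard clauses as I remember them (Kunen and Jech give the official versions); $L(x)$ abbreviates the defining formula "$x$ is constructible", and for this interpretation $\in$ and $=$ are left unchanged.

$$
\begin{aligned}
(x \in y)^{(L)} &:\equiv x \in y, \qquad (x = y)^{(L)} :\equiv x = y,\\
(\neg\phi)^{(L)} &:\equiv \neg\,\phi^{(L)}, \qquad (\phi \wedge \psi)^{(L)} :\equiv \phi^{(L)} \wedge \psi^{(L)},\\
(\exists x\, \phi)^{(L)} &:\equiv \exists x\, \bigl(L(x) \wedge \phi^{(L)}\bigr),\\
(\forall x\, \phi)^{(L)} &:\equiv \forall x\, \bigl(L(x) \rightarrow \phi^{(L)}\bigr).
\end{aligned}
$$

With these clauses each $\phi^{(L)}$ is an ordinary sentence in the language of $\mathsf{ZFC}$, so "$\mathsf{ZFC} \vdash \phi^{(L)}$" is a statement about provability and nothing else.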

So why talk about models at all? Notice that, although the notion of interpretation does not involve models, that does not mean one can't think of it in terms of models. In particular, it's not difficult to show that, if $T$ is interpretable in $T'$, then any model of $T'$ can be converted into a model of $T$, so that, if $T'$ has a model, so does $T$ (cf. Hinman's book, section 2.6 for the result). Therefore, instead of just talking about interpretations, one can generally define a set or class $M$ in $\mathsf{ZFC}$ such that, for every axiom $\phi$ of the theory $T$ whose consistency relative to $\mathsf{ZFC}$ we want to prove, $\mathsf{ZFC}$ proves $\phi^{(M)}$. This will show that every model $\mathfrak{A}$ of $\mathsf{ZFC}$ contains a substructure $\mathfrak{M}$ having $M$ (as interpreted in $\mathfrak{A}$) as its universe and such that, for every axiom $\phi$ of $T$, $\mathfrak{M}$ is a model of $\phi$ (equivalently, $\mathfrak{A}$ is a model of $\phi^{(M)}$, i.e. of $\phi$ relativized to the definition of $M$). But this is just a fancy way of talking about interpretations: what one is basically showing is that the theory $T$ is relatively interpretable in $\mathsf{ZFC}_M$, the expansion of $\mathsf{ZFC}$ to the theory which includes the symbol $M$ and its corresponding definition. It seems then that, when writers like Jech and Kunen are talking about class models, etc., they are not strictly speaking talking about models (this is more or less clear in Kunen); rather, they are just using a vivid way of speaking about interpretations. Evidently, this does not mean that there aren't huge differences in exposition between Jech and Kunen on the one hand and Shoenfield on the other, or even in the technical devices used to construct the interpretations. It's just to say that they are all constructing interpretations!

One last note: if memory serves me well, it's also possible to use the Reflection Theorem (for every finite subset of $\mathsf{ZFC}$'s axioms, $\mathsf{ZFC}$ proves that there is a model of this subset) to generate a real model (i.e. a model in the strict sense) of sufficiently many axioms of $\mathsf{ZFC}$ and then use this model to construct further real models of the theory in question. Kunen used this device in his 1980 book (I'm not sure about the new book, I haven't studied it in detail yet), and it's one way of "talking about models" while bypassing the problems outlined above. Of course, it's a bit vague to say "sufficiently many" axioms (which is why my professor preferred to use interpretations directly), but in practice that doesn't matter much.
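The shape of that argument, as far as I can reconstruct it, is sketched below for a theory $T$ whose consistency relative to $\mathsf{ZFC}$ is wanted. The names $\Delta$, $\Gamma$, and the passage from $M$ to $N$ are placeholders of mine: which finite $\Gamma$ is needed, and which kind of set model $M$ must be (e.g. countable and transitive), depends on the construction and is exactly the "sufficiently many" above.

$$
\begin{aligned}
&T \vdash \psi \wedge \neg\psi \;\Longrightarrow\; \Delta \vdash \psi \wedge \neg\psi \text{ for some finite } \Delta \subseteq T && \text{(proofs are finite)}\\
&\mathsf{ZFC} \vdash \forall M\, \bigl(M \models \Gamma \rightarrow \exists N\, (N \models \Delta)\bigr) \text{ for a suitable finite } \Gamma \subseteq \mathsf{ZFC} && \text{(the construction)}\\
&\mathsf{ZFC} \vdash \exists M\, (M \models \Gamma) && \text{(Reflection)}\\
&\mathsf{ZFC} \vdash \exists N\, (N \models \Delta), \text{ hence } \mathsf{ZFC} \vdash \mathrm{Con}(\Delta) && \text{(soundness, formalized)}\\
&\mathsf{ZFC} \vdash \neg\mathrm{Con}(\Delta) && \text{(the finite proof from $\Delta$ can be checked in $\mathsf{ZFC}$)}\\
&\mathsf{ZFC} \text{ is inconsistent.}
\end{aligned}
$$

By contraposition, if $\mathsf{ZFC}$ is consistent, so is $T$. Note that only models of finite fragments are ever claimed to exist, so nothing here conflicts with Gödel's second incompleteness theorem.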

Anyways, I hope the above is not too confusing or a mess of mistakes. As I said, I'm not really an expert on all of this (in fact, this is also a question that vexes me), and I wrote this answer partially so that others can help me spot my own misunderstandings of this problem.