Is an interpretation just a homomorphism between theories

first-order-logiclogicmodel-theoryphilosophy

I don't understand model theory, but I think I've read enough about it that I can pretend to by parroting the language of model theorists. So instead of asking "how does model theory work," I'm going to state what I think I know and why it doesn't make sense, and then you can tell me why I'm wrong. I would also like to propose an alternative conception of model theory that resolves my objections, just in case I'm not as wrong as I think.

The Logical Argument

Let $\mathcal{T}$ be the theory consisting of one nullary function/constant $c$, one binary function $f$, and one binary relation $R$, with the following [non-logical] axioms:

$$\begin{align}
a_1&\qquad\forall x.\neg R(x,x)\\
a_2&\qquad\forall x.\forall y.R(x,y)\implies\neg R(y,x)\\
a_3&\qquad\forall x.\forall y.\forall z.R(x,y)\land R(y,z)\implies R(x,z)\\
a_4&\qquad\forall x.\forall y.f(x,y)=f(y,x)\\
a_5&\qquad\forall x.f(x,c)=x\\
a_6&\qquad\forall x.\exists y.f(x,y)=c\\
a_7&\qquad\forall x.\forall y.\forall z.f(x,f(y,z))=f(f(x,y),z)
\end{align}$$

Let $\mathcal{I}$ be the interpretation $\mathcal{I}(c)=“0"$, $\mathcal{I}(f)=“+"$, and $\mathcal{I}(R)=“<"$. Let $\mathcal{M}=(\mathbb{R},0,+<)$. Then, $\mathcal{M}\vDash \mathcal{T}$.

This means that for any formula $\varphi$, in the language of $\mathcal{T}$, if $\mathcal{T}\vdash\varphi$, then $\mathcal{M}\vDash\varphi$. In other words, the statements which are true of the [linear totally-ordered additive group of] real numbers is a theorem of $\mathcal{T}$ interpreted in the real numbers.

As it is usually stated, an interpretation gives semantic meaning to the expressions of a formal language. In this case, the interpretation treats $\mathcal{T}$ as a theory of real numbers.

Stop me here if I'm wrong.

Objection

Because "the real numbers" as presented above lack any internal structure (we have only the symbols $\mathbb{R}$, $0$, $<$, and $+$ to go off of, and these are no more meaningful than the symbols we started with), there is no way to determine the truth of a statement interpreted in $\mathcal{M}$. In order to determine the truth of a statement of the model, it would be necessary to build a theory $T_\mathcal{M}$ such that $T_\mathcal{M}\vdash\mathcal{I}(\varphi)$ iff $\mathcal{M}\vDash\varphi$. In this case, it would make more intuitive sense to treat $T_\mathcal{M}$ as a model of $\mathcal{T}$, and the interpretation as a homomorphism between $T_\mathcal{M}$ and $\mathcal{T}$ which preserves the relation $“\vdash"$.

The Philosophical Argument

The semantic theory of truth can be stated as follows:

For any valid statement $S$ of a formal language, "$S$" is true iff $S$.

e.g. "All swans are black" is true if and only if all swans are black.

Presumably, a statement is true in a given interpretation iff this rule holds- that is, for any statement $\varphi$, $\mathcal{M}\vDash\varphi$ iff the interpretation of $\varphi$ in $\mathcal{M}$.

Stop me here if I'm wrong.

Objection

For a statement whose referents are physical things, it is possible to produce a test which verifies that the statement is true independent of the sender and receiver (in other words, it is objectively true). In the case of "All swans are black," we can identify a particular configuration of particles as "swan," and a particular absorption spectrum as "black," then gather together all instances of "swan" in the world and stick them in a very large spectrometer. Both of these characteristics – "swan" and "black" – can be verified by measurement independently of any particular observer.

This is not the case for statements about hypothetical objects. There is no thing that we can point to and say "that is a real number." We cannot verify a statement about the real numbers (without first building a theory of real numbers) any more than we can assert that "all magic is octarine." In the best case, a statement "$S$" is true of the real numbers iff the consensus of mathematical literature says as much. In the worst case, "$S$" is an opinion.

This goes along with that age old question of whether or not mathematical truths would remain true in the absence of humans; based on the above it is very clear that they would not.

If the truth of a mathematical statement were instead considered in terms of the system [up to isomorphism] which produces that statement, then it would be possible to objectively "measure" mathematical truth through any physical implementation of that system (most likely a machine of some kind).

True in every Interpretation

As it is usually stated, a formula is valid iff it is true in every interpretation. Without placing restrictions on what an interpretation can be, it is clear that no formula is valid – just interpret a sentence in a nonsense manner and the result is not true. For example, $\varphi\lor\neg\varphi$ is not valid because I can declare a value "$C$" such that $C\land\neg C$. Because of this, I instead take "true in every interpretation" to mean "derivable from [classical] FOL alone." This way, the only valid formulas are theorems of FOL which are preserved in all first-order theories, courtesy of the logic itself. This amounts to "protecting" the logic of first-order theories against interpretation. This could probably be further clarified by introducing some form of type system to FOL (possibly derived from the grammar).

The Proposed Alternative

A first-order theory is a pair $(T,\vdash_T)$ such that [insert your choice of logical axioms and/or inference rules here]. An interpretation is a homomorphism between theories – i.e. if $\mathcal{U}=(U,\vdash_U)$ and $\mathcal{V}=(V,\vdash_V)$ are theories, and $\mathcal{I}:\mathcal{U}\to\mathcal{V}$ is an interpretation, then for any $\Sigma\subseteq U$ and $\varphi\in U$, $\Sigma\vdash_U\varphi$ iff $\mathcal{I}(\Sigma)\vdash_V\mathcal{I}(\varphi)$. We write $\mathcal{V}\vDash\mathcal{U}$ and say that $\mathcal{V}$ is a model of $\mathcal{U}$ to indicate that there is an interpretation from $\mathcal{U}$ to $\mathcal{V}$.

Thus, in the original example, $\mathcal{M}\vDash\varphi$ would follow from any proof that $\mathcal{I}$ is an interpretation or $T_\mathcal{M}\vdash\mathcal{I}(\varphi)$ (the latter being implied by the former).

A formula $\varphi$ is true in every interpretation, and we write $\vDash\varphi$ iff $\varphi$ is a formula of FOL [up to isomorphism]. This is because FOL is a subtheory of every first-order theory [up to isomorphism].

There is one possible benefit to this that doesn't appear in the more traditional explanation of models: it should be possible to construct interpretations between completely different logics/languages. This is probably extraordinarily difficult, but the possibility of building, say, a modal theory from FOL is quite exciting.

Response to Alex Kruckman

That is, the interpretations of the symbols in the language are not merely other symbols: they're honest functions and relations which we can reason about using ordinary (set-theoretic) mathematics.

Does this mean that model-theory is implicitly typed? Certainly we need to have at least two types ("string" and "object") for this to make sense?

However, in many cases we can easily determine the truth of a statement interpreted in $\mathcal{M}$… this follows from the fact that R is a field…
At the end of the day, the proof comes down to thinking about Dedekind cuts or Cauchy sequences and proving things about them in ordinary (set-theoretic) mathematics.

There's a point to made here that the set itself cannot be identified as "the real numbers," because set theory alone cannot tell the difference between "the field $\mathbb{R}$" and an arbitrary structure with the same cardinality. In order to get to the point where we can talk about real numbers, we have to build the theory of real numbers (or the theory of [totally-ordered, etc.] fields, of which $\Bbb{R}$ is the unique [up to isomorphism] example of cardinality $2^{\aleph_0}$). Implicitly, this is what we are doing when we introduce new concepts like "Dedekind cut," and "sequence" to the extant set-theoretic vocabulary.

Proving $\forall r.\exists q.q+q=r$ means building the language, creating the inference rules, and stating the axioms of $\mathbb{R}$ – without this, $\mathbb{R}$ is no more "the set of real numbers" than $2^{\aleph_0}$ is the "the set of real numbers."

This is where the idea of "homomorphisms between theories" came from. If we're going to put in all the work to build a theory so that we can define "the real numbers," why not just start with that theory. Besides, since "the set itself" does not actually exist (as explained in "The Philosophical Argument"), we can only really talk about the theory anyway.

On a side note, I'm not sure that I can assume set theory since different set theories are not generally bi-interpretable and we can still have models of the same theory in different set theories. It's probably possible as well to use a non-set-theoretic (even non-first-order) formal system to create a model. I think I've seen something like this happen in philosophical uses of model theory, where the objects of discussion are not considered in a set-theoretic (or even mathematical light).

So regardless of whether $\mathcal{M}\vDash\varphi$ is true or not, we have that $\mathcal{M}\vDash\varphi\lor\neg\varphi$. So $\varphi\lor\neg\varphi$ is valid (true in every structure).

This is what I meant by "protecting the logic against interpretation." Suppose that $T$ is a propositional theory. We can interpret an object $\varphi$ of $T$ in some non-classical logic so that it is indeterminate, in which case $\varphi\lor\neg\varphi$ will not hold. This is not the case if we insist that the logical symbols of the theory must be preserved under interpretation (again, typing helps here).

Clarification:

Noah Schweber writes in a comment:

…you seem to be mixing up "true in every mode in the specific sense of first-order logic" with "true in every possible logical system." These are absolutely not the same, and the former is quite precise and limited

While I trust that this is correct, the explanations that I am familiar with state only that an interpretation "assigns a meaning to the symbols of a formal theory," usually without qualifying the scope of "meaning." Taking this at face value, any assignment is fair game.

That being said, without some form of typing there is little to distinguish the logical from non-logical components of a theory. If the definition of "interpretation" is such that an interpretation always preserves the meaning of logical symbols, then this ought to be stated rather than assumed.

Best Answer

There's a lot going on here - some interesting ideas and some serious misconceptions! So this will be a long answer. The main question of whether we can view models as homomorphisms between theories is an insightful one. Let me start with the issues I see in your account, and then I'll try to address that question.

It seems to me that the main thing that's missing is an accurate view of the core notion of "structure"/"model" in model theory: Note that I use the word "structure" rather than "interpretation", as you do in your question. This is because I prefer to reserve the word "intepretation" for the notion of interpretation between theories. Actually, I think the relationship between interpretations (in the sense of structures) and interpretations (between theories) is really what you're asking about. More on that later.

A structure for the language $L$ is a set $A$, equipped with functions $f\colon A^n\to A$ and relations $R\subseteq A^n$ interpreting the function and relation symbols in the language $L$. That is, the interpretations of the symbols in the language are not merely other symbols: they're honest functions and relations which we can reason about using ordinary (set-theoretic) mathematics.

Aside: At this point, the skeptic sometimes shouts: "Wait just a minute! We're doing logic here. First-order logic is supposed to be prior to set theory. How come you're talking about sets?" To which the model theorist replies: "No, model theory is a subfield of mathematics just like group theory or topology, and the objects of our study are sets with extra structure, just like in those other subfields. You can use first-order reasoning in your foundations of mathematics without talking about models at all. Likewise, you can study first-order theories and their models as an ordinary mathematical pursuit without worrying about the foundations of mathematics."

Back to your question. You write:

Because "the real numbers" as presented above lack any internal structure (we have only the symbols $\mathbb{R}$, $0$, $<$, and $+$ to go off of, and these are no more meaningful than the symbols we started with)...

But the real numbers have a huge amount of internal structure! When we talk about $\mathbb{R}$ as a model for your theory $\mathcal{T}$, $\mathbb{R}$ is really the set of real numbers, i.e. the set of Dedekind cuts or equivalence classes of Cauchy sequences.

...there is no way to determine the truth of a statement interpreted in $\mathcal{M}$.

I want to be clear here that for a general $L$-structure $\mathcal{M}$ and a general $L$-sentence $\varphi$, there is a mathematically precise definition of what it means for $\varphi$ to be true in $\mathcal{M}$ (denoted $\mathcal{M}\models \varphi$). This is essentially what you describe in your section "The Philosophical Argument". And what you say is true: for a general structure $\mathcal{M}$ and a general $L$-sentence $\varphi$, there is no general algorithm by which we can determine whether $\mathcal{M}\models \varphi$ or $\mathcal{M}\not\models \varphi$.

However, in many cases we can easily determine the truth of a statement interpreted in $\mathcal{M}$. For example, consider the sentence $\varphi$ given by $\forall x\, \exists y\, f(y,y) = x$. There is no proof of $\varphi$ from your theory $T$ (as we can see by noting that there are ordered abelian groups like $(\mathbb{Z},0,+,<)$ in which $\varphi$ is not true), but letting $\mathcal{M} = (\mathbb{R},0,+,<)$, we can prove that $\mathcal{M}\models \varphi$. How? By proving that for any real number $r$, there is a real number $r/2$ such that $r/2+r/2 = r$. How? Well, this follows from the fact that $\mathbb{R}$ is a field whose of characteristic $\neq 2$, but how do we know that's true? At the end of the day, the proof comes down to thinking about Dedekind cuts or Cauchy sequences and proving things about them in ordinary (set-theoretic) mathematics.

As it is usually stated, a formula is valid iff it is true in every interpretation. Without placing restrictions on what an interpretation can be, it is clear that no formula is valid - just interpret a sentence in a nonsense manner and the result is not true. For example, $\varphi\lor\lnot\varphi$ is not valid because I can declare a value "$C$" such that $C\land \lnot C$.

This is just wrong. As I mentioned above, "true in an intepretation (structure)" has a precise mathematical meaning, according to which many sentences are valid. Using your example, let $\varphi$ be any sentence, and let $\mathcal{M}$ be any structure. The meaning of "$\mathcal{M}\models \varphi\lor \lnot \varphi$" is "$\mathcal{M}\models \varphi$ or $\mathcal{M}\models\lnot \varphi$", which means "$\mathcal{M}\models \varphi$ or it is not true that $\mathcal{M}\models \varphi$". So regardless of whether $\mathcal{M}\models \varphi$ is true or not, we have that $\mathcal{M}\models \varphi\lor \lnot \varphi$. So $\varphi\lor\lnot \varphi$ is valid (true in every structure).

Because of this, I instead take "true in every interpretation" to mean "derivable from [classical] FOL alone"...

Well, it turns out that "true in every interpretation (structure)" is equivalent to "derivable from classical FOL alone", but this is a (hard!) theorem (the completeness theorem).

...This amounts to "protecting" the logic of first-order theories against interpretation.

If you like, the completeness theorem says that first-order theories don't need your "protection". You can search for formal proofs from $\mathcal{T}$, or you can reason in ordinary mathematics about all models of $\mathcal{T}$, and you'll come up with exactly the same set of logical consequences of $\mathcal{T}$! This is the beautiful thing about the completeness theorem: if we want to prove theorems about groups, we don't need to tie our hands behind our backs and allow only formal first-order proofs in the language of groups. We can use all of the tools of ordinary mathematics, freely considering subgroups and homomorphisms between different groups, using results from number theory, etc. etc.


Now what about homomorphisms between theories?

First, the correct notion of "homomorphism between theories" (even in different languages) is that of interpretation (between theories). The best place to read about this is in Model Theory by Hodges, Section 5.3 (or A Shorter Model Theory, Section 4.3). Basically, to interpret an $L$-theory $T$ in an $L'$-theory $T'$, you give an $L'$-formula $\varphi_D$ (the domain) and an $L'$-formula $\varphi_=$ which defines an equivalence relation on the domain, and for every symbol in the language $L$, you give an $L'$-formula which defines an element or function or relation of the appropriate arity on the equivalence classes in the domain, such that the translations of all sentences in $T$ are provable in $T'$.

Now a model of $T$ is not quite the same thing as an interpretation of $T$ in ordinary mathematics (which we can take to be ZFC set theory). There are two key differences.

First, given an interpretation of $T$ in ZFC, the domain of the interpretation is not a set, but rather a formula in the language of set theory, which defines a class, possibly a proper class. In model theory, we usually require a model to be a set, rather than a proper class, because we want to be able to do set-theoretic constructions with our models.

Second, remember that the domain of a model and the interpretations of symbols in the model are truly arbitrary - there's no requirement that they be definable by formulas in the language of set theory. Indeed, if $T$ is a theory with infinite models, then $T$ has a properly class of models, by Löwenheim-Skolem. But there are only countably many formulas in the language of set theory, so only countably many possible interpretations of $T$ in ZFC (if the language of $T$ is finite).

You can also look at the difference like this: Suppose $T$ is a theory and $T'$ is a set theory. If $T'$ proves that $T$ has a model, then $T'$ proves the consistency of $T$ (assuming that $T'$ can talk about $T$ and the notion of consistency and can prove the soundness of first-order logic - any reasonable set theory can do these things). On the other hand, if there is an interpretation of $T$ in $T'$, then we can prove (at the meta level) that if $T'$ is consistent, then $T$ is consistent. So there are different levels going on here...

If you want the traditional notion of "model of $T$" to line up exactly with "interpretation of $T$ in $T'$" for some theory $T'$, you can take $T'$ to be the full theory of the universe of sets. By this I mean the multi-sorted first-order theory in the (proper class sized) language with a sort $S_X$ for each set $X$, a constant symbol $c_x$ of sort $X$ for each element $x\in X$, a function symbol of type $\prod_{i=1}^n S_{X_i}\to S_Y$ for each function $\prod_{i=1}^n X_i\to Y$, and a relation symbol $R_A$ of type $\prod_{i=1}^n S_{X_i}$ for each relation $A\subseteq \prod_{i=1}^n X_i$. The axioms of $T'$ are all first-order sentences in this language which are true in the universe of sets. The multi-sortedness ensures that all definable sets relative to $T'$ correspond to actual sets, and because absolutely everything is in the language, we can interpret the symbols in $T$ as totally arbitrary functions and relations, just as we can in the usual definition of "structure". It shouldn't be surprising that we need a proper class sized language to handle the flexibility of the notion of "structure" - as I noted above, there is a proper class of structures for any fixed language $L$, but given a set sized language $L'$, there can only be a set of interpretations of the language $L$ in the language $L'$.

These ideas have their ultimate expression in categorical logic. From the categorical logic point of view, every theory $T$ is viewed a category $C_T$ equipped with certain kinds of structure, corresponding to the logical operations. e.g. formulas are objects and conjunction $\land$ might correspond to categorical product. Then a model of a theory is a structure-preserving functor from $C_T$ to some other category $D$ with the same kind of structure. (To recover traditional models, we take $D$ to be the category of sets.) And an interpretation between theories $T$ and $T'$ is a structure-preserving functor $C_T \to C_{T'}$. Now interpretations and models are exactly the same kind of thing: structure-preserving functors. And pushing this further, every category with logical structure can be viewed as a theory, e.g. the semantic category $D$ in which our models live is just $C_{T''}$ for some other theory $T''$, and the distinction between models and interpretations disappears entirely. (If you do this with first-order logic and $D = $ the category of sets, you get $T'' = $ the full theory of the universe of sets, from the last bullet point.

This question and its answers are relevant.

Finally, at the end of your question, you mention interpretations between different logics. This is also a well-known idea (see e.g. the double negation translation of classical logic into intuitionistic logic). But it lives at a different level than interpretations between first-order theories. In intepretations between theories, we explain how to translate the non-logical symbols from one language into formulas in another, while preserving the meanings of the logical connectives. In translations between logics, we translate the logical connectives of one logic into formulas in another.

Related Question