Logic – How Modal Logic Differs from Ordinary Predicate Logic

logicmodal-logic

I am just beginning to study basic modal logic as described here (up to page 5 so far). My first impression — I'm sure it can't be right — is that it is just different symbols for the same concepts in ordinary predicate logic. Instead of $\forall x: P(x)$, we write $\square P$ (quantifying over a domain of discourse corresponding the set of all "possible worlds.") Instead of $\exists x: P(x)$, we write $\diamond P$. It also seems that the other logical connectors are the same as those in propositional logic. What can we do in this basic modal logic that we cannot do in predicate logic or vice versa?

EDIT: No need to reinvent the wheel. See Standard Translation (from modal to FOL) at https://en.wikipedia.org/wiki/Standard_translation

FOLLOW-UP: Using these Standard Translations, I was able to formally derive a number of "axioms" of modal logic (some said to be controversial at wiki). Theorems 1-5, make no use of any restrictions on the accessibility relation R. The remainder variously make use of reflexive, symmetry and transitive properties on R.

Best Answer

Your impression is right, but missing the point in some sense: modal logic is strictly less powerful than first-order logic, and this is one of the reasons it is so important in various contexts (especially applications of logic in computer science)! The reason is that there is a fundamental "power-versus-tameness" tradeoff implicit in any choice of logic, and we often prefer the latter to the former. Modal logic should be thought of as a particularly well-behaved fragment of first-order logic: we're often interested in decidable (or similarly nice) fragments of first-order logic in applications, and modal logic and its variants provide a wide swath of examples of such logics.

Note that this is reflected in the history of modal logic: it long predated first-order logic, and was an expansion of propositional logic by adding modal operators. After first-order logic burst on the scene, we came to understand modal logic as an intermediate logic, and that's the perspective I'm describing here since I think it matches more the perspective you're adopting.

This paper by Vardi is a useful source in this regard. In particular, the following passage from page $2$ is quite relevant:

  • "There are two main computational problems associated with modal logic. The first problem is checking if a given formula is true in a given state of a given Kripke structure. This problem is known as the model-checking problem. The second problem is checking if a given formula is true in all states of all Kripke structures. This problem is known as the validity problem. Both problems are decidable. The model-checking problem can be solved in linear time, while the validity problem is PSPACE-complete. This is rather surprising when one considers the fact that modal logic, in spite of its apparent propositional syntax, is essentially a first-order logic, since the necessity and possibility modalities quantify over the set of possible worlds, and model checking and validity for first-order logic are computationally hard problems. Furthermore, the undecidability of first-order logic is very robust. Only very restricted fragments of first-order logic are decidable ..."

Vardi goes on to talk about types of tameness, specifically focusing on two ways of generating tame fragments of first-order logic - bounding the quantifiers and limiting the number of variables - and then argues that modal logic really represents a third, and extremely robust, kind of tameness. At this point we move beyond the focus of this specific question; the point I want to make is that modal logic is not a strengthening of first-order logic, but rather quite the opposite, and that for many applications this is actually a good thing.

The last section of Chagrov and Zakharyaschev's book is also relevant, and in general I strongly recommend that book: it's quite dense, but has a huge wealth of material.

It's worth noting that the idea of looking at "tame" fragments of "wild" logics appears all over the place, e.g. the analysis of monadic second-order logic (contra full second-order logic) and the decomposition of $\mathcal{L}_{\omega_1\omega}$ into well-behaved countable sublogics. Coming from a more set- or model-theoretic background, it may seem odd at first to apply the same idea to first-order logic because of how "primal" it is, but it is in fact a very rich line of research.


To add a bit of detail, here's the translation of modal logic into first-order logic (well, for Kripke frames anyways; I'll leave generalizations as exercises):

Given a Kripke frame $\mathcal{K}=(W,\leadsto,\models_\mathcal{K})$ ($W$ = worlds, $\leadsto$ = accessibility relation, $\models_\mathcal{K}$ = valuation) in a propositional language $\Sigma=\{p_i\}_{i\in I}$, our corresponding language $\Sigma_\mathcal{K}$ consists of a unary predicate $P_i$ for each $i\in I$ and a binary relation symbol $R$, and our corresponding structure $M_\mathcal{K}$ has domain $W$, interprets $P_i$ as $\{w\in W: w\models_\mathcal{K} p_i\}$, and interprets $R$ as $\{(u,v)\in W^2: u\leadsto v\}$.

(Note that this is not quite what you've described: you've described the "local" version, where we focus on a single world in $\mathcal{K}$.)

Now, for every modal sentence $s$ in the language $\Sigma$, we get a first-order formula $\varphi_s(x)$ in one free variable saying that $s$ holds at $x$ in the sense of $\mathcal{K}$; meanwhile, the formula $\psi_s\equiv\forall x(\varphi_s(x))$ says of course that $s$ holds throughout $\mathcal{K}$. The key point here is:

There are first-order sentences not coming from modal sentences in this, or any reasonable, way.

For example, consider something like "$\forall x\exists y(R(x,y)\vee R(y,x))$." How precisely can you express this modally, in any sense?


EDIT: That said, there are aspects of modal logic which take it beyond first-order. (I previously said a bit about this in comments, but I think now it's a good idea to put it in the answer body.) In particular, we say that a frame (without a chosen valuation) validates a given modal sentence if every valuation makes that sentence true at every world. Each modal sentence $\varphi$ then defines a class of frames $V(\varphi)$. For example, $$(\Diamond\Diamond p)\iff(\Diamond p)$$ is validated exactly in the transitive frames.

Viewing (sans valuation) frames as directed graphs, we can ask whether every "modal validation" class $V(\varphi)$ is an elementary class. The answer turns out to be no, the easiest example in my opinion being the Lob axiom $$\lambda\equiv\Box(\Box (p)\rightarrow p)\rightarrow \Box(p).$$ (Proof: it's not hard to show that $V(\lambda)$ is the set of transitive, converse well-founded frames, that is, those transitive frames not admitting any infinite sequence of worlds each of which sees the next. Now use the compactness theorem.)

See also j4n bur53's answer, and this paper of Thomason. Note that when people say that a given modal sentence has no first-order equivalent, or isn't first-order expressible, they're talking about validity.

Related Question