Trying to understand the difference between metatheory and theory and circularity

formal-proofslogicset-theorysoft-question

First off I just want to say that I understand that a model is not the same as the thing it models. I've already read several answers on this topic so I am looking for a new answer to hopefully address my specific understanding of this, which I am trying to write below after having read many threads.

Is it accurate to say that when we write down the rules for a propositional logic system, or a first order logic system, or a set theoretic system, we are defining these formal systems as the "theory" and our ability to analyze them from the outside is done as "metatheory"? These formal systems merely models of human intuition and mathematical reasoning that we've used informally for years?

For example if I am using ZFC axioms to prove something, that's "using the theory," but picking apart the axioms themselves, analyzing them using whatever methodologies I want, that's "metatheory"?

And it's not considered circular if, for instance, in the process of proving or analyzing something about a system, I invoke a similar system in the metatheory? This question is hard for me to write so maybe I will use an example: when proving the deduction theorem in propositional logic, we are writing a proof about proofs, and if you look at the proof it talks about sets, induction, cases, and so on, even though these things are usually formalized and defined in "higher" frameworks which gives off an impression of circularity.

But is the general idea that in metatheory, everything is fair game and not considered "connected" to anything in the theory insofar as circularity is concerned? I hope that question makes sense. For instance if I want to prove something about set theory and I appeal to something that relies on some notion of sets or collections, that's not circular?

Best Answer

Most of the time in mathematics we don't study logic; we study other things like real numbers, groups or differential equations. When we're studying these things we don't often stop to worry about whether our methods are properly justified. We simply use the methods of everyday mathematics to define objects and prove facts about them.

I suggest you think about logic in the same way.

For example, the theory of propositional logic, $\sf P$, is just a particular mathematical object. To define it we take any $10$ element set, and use its elements to represent the symbols "$($", "$)$", "$\implies$", "$\vee$", "$\wedge$", "$¬$", "$\bot$", "$\top$", "$p$" and "$'$" (we use "$p$", "$p'$", "$p''$"... as variable names so that we need only finitely many symbols). Then we define a "string" to be any finite ordered list of elements of that set. Then we define the set of "formulas", $\mathcal L$, to be a certain subset of the set of strings (those that make grammatical sense according to a particular definition that we give). Then we define "$\vdash_\sf P$" to be a particular relation between the powerset of $\mathcal L$ and $\mathcal L$. Specifically we say that $\vdash_\sf P\subseteq\mathcal P(\mathcal L)\times\mathcal L$ such that $S\vdash_{\sf P}\phi$ if and only if there exists a "formal proof" of $\phi$ from $S$, where we define "formal proof from $S$" to mean a finite list of elements of $\mathcal L$ such that each is either one of a specified set of "axioms", or is an element of $S$, or can be produced from two previous elements of the formal proof via an operation called "modus ponens". The theory $\sf P$ is then defined as the ordered pair $(\mathcal L,\vdash_\sf P)$.

So $\sf P$ is a particular mathematical object just like $2.932$ or $A_4$ or $\{f:\Bbb R\to\Bbb R|f'=f\}$. We can use mathematics to study it and say various things about it. For example we can prove that $S\vdash_\sf P(\phi\implies\psi)$ if and only if $S\cup\{\phi\}\vdash_\sf P\psi$, a result we call the Deduction Theorem.

Likewise we can define theories like $\sf PA$ or $\sf ZFC$ in a very similar way, using just a more complicated set of symbols and formulas. For each of these theories we can also define what it means to be a "model" of that theory, and we can prove some things about models.

We can do all of this without having to mention any connection between the theories we are studying and the methods we are using to study them. When we're proving theorems about $\sf P$ we just talk about combining strings and splitting them up in various ways, just like we might prove a theorem about groups by talking about multiplying and inverting elements in various ways. In particular the "formal proofs" we used in the definition of $\sf P$ are just mathematical objects, and are therefore different from the actual proofs that we write in journals using paper and ink (although of course we chose the definition of "formal proofs" so that they would resemble actual proofs).

So then the word "theory" has a precise mathematical definition. For a given language $\mathcal L$ and a given set of deduction rules, a theory is just an ordered pair $(\mathcal L,\vdash)$ such that $\vdash$ is a subset of $\mathcal P(\mathcal L)\times\mathcal L$ which is closed under the deduction rules. When we want to be unambiguous these are called "formal theories", to remind us that they are mathematical objects.

That just leaves the word "metatheory".

The metatheory is the "everyday mathematics" that you're using to study theories like $\sf P$. It's also what you use in all other areas of mathematics like group theory or analysis. The metatheory is different from $\sf ZFC$. The theory $\sf ZFC$ is a mathematical object created with the purpose of simulating the metatheory, so that everything that mathematicians do should correspond to something in $\sf ZFC$. But a theory is a mathematical object, whereas the metatheory is the real-world process that mathematicians use. So the relation between $\sf ZFC$ and the metatheory is like that between $S^2$ and a basketball. The first is a mathematical object that approximates the second.

Is it accurate to say that when we write down the rules for a propositional logic system, or a first order logic system, or a set theoretic system, we are defining these formal systems as the "theory" and our ability to analyze them from the outside is done as "metatheory"?

Yes.

These formal systems merely models of human intuition and mathematical reasoning that we've used informally for years?

Yes.

For example if I am using ZFC axioms to prove something, that's "using the theory," but picking apart the axioms themselves, analyzing them using whatever methodologies I want, that's "metatheory"?

Lets take the statement "all groups are nonempty" as an example. There's a formula $\psi$ in the language of $\sf ZFC$ that is intended to encode this sentence. If you prove that all groups are nonempty then you are using the metatheory. If you prove that $\emptyset\vdash_\sf{ZFC}\psi$ then you are using the metatheory to prove something about the theory. If you explicitly construct a formal proof of $\psi$ (i.e. you write down a bunch of symbols according to the specified rules such that the last line is exactly $\psi$) then you are, in some sense, using the theory $\sf ZFC$. I say "in some sense" because $\sf ZFC$ is actually a mathematical object and so can't actually be "used" at all; a more accurate description of what you've done is to prove that all groups are nonempty in a different metatheory that more closely resembles $\sf ZFC$ than does our current metatheory.

And it's not considered circular if, for instance, in the process of proving or analyzing something about a system, I invoke a similar system in the metatheory?

Giving a rigorous justification of the metatheory is an unsolved problem of a philosophical nature. It seems like an impossible paradox, since in order to give a justification of the metatheory we would have to be working in some already-existing framework, which would itself lack justification. Mathematical logic might give some insight into this paradox, but so far it hasn't resolved it.

But if we just take the metatheory as a given then we can use it to study theories in the manner I've described above. Of course we are on shaky philosophical ground if we do this, since the metatheory has no justification. But the ground we are on is no more shaky than it is when we use the same metatheory to study real numbers, groups or differential equations.

If we did try to found the metatheory on a theory then that would create a circularity, and so you can't justify the metatheory in this way. But you can study logic without trying to justify the metatheory.

But is the general idea that in metatheory, everything is fair game and not considered "connected" to anything in the theory insofar as circularity is concerned?

Right, you can use things in the metatheory that haven't been justified in the theory. As I've said, this is on philosophically shaky ground. But without doing this we wouldn't be able to get anywhere.

For instance if I want to prove something about set theory and I appeal to something that relies on some notion of sets or collections, that's not circular?

It's fine, go ahead. It's no more unjustified than any other proof in mathematics.

Related Question