Tarski's theorem is about the undefinability of a truth predicate in a model.
To take a concrete example, let $\mathfrak{N}$ be the standard model for $\mathbf{PA}$. Then a relation $R(x_1,...,x_k)$ is definable in $\mathfrak{N}$ if there is a (first order) formula $\varphi(x_1,...,x_k)$ in $L_0$ such that, for all $n_1,..., n_k \in \mathbb{N}$,
$$ R(n_1,...,n_k) \iff \mathfrak{N} \vDash \varphi(\bar{n}_1,...,\bar{n}_k).$$
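For a concrete instance (assuming $L_0$ has the usual symbols $0$, $S$, $+$, $\cdot$): the order relation $m < n$ is definable via the formula $\varphi(x,y) := \exists z\,(x + Sz = y)$, since for all $m, n \in \mathbb{N}$,
$$ m < n \iff \mathfrak{N} \vDash \exists z\,(\bar{m} + Sz = \bar{n}). $$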
Then Tarski's Theorem says that there is no formula $TR(x)$ in $L_0$ such that, for every sentence $\varphi$ of $L_0$,
$$\text{$\varphi$ is true} \iff \mathfrak{N} \vDash TR(\ulcorner\varphi\urcorner),$$
where "$\varphi$ is true" means $\mathfrak{N} \vDash \varphi$.
This is a semantic notion: it is about models. But there is also a syntactic version: there is no formula $TR$ such that $\mathbf{PA} \vdash \varphi \leftrightarrow TR(\ulcorner\varphi\urcorner)$ for every sentence $\varphi$. The semantic version follows from the syntactic one (via soundness, as we'll see below), and both are consequences of the Diagonal Lemma:
Diagonal Lemma (for $\mathbf{PA}$): If $\psi(x)$ is a formula of $L_0$ with $x$ free, then there is a sentence $\varphi$ of $L_0$ such that
$$\mathbf{PA} \vdash \varphi \leftrightarrow \psi(\ulcorner\varphi\urcorner).$$
Taking $\psi(x)$ to be $\lnot TR(x)$ above, we get a sentence $\lambda_{\mathbf{PA}}$ (for liar!) such that
$$\mathbf{PA} \vdash \lambda_{\mathbf{PA}} \leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner).$$
But $TR$ is supposed to satisfy
$$\mathbf{PA} \vdash \lambda_{\mathbf{PA}} \leftrightarrow TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner),$$
which contradicts the equivalence above.
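Spelled out, the contradiction is immediate:
$$\begin{aligned}
\mathbf{PA} &\vdash \lambda_{\mathbf{PA}} \leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner) && \text{(Diagonal Lemma)}\\
\mathbf{PA} &\vdash \lambda_{\mathbf{PA}} \leftrightarrow TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner) && \text{(assumption on $TR$)}\\
\mathbf{PA} &\vdash TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner) \leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner) && \text{(chaining the two)}\\
\mathbf{PA} &\vdash 0 = 1 && \text{(propositional logic)}
\end{aligned}$$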
Hence, assuming $\mathbf{PA}$ is consistent, there is no formula $TR$ in $L_0$ such that $\mathbf{PA} \vdash \varphi \leftrightarrow TR(\ulcorner\varphi\urcorner)$ for all sentences $\varphi$.
Furthermore, if $\mathbf{PA}$ is sound, it follows from $\mathbf{PA} \vdash \lambda_{\mathbf{PA}} \leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner)$ that
$$ \mathfrak{N} \vDash \lambda_{\mathbf{PA}}\leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner),$$
and therefore
$$\mathfrak{N}\vDash\lambda_{\mathbf{PA}} \iff \mathfrak{N}\vDash\lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner),$$
so clearly $TR$ does not define "truth" in $\mathfrak{N}$!
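Writing $\lambda$ for $\lambda_{\mathbf{PA}}$, suppose $TR$ did define truth in $\mathfrak{N}$. Then
$$\begin{aligned}
\mathfrak{N}\vDash\lambda &\iff \mathfrak{N}\vDash\lnot TR(\ulcorner\lambda\urcorner) && \text{(shown above)}\\
&\iff \mathfrak{N}\not\vDash TR(\ulcorner\lambda\urcorner) && \text{(bivalence of satisfaction)}\\
&\iff \text{$\lambda$ is not true} && \text{(definability of truth by $TR$)}\\
&\iff \mathfrak{N}\not\vDash\lambda,
\end{aligned}$$
a contradiction.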
The Diagonal Lemma relies on the fact that $\mathbf{PA}$ is "strong enough" to represent all computable relations (this works already for $\mathbf{Q}$, and $\mathbf{Q}\subset\mathbf{PA}$).
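For the curious, here is the standard proof sketch (treating the computable, hence representable, diagonal function $d$, which maps $\ulcorner\theta(x)\urcorner$ to $\ulcorner\theta(\ulcorner\theta(x)\urcorner)\urcorner$, as if it were a function symbol):
$$\begin{aligned}
\theta(x) &:\equiv \psi(d(x)),\\
\varphi &:\equiv \theta(\ulcorner\theta(x)\urcorner) \equiv \psi(d(\ulcorner\theta(x)\urcorner)),\\
\mathbf{PA} &\vdash d(\ulcorner\theta(x)\urcorner) = \ulcorner\varphi\urcorner, \quad\text{hence}\quad \mathbf{PA} \vdash \varphi \leftrightarrow \psi(\ulcorner\varphi\urcorner).
\end{aligned}$$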
Addendum: Truth and Proof
I'd like to further address your question of why this $TR$ predicate in $\mathbf{PA}$ is about truth and not provability.
In formal languages, sentences are just strings of symbols; they don't mean anything in themselves until given an interpretation (semantics). We can also stipulate rules to infer sentences from other sentences purely formally or syntactically (proof theory).
Now we have a property of sentences, "truth", which we want to capture in our formal language $L_0$ through a formula $TR(x)$. We then expect this formula to "behave" properly. But we can ask this question in two ways:
Semantics. Does the interpretation (the models) respond correctly to this predicate? In this case we want definability:
$$\text{$\varphi$ is true} \iff \mathfrak{N} \vDash TR(\ulcorner\varphi\urcorner).$$
Proof Theory. Does the formal system (with axioms and rules) respond correctly to this predicate? This is representability and is slightly more involved:
- If $\varphi$ is true, then $\mathbf{PA} \vdash TR(\ulcorner\varphi\urcorner).$
- If $\varphi$ is false, then $\mathbf{PA} \vdash \lnot TR(\ulcorner\varphi\urcorner).$
In both cases, the predicate $TR$ is supposed to capture when a sentence is "true", but it does so in different contexts (can the model tell? can we prove it formally?).
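As a concrete illustration (the formula is my choice for the example): $\psi(x) := \exists y\,(x = y + y)$ both defines and represents the computable property "$x$ is even", e.g.
$$\mathbf{PA} \vdash \exists y\,(\bar{4} = y + y) \quad\text{and}\quad \mathbf{PA} \vdash \lnot\exists y\,(\bar{3} = y + y).$$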
Since $\mathbf{PA}$ is a formal system, it doesn't make sense to say "$\varphi$ is true in $\mathbf{PA}$" unless you have in mind a model; $\mathbf{PA}$ is just a collection of axioms (or axioms plus its derivable consequences, depending on the definition).
We could ask the same thing about a provability predicate $PRV$: does the model recognise the predicate appropriately? Does $\mathbf{PA}$? Here there is an important subtlety. Gödel's Incompleteness Theorems rely on the fact that the proof relation "$p$ codes a $\mathbf{PA}$-proof of $\varphi$" is computable, hence representable in $\mathbf{PA}$. The provability predicate $PRV(x)$ ("there exists a proof of $x$") is only half as well behaved (assuming consistency):

1. If $\mathbf{PA} \vdash \varphi$, then $\mathbf{PA} \vdash PRV(\ulcorner\varphi\urcorner)$.
2. If $\mathbf{PA} \not\vdash \varphi$, then $\mathfrak{N} \vDash \lnot PRV(\ulcorner\varphi\urcorner)$, but in general $\mathbf{PA} \not\vdash \lnot PRV(\ulcorner\varphi\urcorner)$: provability is computably enumerable but not computable, so it is not representable.
Addendum II: Negations inside and outside the system
There are two notions of "negation":
- $\mathfrak{N}\vDash\lnot\varphi$: the sentence $\lnot\varphi$ is true in the model;
- $\mathfrak{N}\not\vDash\varphi$: the sentence $\varphi$ is false in the model.
Naturally, you want the two to be equivalent, so we stipulate:
$$\mathfrak{N}\vDash\lnot\varphi \iff \mathfrak{N}\not\vDash\varphi.$$
This is fine because models are supposed to be complete: they give a truth value to every sentence of the language, so exactly one of $\mathfrak{N}\vDash\varphi$ and $\mathfrak{N}\not\vDash\varphi$ holds; hence $\mathfrak{N}\vDash\varphi$ or $\mathfrak{N}\vDash\lnot\varphi$.
But this is different in a formal system, say $\mathbf{PA}$. Compare:
- $\mathbf{PA}\not\vdash\varphi$: $\mathbf{PA}$ doesn't prove $\varphi$;
- $\mathbf{PA}\vdash\lnot\varphi$: $\mathbf{PA}$ proves $\lnot\varphi$.
As I suppose you know from Gödel's First Incompleteness Theorem, $\mathbf{PA}$ is incomplete: there is a sentence $\gamma_\mathbf{PA}$ such that $\mathbf{PA}\not\vdash\gamma_\mathbf{PA}$ and $\mathbf{PA}\not\vdash\lnot\gamma_\mathbf{PA}$, so it is certainly not the case that
$$\mathbf{PA}\vdash\lnot\varphi \iff \mathbf{PA}\not\vdash\varphi.$$
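Instantiating the failed equivalence with $\varphi = \gamma_{\mathbf{PA}}$ makes this explicit:
$$\begin{aligned}
\mathbf{PA} &\not\vdash \gamma_{\mathbf{PA}} && \text{(the right-hand side holds)}\\
\mathbf{PA} &\not\vdash \lnot\gamma_{\mathbf{PA}} && \text{(so the left-hand side, } \mathbf{PA}\vdash\lnot\gamma_{\mathbf{PA}}\text{, fails)}
\end{aligned}$$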
This is why the definition of representability needs two clauses, one for when the relation holds and one (with $\lnot$ inside) for when it fails, as in (1) and (2) at the end of the previous addendum. Merely stipulating
$$\mathbf{PA}\vdash\varphi \iff \mathbf{PA} \vdash PRV(\ulcorner\varphi\urcorner)$$
would be too weak: $\mathbf{PA}\not\vdash\varphi$ would only imply $\mathbf{PA}\not\vdash PRV(\ulcorner\varphi\urcorner)$ ("if we can't prove $\varphi$, then we can't prove that it can be proved"), which is not the same as $\mathbf{PA} \vdash \lnot PRV(\ulcorner\varphi\urcorner)$ ("if we can't prove $\varphi$, then we can prove that it can't be proved").
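A concrete instance of this gap (assuming $\mathbf{PA}$ is consistent, and sound where noted): take $\varphi$ to be $0=1$. Then $\lnot PRV(\ulcorner 0=1\urcorner)$ is exactly the consistency statement $Con(\mathbf{PA})$, and
$$\begin{aligned}
\mathbf{PA} &\not\vdash 0=1 && \text{(consistency)}\\
\mathbf{PA} &\not\vdash PRV(\ulcorner 0=1\urcorner) && \text{(soundness: } PRV(\ulcorner 0=1\urcorner) \text{ is false)}\\
\mathbf{PA} &\not\vdash \lnot PRV(\ulcorner 0=1\urcorner) && \text{(Second Incompleteness Theorem)}
\end{aligned}$$
so we can't prove that $0=1$ is provable, but we also can't prove that it isn't.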
One instance where this holds (i.e., where $S$ proves $X\rightarrow P(X)$) is when $X$ is $\Sigma_1$ and $S$ is a bit stronger than $Q$: a sufficiently strong (consistent, recursively axiomatizable) theory of arithmetic proves that it proves every true $\Sigma_1$ sentence (PA is massive overkill). I don't recall if $Q$ has this property.
We can get a partial reversal under a correctness assumption on $S$. Suppose $T$ is an appropriately-strong theory which proves "$S$ is sound" *(this is really a scheme: $T$ proves $P(X)\rightarrow X$ for each $X$, where again $P$ is the provability predicate for $S$)*. Then for each $X$, $T$ proves "If $S$ proves $X\rightarrow P(X)$, then $X\leftrightarrow P(X)$," and $P(X)$ is a $\Sigma_1$ sentence regardless of the complexity of $X$.
- Note that $S$ itself doesn't prove its own soundness; see Löb's theorem.
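Unpacking the claim (a sketch, assuming $T$ is strong enough to verify a concrete $S$-proof, e.g. by provable $\Sigma_1$-completeness): suppose $S \vdash X\rightarrow P(X)$. Then
$$\begin{aligned}
T &\vdash P(X\rightarrow P(X)) && \text{($T$ verifies the $S$-proof)}\\
T &\vdash P(X\rightarrow P(X)) \rightarrow (X\rightarrow P(X)) && \text{(soundness scheme)}\\
T &\vdash X\rightarrow P(X) && \text{(modus ponens)}\\
T &\vdash P(X)\rightarrow X && \text{(soundness scheme)}\\
T &\vdash X\leftrightarrow P(X).
\end{aligned}$$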
Another example occurs whenever $S$ proves its own inconsistency, which can happen even if $S$ is consistent (consider PA + $\neg$Con(PA)). In this case $S$ trivially proves $X\rightarrow P(X)$ for each $X$, since $S$ proves $P(X)$ for each $X$.
- Note that this means that we can't get a general reversal to the situation above.
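Spelled out for $S = $ PA + $\neg$Con(PA) (a sketch; here $P$ is the provability predicate for $S$):
$$\begin{aligned}
S &\vdash \neg\text{Con}(\text{PA}) && \text{(axiom)}\\
S &\vdash \neg\text{Con}(\text{PA}) \rightarrow \neg\text{Con}(S) && \text{(provably, since $S$ extends PA)}\\
S &\vdash P(0=1) && \text{(i.e. } \neg\text{Con}(S))\\
S &\vdash P(0=1) \rightarrow P(X) && \text{(formalized ex falso, for each $X$)}\\
S &\vdash P(X), \quad\text{hence}\quad S \vdash X\rightarrow P(X).
\end{aligned}$$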
Meanwhile, when you write

> this, as I understand it, is not the same as $(X\rightarrow P(X))$ being provable in $S$,
you're absolutely right. Let's assume PA is sound. We know that the $\Sigma_2$-theory of arithmetic is not recursively enumerable, but the set of PA-provable $\Sigma_2$-sentences is; thus, since by soundness no false $\Sigma_2$-sentence is PA-provable, there must be a true $\Sigma_2$-sentence which is not PA-provable. Letting $X$ be such a sentence, PA can't prove $X\rightarrow P(X)$: otherwise, by soundness, $X\rightarrow P(X)$ would be true, and since $X$ is true, $P(X)$ would be true as well, i.e. PA would prove $X$ after all.
Best Answer
There's no need to dive into the proof: this can be directly reduced to a single application of Löb's theorem.
From $P(A)\rightarrow B$ and $P(B)\rightarrow A$, we get $P(A)\wedge P(B)\rightarrow A\wedge B$. But trivially we have $P(A)\wedge P(B)\leftrightarrow P(A\wedge B)$, so in fact from $P(A)\rightarrow B$ and $P(B)\rightarrow A$ we can conclude $$P(A\wedge B)\rightarrow A\wedge B.$$ Now apply Lob's theorem to $A\wedge B$.
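Recall Löb's theorem: if the theory proves $P(C)\rightarrow C$, then it proves $C$. With $C = A\wedge B$ the argument concludes:
$$\begin{aligned}
&\vdash P(A\wedge B)\rightarrow A\wedge B && \text{(derived above)}\\
&\vdash A\wedge B && \text{(Löb's theorem)}\\
&\vdash A \quad\text{and}\quad \vdash B.
\end{aligned}$$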
In fact, even that's written more inefficiently than it needs to be: just observe that $P(A\wedge B)\rightarrow P(A)\wedge P(B)$, which under the hypotheses gives $B\wedge A$. But the "backwards" way I've written it above might be more helpful for seeing how we whipped it up: look for the largest proposition whose provability triggers every hypothesis, and then see if that lets us "catch our tail."