Tarski's theorem is about the undefinabilty of a truth predicate in a model.
To take a concrete example, let $\mathfrak{N}$ be the standard model for $\mathbf{PA}$. Then a relation $R(x_1,...,x_k)$ is definable in $\mathfrak{N}$ if there is a (first order) formula $\varphi(x_1,...,x_k)$ in $L_0$ such that, for all $n_1,..., n_k \in \mathbb{N}$,
$$ R(n_1,...,n_k) \iff \mathfrak{N} \vDash \varphi(\bar{n}_1,...,\bar{n}_k).$$
Then Tarski's Theorem says that there is no formula $TR$ in $L_0$ such that
$$\text{$\varphi$ is true} \iff \mathfrak{N} \vDash TR(\ulcorner\varphi\urcorner).$$
This is a semantic notion: for models. But it follows from the impossibility of a formula $TR$ such that $\mathbf{PA} \vdash \varphi \leftrightarrow TR(\ulcorner\varphi\urcorner)$ by the Diagonal Lemma:
Diagonal Lemma (for PA) If $\psi(x)$ is a formula of $L_0$ with $x$ free, then there is a sentence $\varphi$ in $L_0$ such that
$$\mathbf{PA} \vdash \varphi \leftrightarrow \psi(\ulcorner\varphi\urcorner).$$
We can use $\lnot TR(x)$ instead of $\psi(x)$ above to get a sentence $\lambda_{\mathbf{PA}}$ (for liar!) such that
$$\mathbf{PA} \vdash \lambda_{\mathbf{PA}} \leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner).$$
But $TR$ is supposed to satisfy
$$\mathbf{PA} \vdash \lambda_{\mathbf{PA}} \leftrightarrow TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner),$$
which clearly yields a contradiction with the statement above.
Hence, assuming $\mathbf{PA}$ is consistent, there is no formula $TR$ in $L_0$ such that $\mathbf{PA} \vdash \varphi \leftrightarrow TR(\ulcorner\varphi\urcorner)$.
Furthermore, if $\mathbf{PA}$ is sound, it follows from $\mathbf{PA} \vdash \lambda_{\mathbf{PA}} \leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner)$ that
$$ \mathfrak{N} \vDash \lambda_{\mathbf{PA}}\leftrightarrow \lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner),$$
and therefore
$$\mathfrak{N}\vDash\lambda_{\mathbf{PA}} \iff \mathfrak{N}\vDash\lnot TR(\ulcorner\lambda_{\mathbf{PA}}\urcorner),$$
so clearly $TR$ does not define "truth" in $\mathfrak{N}$!
The Diagonal Lemma relies on the fact that $\mathbf{PA}$ is "strong enough" to represent all computable relations (this works already for $\mathbf{Q}$, and $\mathbf{Q}\subset\mathbf{PA}$).
Addendum Truth and Proof
I'd like to further clarify your question of why this $TR$ predicate in $\mathbf{PA}$ is about truth and not provability.
In formal languages, sentences are considered just collections of symbols, they don't mean anything in themselves until given an interpretation (semantics). We can also stipulate rules to infer sentences from other sentences purely formally or syntactically (proof theory).
Now we have a property of sentences, "truth", which we want to capture in our formal language, $L_0$ through a sentence, $TR$. We then expect this sentence to "behave" properly. But we can ask this question in two ways:
Semantics. Does the interpretation (the models) respond correctly to this predicate? In this case we want definability:
$$\text{$\varphi$ is true} \iff \mathfrak{N} \vDash TR(\ulcorner\varphi\urcorner).$$
Proof Theory. Does the formal system (with axioms and rules) respond correctly to this predicate? This is representability and is slightly more involved:
- If $\varphi$ is true, then $\mathbf{PA} \vdash TR(\ulcorner\varphi\urcorner).$
- If $\varphi$ is false, then $\mathbf{PA} \vdash \lnot TR(\ulcorner\varphi\urcorner).$
In both cases, the predicate $TR$ is supposed to capture when a sentence is "true", but it does so in different contexts (can the model tell? can we prove it formally?).
Since $\mathbf{PA}$ is a formal system, it doesn't make sense to say "$\varphi$ is true in $\mathbf{PA}$" unless you have in mind a model; $\mathbf{PA}$ is just a collection of axioms (or axioms plus its derivable consequences, depending on the definition).
We could ask the same thing about a provability predicate $PRV$: does the model recognise the predicate appropriately? Does $\mathbf{PA}$ recognise the predicate appropriately? Gödel's Incompleteness Theorems rely on the fact that $PRV$ can be correctly represented in $\mathbf{PA}$ (assuming consistency):
- If $\mathbf{PA} \vdash \varphi$, then $\mathbf{PA} \vdash PRV(\ulcorner\varphi\urcorner)$.
- If $\mathbf{PA} \not\vdash \varphi$, then $\mathbf{PA} \vdash \lnot PRV(\ulcorner\varphi\urcorner)$.
Addendum II Negations inside and outside the system
There are two notions of "negation":
- $\mathfrak{N}\vDash\lnot\varphi$: the sentence $\lnot\varphi$ is true is the model;
- $\mathfrak{N}\not\vDash\varphi$: the sentence $\varphi$ is false in the model.
Naturally, you want the two to be equivalent, so we stipulate:
$$\mathfrak{N}\vDash\lnot\varphi \iff \mathfrak{N}\not\vDash\varphi$$.
This is fine because models are supposed to be complete: they give an interpretation to every sentence of the language, so either $\mathfrak{N}\vDash\varphi$ or $\mathfrak{N}\not\vDash\varphi$; hence $\mathfrak{N}\vDash\varphi$ or $\mathfrak{N}\vDash\lnot\varphi$.
But this is different in a formal system, say $\mathbf{PA}$. Compare:
- $\mathbf{PA}\not\vdash\varphi$: $\mathbf{PA}$ doesn't prove $\varphi$;
- $\mathbf{PA}\vdash\lnot\varphi$: $\mathbf{PA}$ proves $\lnot\varphi$.
As I suppose you know from Gödel's First Incompleteness Theorem, $\mathbf{PA}$ is incomplete: there is a sentence $\gamma_\mathbf{PA}$ such that $\mathbf{PA}\not\vdash\gamma_\mathbf{PA}$ and $\mathbf{PA}\not\vdash\lnot\gamma_\mathbf{PA}$, so it is certainly not the case that
$$\mathbf{PA}\vdash\lnot\varphi \iff \mathbf{PA}\not\vdash\varphi.$$
This is why, when we define representability (say of $PRV$) in $\mathbf{PA}$ we need two clauses (1) and (2) at the end of the previous addendum. Merely stipulating
$$\mathbf{PA}\vdash\varphi \iff \mathbf{PA} \vdash PRV(\ulcorner\varphi\urcorner)$$
would be too weak: $\mathbf{PA}\not\vdash\varphi$ would only imply $\mathbf{PA}\not\vdash PRV(\ulcorner\varphi\urcorner)$ ("if we can't prove $\varphi$, then we can't prove that it can be proved"), which is not the same as $\mathbf{PA} \vdash \lnot PRV(\ulcorner\varphi\urcorner)$ ("if we can't prove $\varphi$, then we can prove that it can't be proved").
Best Answer
No. If the set of theorems of $\sf PA$ were provably definable by the formula $\phi(x),$ then we could use the diagonalization lemma to find a sentence $S$ such that $ \mathsf{PA}\vdash S\leftrightarrow \lnot \phi(\ulcorner S\urcorner).$ But then we have $$\ulcorner S\urcorner \in \operatorname{Th}(\mathsf{PA}) \iff \mathsf{PA}\vdash S \iff \mathsf{PA}\vdash \lnot\phi(\ulcorner S\urcorner) \iff \ulcorner S\urcorner \notin \operatorname{Th}(\mathsf{PA}).$$
The example you give just means that $\sf PA$ can prove all positive instances of membership in $\operatorname{Th}(\sf PA) $, not necessarily the negative ones.
In fact, the provably definable sets are exactly the computable sets, whereas $\operatorname{Th}(\sf PA)$ is a set that is computably enumerable but not computable.