The issue is already present in step 1. Consistency is not a strong enough assumption on the theory to guarantee that the provability relation is representable.
For example, if $T$ is taken to be $\text{PA} + \lnot\text{Con}(\text{PA})$, which is not $\omega$-consistent, then we have $T \vdash \text{Pvbl}_T(\phi)$ for all $\phi$. But $T$ is consistent, so for many $\phi$ we also have $T \not\vdash \phi$. Thus we do not have the whole scheme $T \vdash \phi \Leftrightarrow T \vdash \text{Pvbl}_T(\phi)$, as claimed by (1).
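To spell out why $T \vdash \text{Pvbl}_T(\phi)$ for every $\phi$, here is a sketch of the derivation. It assumes that $\text{Pvbl}$ is the standard provability predicate (so the usual derivability conditions are available) and that PA itself is consistent. The corner brackets $\ulcorner\cdot\urcorner$ denote Gödel numbering, which the rest of this answer leaves implicit.

$$
\begin{aligned}
&T \vdash \text{Pvbl}_{\text{PA}}(\ulcorner 0=1 \urcorner) && \text{(this is the axiom } \lnot\text{Con}(\text{PA})\text{)}\\
&T \vdash \text{Pvbl}_{\text{PA}}(\ulcorner 0=1 \urcorner) \rightarrow \text{Pvbl}_T(\ulcorner 0=1 \urcorner) && \text{(every PA-derivation is a } T\text{-derivation)}\\
&T \vdash \text{Pvbl}_T(\ulcorner 0=1 \urcorner) \rightarrow \text{Pvbl}_T(\ulcorner \phi \urcorner) && \text{(formalizing } 0=1 \rightarrow \phi\text{)}\\
&T \vdash \text{Pvbl}_T(\ulcorner \phi \urcorner) && \text{(modus ponens, twice)}
\end{aligned}
$$

On the other hand, the second incompleteness theorem gives $\text{PA} \not\vdash \text{Con}(\text{PA})$, so $T$ is consistent and $T \not\vdash 0=1$. Taking $\phi$ to be $0=1$ therefore already breaks the right-to-left direction of (1).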
The source of confusion may be that you seem to be quantifying over predicates when you write "a predicate $PRV_F$ exists s.t. $F \vdash A \Leftrightarrow F \vdash PRV_F(A)$". In fact we construct a specific predicate $\text{Pvbl}$ in order to derive scheme (1) from the question.
In order to show that $T \vdash A$ implies $T \vdash \text{Pvbl}(A)$, which is part of showing that $\text{Pvbl}$ does represent the provability relation, we rely on $\text{Pvbl}$ being the actual provability predicate, so that we can transform any derivation of $A$ into a derivation of $\text{Pvbl}(A)$.
Similarly, in showing that $T \vdash \text{Pvbl}(A)$ implies $T \vdash A$, we use the $\omega$-consistency of $T$ to know that, if $T \vdash \text{Pvbl}(A)$ then there is actually a natural number $n$ that codes a derivation of $A$ from $T$.
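The role of $\omega$-consistency in this second direction can be sketched as follows. Write $\text{Pvbl}(A)$ as $\exists n\, \text{Prf}(n, A)$, where $\text{Prf}(n, A)$ is a name assumed here for the formula expressing "$n$ codes a derivation of $A$ from $T$", and assume, as in the standard construction, that $T$ proves or refutes each numerical instance of it correctly. Suppose $T \vdash \text{Pvbl}(A)$ but no natural number codes a derivation of $A$. Then

$$
T \vdash \exists n\, \text{Prf}(n, A) \qquad \text{and} \qquad T \vdash \lnot\text{Prf}(\bar{k}, A) \ \text{ for each } k = 0, 1, 2, \ldots
$$

This is exactly a failure of $\omega$-consistency. Plain consistency does not rule this situation out, because no single finite derivation in $T$ uses all of the infinitely many statements $\lnot\text{Prf}(\bar{k}, A)$ at once.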
You may be thinking of general results that show that certain theories are able to represent every computable set, or weakly represent every r.e. set. It is true that the representability of the provability relation follows from these general results. But the proofs of these general results also assume that the theory is $\omega$-consistent, or at least $1$-consistent, not just that the theory is consistent.
When we think about theories like ZFC or PA, we often view them foundationally: in particular, we often suppose that they are true. Truth is very strong. Although it's difficult to say exactly what it means for ZFC to be "true" (on the face of it we have to commit to the actual existence of a universe of sets!), some consequences of being true are easy to figure out: true things are consistent, and - since their consistency is true - don't prove that they are inconsistent.
However, this makes things like PA + $\neg$Con(PA) seem mysterious. So how are we to understand these?
The key is to remember that - assuming we work in some appropriate meta-theory - a theory is to be thought of as its class of models. A theory is consistent iff it has a model. So when we say PA + $\neg$Con(PA) is consistent, what we mean is that there are ordered semirings (= models of PA without induction) with some very strong properties.
One of these strong properties is the induction scheme, which can be rephrased model-theoretically as saying that these ordered semirings have no definable proper cuts.
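For concreteness, here is a sketch of why the two formulations agree, assuming "definable" allows parameters (to match an induction scheme with parameters). A cut of a model $M$ is a nonempty initial segment $I \subseteq M$ closed under successor; it is proper if $I \neq M$. If a formula $\phi$ defines a proper cut, then

$$
M \models \phi(0) \,\land\, \forall x\,\bigl(\phi(x) \rightarrow \phi(x+1)\bigr) \,\land\, \lnot\forall x\, \phi(x),
$$

so the induction axiom for $\phi$ fails in $M$. Conversely, if the induction axiom fails for some $\phi$, then the formula

$$
\psi(x) \;:=\; \forall y \le x\; \phi(y)
$$

defines a set that contains $0$, is closed downward and under successor, and omits some element of $M$, that is, a proper definable cut.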
It's very useful down the road to get a good feel for nonstandard models of PA as structures in their own right as opposed to "incorrect" interpretations of the theory; Kaye's book is a very good source here.
The other is that they satisfy $\neg$Con(PA). This one seems mysterious since we think of $\neg$Con(PA) as asserting a fact on the meta-level. However, remember that the whole point of Gödel's incompleteness theorem in this context is that we can write down a sentence in the language of arithmetic which we externally prove is true iff PA is inconsistent. After Gödel, the MRDP theorem showed that we may take this sentence to be of the form "$\mathcal{E}$ has a solution" where $\mathcal{E}$ is a specific Diophantine equation. So $\neg$Con(PA) just means that a certain algebraic behavior occurs.
So models of PA+$\neg$Con(PA) are just ordered semirings with some interesting properties - they have no proper definable cuts, and they have solutions to some Diophantine equations which don't have solutions in $\mathbb{N}$. This demystifies them a lot!
So now let's return to the meaning of the arithmetic sentence we call "$\neg$Con(PA)." In the metatheory, we have some object we call "$\mathbb{N}$" and we prove:
If $T$ is a recursively axiomatizable theory, then $T$ is consistent iff $\mathbb{N}\models$ "$\mathcal{E}_T$ has no solutions."
(Here $\mathcal{E}_T$ is the analogue of $\mathcal{E}$ for $T$; remember that by the MRDP theorem, we're expressing "Con($T$)" as "$\mathcal{E}_T$ has no solutions" for simplicity.) Note that this claim is specific to $\mathbb{N}$: other ordered semirings, even nice ones, need not work in place of $\mathbb{N}$. In particular, there will be lots of ordered semirings which our metatheory proves satisfy PA, but for which the claim analogous to the one above fails.
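Written out for $T = \text{PA}$, the contrast looks like this. It is a sketch in which $\bar{x}$ stands for the tuple of unknowns, $\mathcal{E}_{\text{PA}}(\bar{x})$ abbreviates "the equation holds at $\bar{x}$", and PA is assumed consistent:

$$
\mathbb{N} \models \lnot\exists \bar{x}\; \mathcal{E}_{\text{PA}}(\bar{x}), \qquad\text{while}\qquad M \models \exists \bar{x}\; \mathcal{E}_{\text{PA}}(\bar{x}) \ \text{ for every } M \models \text{PA} + \lnot\text{Con}(\text{PA}).
$$

Any solution in such an $M$ must involve a nonstandard element: a tuple of standard numbers satisfying the equation in $M$ would satisfy it in $\mathbb{N}$ as well, since the two structures agree on polynomial identities between standard numbers.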
It's worth thinking of an analogous situation in non-foundationally-flavored mathematics. Take a topological space $T$, and let $\pi_1(T)$ and $H_1(T)$ be the fundamental group and the first homology group (with coefficients in $\mathbb{Z}$, say) respectively. Don't pay too much attention to what these are; the point is just that they're both groups coding the behavior of $T$ which are closely related in many ways. I'm thinking of $\pi_1(T)$ as the analogue of $\mathbb{N}$ and $H_1(T)$ as the analogue of a nonstandard model satisfying $\neg$Con(PA).
Now, the statement "$\pi_1(T)$ is abelian" (here, my analogue of $\neg$Con(PA)) tells us a lot about $T$ (take my word for it). But the statement "$H_1(T)$ is abelian" does not tell us the same things (actually it tells us nothing: $H_1(T)$ is always abelian :P).
We have a group $G$, and some other group $H$ similar to $G$ in lots of ways, and a property $p$; and if $G$ has $p$, we learn something, but if $H$ has $p$ we don't learn that thing. This is exactly what's going on here. It's not the property by itself that carries any meaning, it's the statement that the property holds of a specific object that carries meaning useful to us. We often conflate these two, since there's a clear notion of "truth" for arithmetic sentences, but thinking about it in these terms should demystify theories like PA+$\neg$Con(PA) a bit.
You're forgetting the historical context of the theorem.
Prior to Gödel's work, there was no reason to expect "$T\not\vdash\neg \text{Con}(T)$" to even make sense. By contrast, $\omega$-consistency makes perfect sense for a theory treating natural numbers. So as a hypothesis, even though it is stronger mathematically, it is weaker pedagogically in the sense that it takes much less thought to motivate it. "Every [...] $\omega$-consistent theory is incomplete" is much more easily communicable, to a circa-$1931$ audience, than "Every [...] theory not proving its own inconsistency is incomplete."
If Gödel numbering had been introduced significantly prior to the proof of the first incompleteness theorem, then "$T\not\vdash\neg \text{Con}(T)$" might have made sense as a hypothesis. But that's not what happened.