The two theories, namely the ``gradient model'' $\partial_\mu\pi \bar{\Psi}\gamma^5\gamma^\mu\Psi$ and the Yukawa model $g\pi\bar{\Psi}\gamma^5\Psi$ (both with a massive $\Psi$), are definitely not equivalent. They have different symmetries, spectrum and scattering amplitudes, hence are physically distinct theories.
The main mistake that you (the OP) are doing is using the free equations of motion for the fermions, but that's ok only for the external legs and not for the virtual ones that enter e.g. in the one-loop calculation of the $\pi$ mass, or as I'll show below in a scattering amplitude with an intermediate virtual $\psi$ exchanged. (The mistake that Cosmas Zachos was doing in his earlier answer and that in part is still doing in the marginally improved answer is explained in my comments to his answer, I will not repeat it here).
The gradient model is indeed invariant under $\pi\rightarrow \pi+const$ which clearly forbids a mass term for $\pi$. This isn't the case for the Yukawa model where a bare mass is needed to remove the quadratic divergent mass generated by the fermion loops. A physical pole mass is therefore generically non-zero, barring fine-tuning.
More importantly, Goldstone bosons (GB's) aren't just massless particles, they have various special features. For example, soft GBs (that is the limit of vanishing $\pi$-momentum) give vanishing scattering amplitudes (the so-called Adler zero condition). This is realized for the gradient theory but not for the Yukawa theory.
Let's see this in more detail looking at a physical scattering amplitude $\pi\Psi\rightarrow \pi\Psi$.
For the Yukawa theory one has
$$
M^{Yukawa}_{\pi\Psi\rightarrow\pi\Psi}=-g^2 \left[\bar{u}(p_2^\prime)\frac{i(\gamma_\alpha p_1^\alpha+\gamma_\alpha p_2^\alpha-m)}{s-m^2+i\epsilon}u(p_2)+\mbox{crossed diag.}\right]
$$
for $\pi(p_1)\Psi(p_2)\rightarrow\pi(p_1^\prime)\Psi(p_2^\prime)$. The $\gamma^5$ have been moved around and simplified with the numerator of the fermion propagator, i.e. $\gamma^5i(\gamma_\alpha p_1^\alpha+\gamma_\alpha p_2^\alpha+m)\gamma^5=-i(\gamma_\alpha p_1^\alpha+\gamma_\alpha p_2^\alpha-m)$. We could have simplified the numerator using $\gamma_\alpha p_2^\alpha u(p_2)=m u(p_2)$, where $m$ is the fermion mass, but it's more convenient this form in the following. There is an s-channel contribution, explicitly displayed, along with a crossed diagram that we do not display explicitly.
(disclaimer: I am doing this calculation by hand on an Ipad, I hope it is not grossly incorrect :-), although factors of 2 and minus signs are most likely off)
This $M^{Yukawa}_{\pi\Psi\rightarrow\pi\Psi}$ doesn't vanish for $p_1\rightarrow 0$ because, even though the numerator goes to zero (namely $\gamma_\alpha p_1^\alpha+\gamma_\alpha p_2^\alpha-m)u(p_2)=\gamma_\alpha p_1^\alpha u(p_2)\rightarrow 0$), so does the denominator at the same rate ($s-m^2=2p_{1\alpha} p_2^\alpha\rightarrow0$; here I am assuming that we have tuned the spectrum to be the same, that is the $\pi$ mass in the Yukawa model has been tuned to zero by hand, otherwise the numerator wouldn't even vanish and the comparison between the two models would make no sense).
On the other hand, for the gradient theory we get
$$
M^{gradient}_{\pi\Psi\rightarrow\pi\Psi}=\frac{1}{f^2}\left[\bar{u}(p_2^\prime)\frac{i(\gamma_\alpha p_1^\alpha+\gamma_\alpha p_2^\alpha-m)^3}{s-m^2+i\epsilon}u(p_2)+\mbox{crossed diag.}\right]\rightarrow 0
$$
which is not only different (hence the two theories are physically distinct, period) but it gives a vanishing amplitude in the GB soft limit $p_1\rightarrow 0$ since the numerator can be written as $i\bar{u}(p_2^\prime)(\gamma_\alpha p_1^\alpha+\gamma_\alpha p_2^\alpha-m)^3u(p_2)=i\bar{u}(p_2^\prime)\gamma_\alpha p_1^{\prime\alpha}(\gamma_\beta p_1^\beta+\gamma_\beta p_2^\beta-m)(\gamma_\gamma p_1^\gamma)u(p_2)$, and for momentum conservation $p_1^\prime\rightarrow 0$ too.
The takeaway message is: the two models are distinct physically and mathematically. The gradient theory describes a GB whereas the Yukawa theory describe a scalar with a mass tuned to be zero.
Extra edits
I have finally found some times to add a last remark that I mentioned in the comments but it is actually worth to report in the full answer. It is also related to the answer by @Cosmas Zachos.
Having established that the two theories are different, one may wonder how much different and what is the relation between the two, given that the simple use of the equations of motion by the OP was flawed.
The answer is very simple: the two theories differ at the non-linear $\pi$-level, starting from the quadratic order. In particular, the claim is that the theory given by
$$
\mathcal{L}_{new}=\bar{\Psi}(i\gamma^\alpha \partial_\alpha-m e^{-2i\gamma^5\pi/f})\Psi+\frac{1}{2}(\partial_\mu\pi)^2\qquad f\equiv 2m/g\,,
$$
which differs from $\mathcal{L}_{Yukawa}=\bar{\Psi}(i\gamma^\alpha \partial_\alpha-m)\Psi+ig\pi \bar{\Psi}\gamma^5\Psi+\frac{1}{2}(\partial_\mu\pi)^2$ starting from $o(\pi^2)$, is in fact equivalent to the gradient theory
$$
\mathcal{L}_{gradient}=\frac{1}{2}(\partial_\mu\pi)^2+\bar{\Psi}(i\gamma^\alpha \partial_\alpha-m)\Psi+\frac{1}{f}\partial_\mu\pi \bar{\Psi}\gamma^5\gamma^\mu\Psi\,.
$$
Indeed, it's enough to perform the field redefinition $\Psi\rightarrow e^{i\gamma^5 \pi/f}\Psi$ to move the $\pi$ from the non-derivative term to the gradient coming from the $\Psi$-kinetic term.
As ultimate check, let's see the behavior under the soft limit $p_1\rightarrow 0$. The contributions from two linear-$\pi$ vertex insertions is like in the Yukawa theory, but now there is also a contact term coming from expanding the exponential, $\frac{2m}{f^2}\pi^2\bar{\Psi}\Psi$, i.e.
$$
M^{new}_{\pi\Psi\rightarrow\pi\Psi}=M^{Yukawa}_{\pi\Psi\rightarrow\pi\Psi}+i\frac{4m}{f^2}\bar{u}(p_2^\prime)u(p_2)\,.
$$
Now, in the soft limit $p_1\rightarrow 0$, we have $p_2^\prime \rightarrow p_2$ hence the Yukawa terms give
$$
M^{Yukawa}_{p_1\rightarrow 0}=-g^2 \left[\bar{u}(p_2)\frac{i\gamma_\alpha p_1^\alpha}{2 p_1 p_2}u(p_2)+\mbox{crossed diag.}\right]=-2ig^2
$$
where I have used that $\bar{u}(p)\gamma^\alpha u(p)=2p^\alpha$. On the other hand, the new contact term in from $\mathcal{L}^{new}$ gives
$$
i\frac{4m}{f^2}\bar{u}(p_2)u(p_2)=i\frac{8m^2}{f^2}=2i g^2\,,
$$
from $\bar{u}(p)u(p)=2m$ (no sum over the two spin orientations, we are considering definite polarizations). Summing the two contributions we see that they vanish each other out, in agreement with the Adler-zero condition.
But again, the equivalence with the gradient theory is achieved only after modifying the theory at the $o(\pi^2)$ level in the way shown above, which corresponds to render the $\pi$ a GB. The Yukawa coupling alone instead is for non-GB particles.
Best Answer
For a singlet $W_\mu$, obviously yes: you are really writing the same thing, and it is not relevant whether you consider the chiral projector as part of the fermions or a feature of the chiral gauge field acting on the same fermions, and automatically excluding $R\psi$ from its couplings, as it projects those out. Everybody may understand this in connecting kinetic terms to gauge coupling terms.
The important thing, if you were referring to the Yukawa coupling of the P&S toy model is to also understand the point of their (21.18): the Higgs couples to both $R\psi$ and $L\psi$, even though the W gauge fields don't! (The B field does.) The go-to review for particle physicists is, of course, the PDG.
Because the real-world structure of the Yukawa couplings, the real heart of the SM, despite what many teacher fail to emphasize, is very unfriendly to your second point of view, one almost never uses this "unified" language (2) for the standard model (while keeping it in the back of their minds with the suitable caveats).
Switching off the g' couplings, which have a screwed up chirality, you might still unwisely choose to talk about non-chiral isodoublet fields $$ \Psi\equiv \begin{pmatrix} u\\ d\end{pmatrix}, $$ and (unwisely) organize the terms as per your implied second question.
The Yukawa-free terms you wrote only, may, indeed, be written in terms of doublets, $$ i(\overline{\Psi_L} \gamma^\mu D_\mu\Psi_L+\overline{\Psi_R}\gamma^\mu\partial_\mu\Psi_R) =i(\overline{\Psi_L}(\gamma^\mu \partial_\mu +ig\gamma^\mu {\mathbb W}_\mu) \Psi_L+\overline{\Psi_R}\gamma^\mu\partial_\mu\Psi_R)\\ =i(\overline{\Psi}(\gamma^\mu \partial_\mu +ig\gamma^\mu {\mathbb W}_\mu L) \Psi, $$ where now ${\mathbb W}_\mu=\vec W_\mu\cdot \vec \tau /2 $ are matrices acting on (left) doublets. It is evident you may attach the L projector to them, and skip chiral projectors off the fermions, and appear you have simplified things.
Far from it. You have introduced two symmetries, evident from your separate chiral pieces: a gauged $SU(2)_L$ which acts on the L fermions, and a global $SU(2)_R$, completely independent of it (their generators commute between the two groups), so you may perform rotations in each completely independently, and still have full invariance. So, indeed, you may simplify your structure by attaching a Left projector to the gauge field matrix, and drop the subscripts off the fermion doublets, which might appear simpler.
Far from it. The simplification is restricted to this sector, and hence illusory. Crucially, the $SU(2)_R$ symmetry is absent in the Yukawa sector: I defy you to try... You'll fail. This sector is what makes the SM admirable: it ultimately gives L-gauge-invariant masses to a theory without $SU(2)_R$ invariance!