On page 160 of Peskin & Schroeder, we consider an amplitude $\mathcal{M}(k)$ with an external photon, as given in equation (5.77):
$$
\sum_{\epsilon}|\epsilon_\mu^*(k)\mathcal{M}^\mu(k)|^2=|\mathcal{M}^1(k)|^2+|\mathcal{M}^2(k)|^2.\tag{5.77}
$$
Then we recall that external photons are created by the interaction term $\int d^4x\, e j^\mu A_\mu$, and hence $\mathcal{M}^\mu(k)$ is given by
$$
\mathcal{M}^\mu(k)=\int{d^4x e^{ikx}\langle f|\bar{\psi}(x)\gamma^\mu\psi(x)|i\rangle}.\tag{5.78}
$$
I don't understand how we come to this conclusion.
Quantum Field Theory – Why Is $\mathcal{M}(k)$ Given by This? (Ward Identity Derivation)
polarization, quantum-field-theory, ward-identity
Related Solutions
(1) The completeness relationship for a basis of vectors orthonormal with respect to $\eta_{\mu\nu}$ is
\begin{equation}
\eta_{ij}\epsilon^{(i)}_\mu \epsilon^{(j)}_\nu = \eta_{\mu\nu}
\end{equation}
This normalization convention is chosen for Lorentz invariance. I know you said you didn't want that answer, but the point is that the normalization of these vectors is a matter of convention, and it's best to pick a Lorentz-invariant one. One advantage of a Lorentz-invariant normalization is that we don't need to specify the argument: the $\epsilon$ depend on the momentum, but these normalization conditions do not. The $\eta_{ij}$ provides the minus sign you are missing. Here you also see the basic problem that the gauge symmetry fixes: one of the polarization vectors necessarily has a negative norm.
(2) Having said that, $\epsilon_\mu^{(0)}$ and $\epsilon_\mu^{(3)}$ are not valid on-shell quantities. They are a convenient mathematical fiction, needed to make an orthonormal basis, which allows things to be written in a nice, Lorentz-invariant way. But the external legs of Feynman diagrams must be on shell, so you can only put real, honest on-shell polarization vectors there, and you aren't allowed to put $\epsilon^{(0,3)}$ there at all. Put another way, you can't satisfy the equations of motion for the photon with the longitudinal and timelike modes, but the LSZ formula picks out the external wave functions that satisfy the classical equations of motion. However, since $k_\mu \mathcal{M}^\mu=0$, you could add $0$ in the funny combination $\left(\epsilon^{(0)}_\mu-\epsilon^{(3)}_\mu\right)\mathcal{M}^\mu$, which you can then add to your other basis vectors to form $\eta_{\mu\nu}\mathcal{M}^{*\mu}\mathcal{M}^\nu$ when you square to form the probability. If the hypocrisy of this angers you, that is a natural reaction; you'll eventually just accept it. (Welcome to gauge theory.)
(3) EXCELLENT question. You need the off-shell formulation of the Ward identity to give a real answer to this; that's in chapter 7 of P&S. Basically there's more to it than just "replace the external polarization vector by $k_\mu$": you can really show that the parts of the propagator proportional to $k_\mu k_\nu$ never matter, even in loops. However, in Yang-Mills theories the corresponding statement is not true! So your question is exactly on the money for Yang-Mills theories: you get contributions in loops from the longitudinal and timelike modes, and by the optical theorem this, taken at face value, would lead to the production of unphysical particles. The fix is to add yet more unphysical particles to the theory to cancel out these parts of the loop diagrams; they are called Faddeev-Popov ghosts.
After flipping through Peskin and Schroeder to answer this question, I have to say that they are proving things in a very roundabout way. It's good that it teaches how to think about Feynman diagrams in a very detailed way, but there are other, less painful ways to prove and think about the Ward identity (such as using the path integral).
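As a small numerical illustration of points (1) and (2), the sketch below checks the completeness relation and the replacement of the physical polarization sum by a covariant one. It assumes the mostly-minus signature used by Peskin and Schroeder, in which the covariant expression carries an explicit minus sign, $-\eta_{\mu\nu}\mathcal{M}^{*\mu}\mathcal{M}^\nu$ (with the opposite signature the sign is absorbed into the metric). The photon energy and the amplitude components are arbitrary illustrative numbers, not taken from any real process.

```python
# Assumption: mostly-minus metric, eta = diag(+1, -1, -1, -1).
ETA = [1.0, -1.0, -1.0, -1.0]

def lower(v):
    """Lower the index of a contravariant 4-vector."""
    return [ETA[m] * v[m] for m in range(4)]

# Photon moving along z: k^mu = (E, 0, 0, E). E is an illustrative value.
E = 2.5
k = [E, 0.0, 0.0, E]

# Basis vectors eps^{(i) mu} = delta^mu_i; only i = 1, 2 are physical here.
eps_low = [lower([1.0 if m == i else 0.0 for m in range(4)]) for i in range(4)]

# Completeness: eta_{ij} eps^{(i)}_mu eps^{(j)}_nu = eta_{mu nu}.
completeness_ok = all(
    abs(sum(ETA[i] * eps_low[i][m] * eps_low[i][n] for i in range(4))
        - (ETA[m] if m == n else 0.0)) < 1e-12
    for m in range(4) for n in range(4)
)

# Arbitrary complex amplitude obeying k_mu M^mu = 0, i.e. M^0 = M^3 for this k.
M = [0.7 - 0.2j, 1.3 + 0.4j, -0.5 + 2.0j, 0.7 - 0.2j]
ward = sum(ETA[m] * k[m] * M[m] for m in range(4))

# Sum over the two transverse polarizations: |M^1|^2 + |M^2|^2.
transverse_sum = sum(
    abs(sum(eps_low[i][m] * M[m] for m in range(4))) ** 2 for i in (1, 2)
)

# Covariant replacement: -eta_{mu nu} M^{* mu} M^nu.
covariant_sum = -sum(ETA[m] * abs(M[m]) ** 2 for m in range(4))

if __name__ == "__main__":
    print("completeness holds:", completeness_ok)
    print("k.M =", ward)
    print("transverse sum:", transverse_sum, "covariant sum:", covariant_sum)
```

The timelike and longitudinal contributions cancel only because $\mathcal{M}^0=\mathcal{M}^3$; changing one of them breaks the agreement, which is the content of the Ward identity in this check.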
We can write the Fourier transform of $\langle 0|\mathcal{T}A_{\nu}(x)\psi(x_1)\bar\psi(x_2)|0\rangle$ as $$S(p) D_{\nu\alpha}(q) \ e\,\Gamma^{\alpha}(p,q,p+q)S(p+q)$$ where $S(p)$ is the full fermion propagator, $D_{\nu\alpha}(q)$ is the full photon propagator, $\Gamma^{\alpha}(p,q,p+q)$ is the proper vertex function, and an overall momentum conservation delta function has been dropped. Similarly, we can write the Fourier transform of $\langle 0|\mathcal{T}j^{\mu}(x)\psi(x_1)\bar\psi(x_2)|0\rangle $ as $$S(p)V^{\mu}(p,q,p+q)S(p+q)$$ where $V^{\mu}(p,q,p+q)$ is a vertex function that we want to relate to $\Gamma^{\mu}(p,q,p+q).$ The vertex function $V^{\mu}(p,q,p+q)$ enters into the derivation of the Ward-Takahashi identity in Peskin and Schroeder on page 311, but the Ward-Takahashi identity is normally stated in terms of $\Gamma^{\mu}(p,q,p+q)$. Your conundrum (as I understand it) is that according to your analysis of the Schwinger-Dyson equation, $V^{\mu}(p,q,p+q)$ and $\Gamma^{\mu}(p,q,p+q)$ ought to differ by a factor of $Z_3$, but this contradicts the usual statement of the Ward-Takahashi identity where no such factor of $Z_3$ appears. I will argue from the Schwinger-Dyson equation that the longitudinal parts (in $q^{\mu}$) of $V^{\mu}(p,q,p+q)$ and $\Gamma^{\mu}(p,q,p+q)$ are equal, but that the transverse parts differ by the factor of $Z_3$ that you have found. Since only the longitudinal part enters into the Ward-Takahashi identity, the factor of $Z_3$ does not enter into that identity. You may want to review page 246 of Peskin and Schroeder. There they show that only the transverse part of the photon propagator is modified by the self-energy, but that in calculating Feynman diagrams we can simplify the analysis by including the self-energy in the longitudinal part as well, because the longitudinal part does not contribute to the Feynman diagrams due to the Ward identity.
However, the Schwinger-Dyson equation involves an inverse propagator which does not arise in Feynman diagrams and we need to reevaluate where the self-energy does and does not enter.
Specializing the Schwinger-Dyson equation to the case of $\langle 0|\mathcal{T}A_{\nu}(x)\psi(x_1)\bar\psi(x_2)|0\rangle$ and Fourier transforming, we have $$\tag{1} (D^{(0)\mu\nu}(q))^{-1} D_{\nu\alpha}(q) S(p) \ e\,\Gamma^{\alpha}(p,q,p+q)S(p+q) = \\ e\, S(p)V^{\mu}(p,q,p+q)S(p+q)$$ where $(D^{(0)\mu\nu}(q))^{-1}$ is the inverse of the non-interacting photon propagator. The Dyson equation for the photon propagator is $$\tag{2} D_{\nu\alpha}(q) = D^{(0)}_{\nu\alpha}(q) + D^{(0)}_{\nu\beta}(q) i\Pi^{\beta\gamma}(q)D_{\gamma\alpha}(q) ,$$ so $$\tag{3} (D^{(0)\mu\nu}(q))^{-1} D_{\nu\alpha}(q) = \delta^{\mu}_{\alpha} + i\Pi^{\mu\gamma}(q)D_{\gamma\alpha}(q).$$ Equation (1) then implies $$\tag{4}\Bigl(\delta^{\mu}_{\alpha} + i \Pi^{\mu\gamma}(q)D_{\gamma\alpha}(q)\Bigr) \Gamma^{\alpha}(p,q,p+q) = V^{\mu}(p,q,p+q).$$ The Ward identity forces the longitudinal part of $\Pi^{\mu\gamma}(q)$ to vanish; that is, $q_{\mu}\Pi^{\mu\gamma}(q) = 0.$ Contracting equation (4) with $q_{\mu}$, we therefore have $$\tag{5} q_{\alpha} \Gamma^{\alpha}(p,q,p+q) = q_{\mu}V^{\mu}(p,q,p+q)$$ so no factor of $Z_3$ appears between the longitudinal parts of $\Gamma^{\alpha}(p,q,p+q)$ and $V^{\mu}(p,q,p+q)$ and therefore no factor of $Z_3$ appears in the Ward-Takahashi identity.
The transverse component does not enter the Ward identity for the vertex function but it is useful to consider the transverse component to illustrate where the factor of $Z_3$ does arise. Define $\Pi(q^2)$ by the equation $\Pi^{\mu\nu}(q) = q^2(g^{\mu\nu} - q^{\mu}q^{\nu}/q^2)\Pi(q^2).$ The quantity $(g^{\mu\nu} - q^{\mu}q^{\nu}/q^2)$ can be described as a projection operator that projects out the transverse part of a vector. Contracting equation (4) with $(g_{\nu\mu} - q_{\nu}q_{\mu}/q^2)$ and using the fact that $\Pi^{\mu\gamma}(q)$ is already transverse, we have $$\tag{7} \Bigl(g_{\nu\alpha} - q_{\nu}q_{\alpha}/q^2 + i q^2\Pi(q^2)D^T_{\nu\alpha}(q)\Bigr) \Gamma^{\alpha}(p,q,p+q) =\\ \bigl(g_{\nu\mu} - q_{\nu}q_{\mu}/q^2\bigr)V^{\mu}(p,q,p+q),$$ where $$D^T_{\nu\alpha}(q) = \frac{-i}{q^2 (1-\Pi(q^2))} \bigl(g_{\nu\alpha} - q_{\nu}q_{\alpha}/q^2\bigr)$$ is the transverse part of the photon propagator (see page 246, Peskin and Schroeder). Equation (7) can then be written $$\tag{8} \bigl(g_{\nu\alpha} - q_{\nu}q_{\alpha}/q^2\bigr) \Bigl(1/\bigl(1-\Pi(q^2)\bigr)\Bigr) \Gamma^{\alpha}(p,q,p+q) =\\ \bigl(g_{\nu\mu} - q_{\nu}q_{\mu}/q^2\bigr)V^{\mu}(p,q,p+q).$$ Now consider $q^2$ small enough that $\Pi(q^2)\approx \Pi(0)$ and use the relation (Peskin and Schroeder, page 246) $$Z_3 = \Bigl(1/\bigl(1-\Pi(0)\bigr)\Bigr).$$ We have $$\tag{9} \bigl(g_{\nu\alpha} - q_{\nu}q_{\alpha}/q^2\bigr) Z_3 \Gamma^{\alpha}(p,q,p+q) = \bigl(g_{\nu\mu} - q_{\nu}q_{\mu}/q^2\bigr)V^{\mu}(p,q,p+q).$$ So we see that the transverse parts of $V^{\mu}(p,q,p+q)$ and $\Gamma^{\mu}(p,q,p+q)$ differ by a factor of $Z_3.$
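The algebra used between equations (7) and (8) can be checked with exact rational arithmetic. The following is only a sketch under stated assumptions: a mostly-minus metric, an arbitrary off-shell $q^\mu$, and a made-up value for $\Pi(q^2)$. It verifies that $\delta^\mu_\nu - q^\mu q_\nu/q^2$ is transverse and idempotent, and that the scalar coefficient $1 + iq^2\Pi\cdot\bigl(-i/(q^2(1-\Pi))\bigr)$ collapses to $1/(1-\Pi)$.

```python
from fractions import Fraction as F

# Assumption: mostly-minus metric; G holds the diagonal of g_{mu nu}.
G = [F(1), F(-1), F(-1), F(-1)]

# Arbitrary off-shell momentum q^mu (illustrative values only).
q_up = [F(3), F(1), F(-2), F(1)]
q_low = [G[m] * q_up[m] for m in range(4)]
q2 = sum(q_low[m] * q_up[m] for m in range(4))  # q^2 = 9 - 1 - 4 - 1 = 3

# Mixed-index transverse projector P^mu_nu = delta^mu_nu - q^mu q_nu / q^2.
P = [[(F(1) if m == n else F(0)) - q_up[m] * q_low[n] / q2
      for n in range(4)] for m in range(4)]

# Transversality: q_mu P^mu_nu should vanish for every nu.
q_contracted = [sum(q_low[m] * P[m][n] for m in range(4)) for n in range(4)]

# Idempotency: P^mu_rho P^rho_nu should reproduce P^mu_nu.
P_squared = [[sum(P[m][r] * P[r][n] for r in range(4)) for n in range(4)]
             for m in range(4)]

# Scalar coefficient multiplying the projector in equation (7).
# The factors of i combine as i * (-i) = 1, so only real numbers remain.
Pi = F(1, 7)  # made-up value standing in for Pi(q^2)
coefficient = 1 + q2 * Pi / (q2 * (1 - Pi))

if __name__ == "__main__":
    print("q_mu P^mu_nu =", q_contracted)
    print("P^2 == P:", P_squared == P)
    print("coefficient:", coefficient, " 1/(1 - Pi):", 1 / (1 - Pi))
```

Because the projector is idempotent, the term $i q^2 \Pi(q^2) D^T_{\nu\alpha}$ is proportional to the same projector as the first two terms, which is what allows the common factor $1/(1-\Pi(q^2))$ to be pulled out in equation (8).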
Best Answer
It follows from the simplest application of relativistic perturbation theory. The S-matrix is defined in terms of the interaction Hamiltonian $H_I$ as
$$S= T\exp\left(-i \int_{-\infty}^\infty \hat{H_I} dt \right)$$
where the symbol T stands for time ordering, to be applied to the expansion of the exponential in a power series. But since we only deal with first-order perturbation theory here, the time ordering has no effect. So the S-matrix up to first order is:
$$ S = id - i\int_{-\infty}^\infty \hat{H_I} dt$$
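The claim that time ordering only matters beyond first order can be illustrated with a toy model. The sketch below is not QED: it assumes a hypothetical two-level interaction $H_I(t) = (1-t)\sigma_x + t\sigma_z$ on $t\in[0,1]$, scaled by a small coupling, and builds the evolution operator as a product of short-time factors. Both the error of the first-order truncation and the difference between time-ordered and anti-time-ordered products should scale like the coupling squared.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(a, b):
    """Largest entry-wise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def hamiltonian(t):
    """Toy H_I(t) = (1 - t) sigma_x + t sigma_z (hypothetical, non-commuting in time)."""
    return [[t, 1.0 - t], [1.0 - t, -t]]

def evolve(coupling, steps=400, time_ordered=True):
    """Product of factors (1 - i * coupling * H_I(t_j) * dt) over t in [0, 1]."""
    dt = 1.0 / steps
    times = [(j + 0.5) * dt for j in range(steps)]
    if not time_ordered:
        times.reverse()
    u = [[1.0, 0.0], [0.0, 1.0]]
    for t in times:
        h = hamiltonian(t)
        step = [[(1.0 if i == j else 0.0) - 1j * coupling * dt * h[i][j]
                 for j in range(2)] for i in range(2)]
        u = matmul(step, u)  # each new factor is applied from the left
    return u

def first_order(coupling):
    """1 - i * coupling * integral of H_I over [0, 1], integral done by hand."""
    h_int = [[0.5, 0.5], [0.5, -0.5]]
    return [[(1.0 if i == j else 0.0) - 1j * coupling * h_int[i][j]
             for j in range(2)] for i in range(2)]

def truncation_error(coupling):
    """Distance between the ordered product and its first-order truncation."""
    return max_abs_diff(evolve(coupling), first_order(coupling))

def ordering_difference(coupling):
    """Distance between time-ordered and anti-time-ordered products."""
    return max_abs_diff(evolve(coupling, time_ordered=True),
                        evolve(coupling, time_ordered=False))

if __name__ == "__main__":
    for g in (0.01, 0.02):
        print(g, truncation_error(g), ordering_difference(g))
```

Doubling the coupling should roughly quadruple both printed numbers, which indicates that the ordering first shows up in the second-order term.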
Now in QED the interaction operator is just the volume integral of the product of the electron current $j^\mu$ and the electromagnetic 4-potential operator $\hat{A}_\mu$
$$ \hat{H_I} = \int d^3x e \hat{j}^\mu \hat{A}_\mu$$
so plugging this into the preceding expression gives:
$$S = id - i \int d^4x e \hat{j}^\mu \hat{A}_\mu$$
Now we only need to plug in the electromagnetic potential operator into the last expression ($c$ and $c^\dagger$ are the annihilation and creation operators of photons):
$$\hat{A}_\mu = \sum_{\epsilon=1,2}\sum_p [c_{pe} \epsilon_\mu e^{-ipx} + c^\dagger_{pe} \epsilon^\ast_\mu e^{ipx} ]$$
$\epsilon_\mu$ are components of the polarisation vector of the photon.
Considering the emission of a photon means that in the final state there is an additional photon with 4-momentum $k$, so we take the matrix element of the S-matrix $\langle 1_k f| S| 0i\rangle $. Here $1_k$ means a final state with one photon of 4-momentum $k$, together with a transition of the participating electron from initial state $i$ to final state $f$ (note that the corresponding matrix element of the annihilation operator vanishes, $\langle 1|c_{pe}|0\rangle =0$):
$$\langle 1_k f|S| 0i\rangle = \langle 1_k f| 0i\rangle - i \int d^4x e \langle f|j^\mu|i\rangle \sum_{\epsilon=1,2}\sum_p \langle 1_k| c^\dagger_{pe}| 0\rangle \epsilon^\ast_\mu e^{ipx} = - i\int d^4x \sum_{\epsilon=1,2} e \langle f|\bar{\psi}\gamma^\mu \psi|i\rangle e^{ikx} \epsilon^\ast_\mu$$
by using $$\langle 1_k| c^\dagger_{pe}| 0\rangle =\delta_{pk}$$ and the orthogonality of Fock states ($\langle 1| 0\rangle =0$).
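The way $\delta_{pk}$ collapses the mode sum can be made concrete with a toy Fock space. This is a minimal sketch assuming a handful of hypothetical one-dimensional momentum labels, a single polarization, and unit normalization of each mode; it is not the full QED field operator. It evaluates $\langle \text{final}|\hat A(x)|\text{initial}\rangle$ term by term.

```python
import cmath
import math

# Hypothetical 1D momenta labelling the photon modes (illustrative only).
MODES = [-2.0, -1.0, 0.5, 1.0, 3.0]
VACUUM = tuple(0 for _ in MODES)

def one_photon(k_index):
    """Occupation-number state with exactly one photon in mode k_index."""
    return tuple(1 if i == k_index else 0 for i in range(len(MODES)))

def create(p_index, state):
    """Apply c^dagger_p: returns (amplitude, new state)."""
    n = list(state)
    amplitude = math.sqrt(n[p_index] + 1)
    n[p_index] += 1
    return amplitude, tuple(n)

def annihilate(p_index, state):
    """Apply c_p: returns (amplitude, new state); amplitude 0 if the mode is empty."""
    n = list(state)
    if n[p_index] == 0:
        return 0.0, state
    amplitude = math.sqrt(n[p_index])
    n[p_index] -= 1
    return amplitude, tuple(n)

def field_matrix_element(final, initial, x, eps=1.0 + 0.0j):
    """<final| A(x) |initial> with A = sum_p [c_p eps e^{-ipx} + c_p^dagger eps^* e^{+ipx}]."""
    total = 0j
    for i, p in enumerate(MODES):
        amplitude, state = annihilate(i, initial)
        if state == final:
            total += amplitude * eps * cmath.exp(-1j * p * x)
        amplitude, state = create(i, initial)
        if state == final:
            total += amplitude * eps.conjugate() * cmath.exp(1j * p * x)
    return total

if __name__ == "__main__":
    x = 0.7
    k_index = 3  # the mode with momentum 1.0
    print("emission:  ", field_matrix_element(one_photon(k_index), VACUUM, x))
    print("expected:  ", cmath.exp(1j * MODES[k_index] * x))
    print("vacuum-vacuum:", field_matrix_element(VACUUM, VACUUM, x))
```

Only the creation term of the mode with $p=k$ survives for emission, leaving $\epsilon^\ast e^{ikx}$; the annihilation part contributes nothing when the initial state contains no photon.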
By defining, as in (5.78), $$\mathcal{M}^\mu(k) = \int d^4x\, \langle f|\bar{\psi}\gamma^\mu \psi|i\rangle e^{ikx}$$
we get:
$$\langle 1_k f|S| 0i\rangle = -i e\sum_{\epsilon=1,2}\epsilon^\ast_\mu \mathcal{M}^\mu(k)$$
so the S-matrix element for a transition from an electron state $i$ to an electron state $f$, accompanied by the emission of a photon of 4-momentum $k$ and polarisation state $\epsilon$, is as given above.
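To see what the factor $e^{ikx}$ does inside the integral, here is a one-dimensional lattice sketch. It assumes, purely for illustration, plane-wave electron states so that $\langle f|j(x)|i\rangle$ depends on $x$ only through $e^{i(p_f-p_i)x}$; spinor factors are dropped, and the lattice size and momentum labels are hypothetical. The sum over sites plays the role of $\int d^4x$.

```python
import cmath
import math

N = 16  # number of lattice sites (hypothetical toy size)

def momentum(n):
    """Lattice momentum 2*pi*n/N for integer label n."""
    return 2.0 * math.pi * n / N

def current_matrix_element(x, n_initial, n_final):
    """Toy <f| j(x) |i> for plane-wave states: only the phase e^{i (p_f - p_i) x} is kept."""
    return cmath.exp(1j * (momentum(n_final) - momentum(n_initial)) * x)

def amplitude(n_photon, n_initial, n_final):
    """Lattice analogue of M(k) = sum_x e^{ikx} <f| j(x) |i>."""
    return sum(
        cmath.exp(1j * momentum(n_photon) * x)
        * current_matrix_element(x, n_initial, n_final)
        for x in range(N)
    )

if __name__ == "__main__":
    print("p_i = p_f + k :", abs(amplitude(3, 5, 2)))
    print("p_i != p_f + k:", abs(amplitude(3, 5, 1)))
```

In this toy model the amplitude is nonzero only when the photon momentum equals the momentum lost by the electron, so the Fourier factor enforces momentum conservation.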
Finally, to obtain the transition probability, we take the squared modulus of the $S$-matrix element:
$$| \langle 1_k f|S| 0i\rangle |^2 =e^2 \sum_{\epsilon=1,2}|\epsilon^\ast_\mu \mathcal{M}^\mu(k)|^2$$
because the mixed terms cancel due to the orthogonality of the polarisation states. We pick up an additional factor of $e^2$, but this does not affect the flow of the derivation.
Important remark: the emission of a single real photon is actually not possible for free electrons (only possible for bound electrons), so we have to imagine the states $i$ and $f$ of the participating electron as bound.
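The remark about free electrons can be checked with simple kinematics. If the final electron $p'$ and the photon $k$ are both on shell, then $(p'+k)^2 = m^2 + 2\,p'\cdot k$, and a free initial electron would need this to equal $m^2$. The sketch below assumes natural units, an electron mass of 0.511 (MeV), and an arbitrary grid of momenta, photon energies and emission angles; it confirms that $p'\cdot k$ stays strictly positive on that grid.

```python
import math

def invariant_mass_squared_after_emission(mass, p_final, omega, cos_theta):
    """(p' + k)^2 for an on-shell electron p' and an on-shell photon k (natural units)."""
    energy_final = math.sqrt(mass ** 2 + p_final ** 2)
    # p'.k = E' * omega - |p'| * omega * cos(theta) for a massless photon.
    p_dot_k = omega * (energy_final - p_final * cos_theta)
    return mass ** 2 + 2.0 * p_dot_k

def minimum_excess(mass=0.511):
    """Smallest value of (p' + k)^2 - m^2 over an illustrative grid."""
    momenta = [0.0, 0.1, 1.0, 10.0, 100.0]
    photon_energies = [0.01, 1.0, 50.0]
    cosines = [-1.0, -0.5, 0.0, 0.5, 1.0]
    return min(
        invariant_mass_squared_after_emission(mass, p, w, c) - mass ** 2
        for p in momenta for w in photon_energies for c in cosines
    )

if __name__ == "__main__":
    print("smallest (p' + k)^2 - m^2 on the grid:", minimum_excess())
```

Since $E' > |\mathbf{p}'|$ for any massive particle, the excess never reaches zero for a photon of nonzero energy, which is why the states $i$ and $f$ have to be thought of as bound.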