[Physics] Why do (can) we impose local gauge invariance

gauge-invariance, gauge-theory, quantum-field-theory

Firstly, let me say that I understand that what basically happens in gauge theories is that we keep the unphysical degrees of freedom present but in check, instead of removing them at once, which besides being generally really hard to do would cause further headaches related to Lorentz invariance.

I was trying to follow the line of thought in Ryder's Quantum Field Theory (pp. 90–97) to explain elementary gauge theory.

He shows that the Klein-Gordon field theory is invariant (because its action is) under the global transformation

$$
\phi \to e^{i\Lambda}\phi.
$$

However, he then argues that such a transformation would contradict the relativistic causality mantra (because it transforms the internal degrees of freedom in the whole space at the same time) and uses this fact to justify the local gauge invariance construction, which happens by letting $\Lambda \to \Lambda(x)$ and forcing $\delta \mathcal{L}=0$ (since this is, initially, spoiled by the derivatives of the parameter function) by coupling a new field $A^{\mu}$ to the Noether current in a smart way.

Now, my questions:

(1) I don't understand the violation of the causality argument. That is, why would a theoretical manipulation of a physically irrelevant feature (the phase) matter physically as far as causality goes? Also, as I understand it, the 'localization' of the transformation does not necessarily solve the dilemma as naively put by the author. Even if the condition subsequently imposed on $\partial_{\mu}\Lambda$ excludes the forbidden possibilities, it surely doesn't seem to do so trivially: for example, it seems to me that, in any case, the variation of the fields outside the lightcone should be zero.

(2) And more importantly, even if I had understood the preceding argument, why does it hint that we should make the local form of the same transformation work (that is, besides the fact that the form works for each point in space-time)? I mean, because the trivially obtained global 'symmetry' is turned down by causality, we go on and invent a new field (that by comparison turns out to be the electromagnetic potential) and insert it by hand in the Lagrangian so that the same symmetry persists locally, now without the causality problem. But aren't we inventing things? Why is the next move after the global gauge theory 'fails' to force a local one (even creating a gauge field to do so) rather than abandoning the theory altogether? And if we were already going to go all the way to make it work, why maintain the original form of the transformations (that is, keep the invariance under U(1))?

I understand that possible answers to (2) are ideas like 'well, we tried and it worked', but it is clear that there are more things that I'm missing.

Best Answer

I'm with you. I don't want to be unprofessional, but I find the whole "breaking causality" thing to be completely bogus. I see absolutely no way that the humble Klein-Gordon field "breaks causality." In my opinion, just ignore it.

"Why" we consider gauge invariant theories is a good question, and there are many answers. I will approach the question from only one possible direction. When you couple a gauge field to another field, the gauge field must necessarily be coupled to a conserved current. This is a manifestation of Noether's second theorem. That is, if you want your field to be sourced by a conserved current, using a field with gauge invariance is the easiest way to do it.

Let's restrict our discussion to the simplest possible gauge field: the free electromagnetic field, described by the vector potential $A^\mu$. The Lagrangian is $$ \mathcal{L} = -\frac{1}{4} F^{\mu \nu} F_{\mu \nu} , $$ with $$ F_{\mu \nu} = \partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}. $$ Now we all know that this Lagrangian is invariant if we substitute $$ A_\mu \to A_\mu + \varepsilon \partial_\mu \Lambda $$ for some constant parameter $\varepsilon$. This is just our gauge invariance. In other words, for constant $\varepsilon$, $$ \delta \mathcal{L} = 0. $$

Let us now carry out the "Noether procedure," in the slick way I like to do it. Let's make $\varepsilon$ spacetime dependent, that is, $\varepsilon = \varepsilon(x)$. We are still keeping it very small, so second order terms will not matter. Under this variation, you can pull out a piece of paper and find $$ \delta F_{\mu \nu} = \partial_\mu \varepsilon \partial_\nu \Lambda - \partial_\nu \varepsilon \partial_\mu \Lambda $$ and $$ \delta \mathcal{L} = -\frac{1}{2} F^{\mu \nu}(\partial_\mu \varepsilon \partial_\nu \Lambda - \partial_\nu \varepsilon \partial_\mu \Lambda). $$ Performing an integration by parts in our unwritten integral and using the antisymmetry of $F^{\mu \nu}$, this becomes $$ \delta \mathcal{L} = \varepsilon \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda). $$

On solutions to the equations of motion, $\delta S$ must be $0$. This is just the principle of least action: any tiny variation must keep the action stationary. Imposing boundary conditions on $\varepsilon$ (it vanishes at the boundary, so the total derivative dropped above does not contribute), we can see that on solutions to the equations of motion, $$ \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) = 0. $$ In other words, $F^{\mu \nu} \partial_\nu \Lambda$ is a conserved current.
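The integration-by-parts step can be spelled out as follows. This is only a sketch of the algebra; it relabels $\mu \leftrightarrow \nu$ in the second term and uses $F^{\nu \mu} = -F^{\mu \nu}$:

$$
-\frac{1}{2} F^{\mu \nu}(\partial_\mu \varepsilon \, \partial_\nu \Lambda - \partial_\nu \varepsilon \, \partial_\mu \Lambda) = -F^{\mu \nu} \, \partial_\mu \varepsilon \, \partial_\nu \Lambda = -\partial_\mu(\varepsilon F^{\mu \nu} \partial_\nu \Lambda) + \varepsilon \, \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda).
$$

The first term on the right is a total derivative, which integrates to a boundary term and drops out of $\delta S$ once $\varepsilon$ is taken to vanish at the boundary. What remains is the expression for $\delta \mathcal{L}$ quoted above.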

What I have just shown you is simply Noether's first theorem, albeit presented in a somewhat different way than usual. Interestingly, we have found that for any function $\Lambda$ on spacetime, we have a conserved current! We have found infinitely many conserved currents!

Why does no one ever talk about this? Well, it's not the best way to see what's going on, and you'll see why in a second.

Because $\partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) = 0$, we trivially have

$$ \int d^4 x \, \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) = 0 $$ on solutions to the equations of motion. Furthermore, we have $F^{\mu \nu} \partial_\mu \partial_\nu \Lambda = 0$ because $F^{\mu \nu}$ is antisymmetric and $\partial_\mu \partial_\nu \Lambda$ is symmetric. Therefore, $$ \int d^4 x \, (\partial_\nu \Lambda ) (\partial_\mu F^{\mu \nu} )= 0. $$ Integrating by parts, we have $$ -\int d^4 x \, \Lambda \partial_\mu \partial_\nu F^{\mu \nu} = 0. $$ Let's now duplicate the trick of Noether's first theorem. But instead of thinking about varying $\varepsilon$, let's think about varying $\Lambda$! Because the above integral must be $0$ for any $\Lambda$, we have $$ \partial_\mu \partial_\nu F^{\mu \nu} = 0. $$ This is the conservation equation for the electromagnetic field. Furthermore, it is an example of Noether's second theorem, which we have seen is like "Noether's theorem twice."

You may object that deriving $\partial_\mu \partial_\nu F^{\mu \nu} = 0$ is not very impressive. (It follows directly from the antisymmetry of $F^{\mu \nu}$.) The impressive part will be the next part.

Let's say we want to couple our gauge field to some source current $J^\mu$: $$ \mathcal{L} = -\frac{1}{4} F^{\mu \nu} F_{\mu \nu} + J^\mu A_\mu. $$ Let us now suppose that our Lagrangian has the same gauge symmetry as before. I will now show that in that case, $J^\mu$ must be a conserved current.

Varying $$ A_\mu \to A_\mu + \varepsilon \partial_\mu \Lambda $$ we now have $$ \delta \mathcal{L} = \varepsilon \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + \varepsilon J^\mu \partial_\mu \Lambda $$ which implies that on solutions to the equations of motion, $$ \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + J^\mu \partial_\mu \Lambda = 0. $$ Integrating $0$ over all of spacetime, we have $$ \int d^4 x \, \big( \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + J^\mu \partial_\mu \Lambda \big) = 0. $$ Integrating by parts, $$ -\int d^4 x \, \Lambda \big( \partial_\mu \partial_\nu F^{\mu \nu} + \partial_\mu J^\mu \big) = 0. $$ As $\partial_\mu \partial_\nu F^{\mu \nu} = 0$ is trivially true, and as the above equation is true for all $\Lambda$, we see that $$ \partial_\mu J^\mu = 0. $$ So if we want our gauge field to be sourced by a current, and we make sure that our full Lagrangian is gauge invariant, our current will necessarily be conserved!
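As a cross-check, the same conclusion can be read off from the equation of motion. This is a sketch that assumes $J^\mu$ is an external source that does not itself depend on $A_\mu$. Varying $A_\nu$ in the Lagrangian with the $J^\mu A_\mu$ coupling and integrating by parts gives

$$
\delta \mathcal{L} = -F^{\mu \nu} \partial_\mu \delta A_\nu + J^\nu \delta A_\nu \quad \Longrightarrow \quad \partial_\mu F^{\mu \nu} = -J^\nu .
$$

Taking the divergence of both sides and using the antisymmetry of $F^{\mu \nu}$,

$$
\partial_\nu J^\nu = -\partial_\nu \partial_\mu F^{\mu \nu} = 0 .
$$

So the equation of motion has no solutions at all unless the source is conserved.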

So let's take a step back. In general, coupling fields to conserved currents is a desirable thing to do. In the example above, the electromagnetic field was sourced by electric current, although in other examples the current can be less familiar.

If you ask the question, "I want to couple a field to a conserved current, how exactly can I do that?" I have now shown you that if you ensure your field has a gauge symmetry, you cannot fail. This is perhaps the easiest way to see "why" gauge fields are desirable. Furthermore, this argument takes the element of "cleverness" out of constructing your Lagrangian. A global $U(1)$ charge symmetry gives you a conserved current via Noether's first theorem. If you then want to put that conserved current to work by coupling it to another field (making it interact) while still keeping it conserved, then promoting your global symmetry to a local symmetry will do just the trick!
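For concreteness, here is how the promotion works for a complex scalar field. This is a sketch under assumed conventions: the coupling constant $e$, the sign in $D_\mu$, and the normalization of the current are my choices for illustration and may differ from Ryder's. The globally invariant Lagrangian and its Noether current are

$$
\mathcal{L}_0 = \partial_\mu \phi^* \partial^\mu \phi - m^2 \phi^* \phi, \qquad j^\mu = i(\phi^* \partial^\mu \phi - \phi \, \partial^\mu \phi^*).
$$

Replacing $\partial_\mu$ by $D_\mu = \partial_\mu + i e A_\mu$ in the kinetic term gives

$$
(D_\mu \phi)^* D^\mu \phi = \partial_\mu \phi^* \partial^\mu \phi - e A_\mu j^\mu + e^2 A_\mu A^\mu \phi^* \phi .
$$

This is invariant under $\phi \to e^{i\Lambda(x)}\phi$, $A_\mu \to A_\mu - \frac{1}{e}\partial_\mu \Lambda$, because $D_\mu \phi \to e^{i\Lambda(x)} D_\mu \phi$. At first order in $e$ the gauge field couples to exactly the Noether current of the global symmetry; the $e^2$ term is what makes the local invariance exact.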


Edit: Another, perhaps more canonical and straightforward way to derive $\partial_\mu \partial_\nu F^{\mu \nu} = 0$ in a Noether's second theorem way is as follows. Note that for $\delta A_\mu = \partial_\mu \Lambda$, $\delta \mathcal{L} = 0$ off shell for any $\Lambda$. As $$\delta \mathcal{L} = -F^{\mu \nu} \partial_\mu \delta A_\nu = - F^{\mu \nu} \partial_\mu \partial_\nu \Lambda,$$ we see that we have the off shell equation $$ 0 = \delta_{\rm offshell} S = - \int d^4 x \, F^{\mu \nu} \partial_\mu \partial_\nu \Lambda. $$ Integrating by parts twice and remembering that $\Lambda$ can be chosen freely, we find the off shell equation $\partial_\mu \partial_\nu F^{\mu \nu} = 0$. As our vacuum equation of motion is $\partial_\mu F^{\mu \nu} = 0$, we see that Noether's second theorem is telling us that a gauge symmetry implies an off shell (i.e., tautological) relationship between the equations of motion. This means that our equations of motion do not provide a unique solution for given initial conditions. But we already could have guessed this: we are free to perform local gauge transformations however we want, so of course the equations of motion do not provide unique solutions! If we were to supplement our action with a current coupling $J^\mu A_\mu$, then the necessity of this relationship between the equations of motion is what requires $\partial_\mu J^\mu = 0$ when we have gauge symmetry.
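The two integrations by parts can be written out explicitly. This is a sketch that assumes $\Lambda$ falls off fast enough (for example, has compact support) for the boundary terms to vanish, an assumption not stated above:

$$
-\int d^4 x \, F^{\mu \nu} \partial_\mu \partial_\nu \Lambda = \int d^4 x \, (\partial_\mu F^{\mu \nu}) \, \partial_\nu \Lambda = -\int d^4 x \, \Lambda \, \partial_\nu \partial_\mu F^{\mu \nu} .
$$

Since the left-hand side vanishes for every such $\Lambda$, the integrand on the right must vanish pointwise, which is the off shell identity $\partial_\mu \partial_\nu F^{\mu \nu} = 0$.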

Remember: Noether's first theorem says that a (non-gauge) symmetry of the action gives a conserved quantity on shell. Noether's second theorem says that a gauge symmetry (which depends on a local function) implies an off shell relationship between the equations of motion. When coupling to a current, this relationship between the equations of motion requires that the current be conserved.