[Physics] Deriving Yang-Mills Equations

gauge-theory, homework-and-exercises, lagrangian-formalism, variational-principle, yang-mills

In the same spirit as this unanswered question, I am posing a question that I have been working on for some time. Here I work in dimension $n = 4$ (identifying $\mathbb H = \mathbb R^4$) and consider the principal $SU(2)$-bundle with $\rho : SU(2) \to GL(\mathfrak{su}(2))$ being the adjoint representation $ad_g (A) = g^{-1} A g$, for all $g \in SU(2)$ and $A \in \mathfrak {su}(2)$. Moreover, the gauge potential is written (in local coordinates) as $\mathcal A = \mathcal A_{\alpha} dx^\alpha$, $\alpha = 1,2,3,4$, and its gauge field strength (curvature) is given by $$\mathcal F = d\mathcal A + \frac{1}{2}[\mathcal A, \mathcal A] = \frac{1}{2} \mathcal F_{\alpha\beta} dx^\alpha \wedge dx^\beta$$ where (after some lengthy calculation) $$\mathcal F_{\alpha\beta} = \partial_\alpha \mathcal A_\beta - \partial_\beta\mathcal A_\alpha + [\mathcal A_\alpha, \mathcal A_\beta]\ \ , \ \ \alpha, \beta= 1,2,3,4$$
where $\partial_\alpha = \frac{\partial}{\partial x^\alpha}$.

Question:

Derive the Euler-Lagrange equations of the Yang-Mills functional given by $$\mathcal {YM} (\mathcal A) = \frac{1}{4}\int_{\mathbb R^4} \|\mathcal F\|^2 d(\bf vol_{\mathbb R^4})$$
These equations are known in the physics literature as the Yang-Mills equations: $$\ast d^{\mathcal A}(\ast \mathcal F) = \sum_{\alpha = 1}^4 (\partial_\alpha \mathcal F_{\alpha\beta} + [\mathcal A_\alpha, \mathcal F_{\alpha\beta}]) = 0$$ where $\ast$ is the Hodge dual operator and $d^{\mathcal A}$ is the covariant exterior derivative.

Attempt: Since the space of gauge potentials is an affine space, we may consider the variations $\mathcal A + t\mathcal B$ (a one-parameter family of gauge potentials). Thus

$$\begin{aligned}\frac{d}{dt} \Big(\mathcal {YM}(\mathcal A + t \mathcal B)\Big)\Big|_{t= 0} &= \frac{1}{4}\int_{\mathbb R^4} \frac{d}{dt} \Big(\|\mathcal F\|^2\Big)\Big|_{t = 0} d (\bf vol_{\mathbb R^4})\\ &=\frac{1}{2} \int_{\mathbb R^4} \|\mathcal F\| \frac{\left\langle \mathcal F, \frac{d}{dt} (\mathcal F)\Big|_{t =0}\right\rangle}{\|\mathcal F\|} d(\bf vol_{\mathbb R^4})\\&=\frac{1}{2} \int_{\mathbb R^4} \left\langle \mathcal F, \frac{d}{dt} (\mathcal F)\Big|_{t=0}\right\rangle d(\bf vol_{\mathbb R^4})\end{aligned}$$ where

$$\begin{aligned} \mathcal F &= \frac{1}{2} \mathcal F_{\alpha \beta} dx^\alpha \wedge dx^\beta \\&= \frac{1}{2}\Bigg( \partial_\alpha (\mathcal A_\beta + t \mathcal B_\beta) - \partial_\beta (\mathcal A_\alpha + t \mathcal B_\alpha) + [\mathcal A_\alpha, \mathcal A_\beta] \\&+ t [\mathcal A_\alpha, \mathcal B_\beta] + t [\mathcal B_\alpha, \mathcal A_\beta] + t^2 [\mathcal B_\alpha, \mathcal B_\beta]\Bigg) dx^\alpha \wedge dx^\beta\end{aligned}$$ by linearity we then obtain

$$\frac{d}{dt} (\mathcal F)\Big|_{t = 0} = \Bigg(\partial_\alpha \mathcal B_\beta - \partial_\beta \mathcal B_\alpha + [\mathcal A _\alpha, \mathcal B_\beta] + [\mathcal B_\alpha, \mathcal A_\beta]\Bigg) dx^\alpha \wedge dx^\beta$$
Now

$$\begin{aligned}\frac{d}{dt}\Big(\mathcal {YM}(\mathcal A + t \mathcal B)\Big)\Big|_{t = 0} &=\frac{1}{2} \int_{\mathbb R^4} \left\langle \mathcal F, \frac{d}{dt} (\mathcal F)\Big|_{t=0}\right\rangle d(\bf vol_{\mathbb R^4}) \\&= \frac{1}{2}\int_{\mathbb R^4} \mathcal F_{\alpha\beta} (\partial_\alpha \mathcal B_\beta - \partial_\beta \mathcal B_\alpha + [\mathcal A _\alpha, \mathcal B_\beta] + [\mathcal B_\alpha, \mathcal A_\beta])d(\bf vol_{\mathbb R^4}) \end{aligned}$$
The next step is integration by parts. Using the divergence theorem (for $f$, $g$ decaying sufficiently fast at infinity),

$$\int_{\mathbb R^4} \partial_i (f) g dV_g = - \int_{\mathbb R^4} f \partial_i (g) dV_g $$ and the fact that $\mathcal F_{\beta\alpha} = - \mathcal F_{\alpha\beta}$, we have

$$\begin{aligned}\delta\mathcal {YM}(\mathcal A)&=\frac{1}{2} \int_{\mathbb R^4} \left\langle \mathcal F, \frac{d}{dt} (\mathcal F)\Big|_{t=0}\right\rangle d(\bf {vol}_{\mathbb R^4}) \\&= \frac{1}{2}\int_{\mathbb R^4} \mathcal F_{\alpha\beta} (\partial_\alpha \mathcal B_\beta - \partial_\beta \mathcal B_\alpha + [\mathcal A _\alpha, \mathcal B_\beta] + [\mathcal B_\alpha, \mathcal A_\beta])d(\bf {vol}_{\mathbb R^4})\\&= \frac{1}{2}\left(\int_{\mathbb R^4} \mathcal F_{\alpha\beta} (\partial_\alpha \mathcal B_\beta) d(\bf {vol}_{\mathbb R^4}) - \int_{\mathbb R^4}\mathcal F_{\alpha\beta}(\partial_\beta \mathcal B_\alpha) d(\bf {vol}_{\mathbb R^4})\right) \\&+ \frac{1}{2}\left(\int_{\mathbb R^4} (\mathcal F_{\alpha\beta}[\mathcal A_\alpha, \mathcal B_\beta] + \mathcal F_{\alpha\beta}[\mathcal B_\alpha, \mathcal A_\beta])d(\bf {vol}_{\mathbb R^4}) \right)\\&= -\frac{1}{2}\left(\int_{\mathbb R^4} 2\mathcal B_\beta (\partial_\alpha \mathcal F_{\alpha\beta}) d(\bf {vol}_{\mathbb R^4})\right) + \frac{1}{2}\left(\int_{\mathbb R^4} (\mathcal F_{\alpha\beta}[\mathcal A_\alpha, \mathcal B_\beta] + \mathcal F_{\alpha\beta}[\mathcal B_\alpha, \mathcal A_\beta])d(\bf {vol}_{\mathbb R^4}) \right)\\&=-\int_{\mathbb R^4} \mathcal B_\beta (\partial_\alpha \mathcal F_{\alpha\beta}) d(\bf {vol}_{\mathbb R^4}) \\&+ \frac{1}{2}\left(\int_{\mathbb R^4} (\mathcal F_{\alpha\beta}[\mathcal A_\alpha, \mathcal B_\beta] + \mathcal F_{\alpha\beta}[\mathcal B_\alpha, \mathcal A_\beta])d(\bf {vol}_{\mathbb R^4}) \right)\\&=-\int_{\mathbb R^4} \mathcal B_\beta (\partial_\alpha \mathcal F_{\alpha\beta}) d(\bf {vol}_{\mathbb R^4}) + \int_{\mathbb R^4} \mathcal F_{\alpha\beta}[\mathcal A_\alpha, \mathcal B_\beta] d(\bf {vol}_{\mathbb R^4})\end{aligned}$$

I need only show that

$$\int_{\mathbb R^4} \mathcal F_{\alpha\beta}[\mathcal A_\alpha, \mathcal B_\beta] d(\bf {vol}_{\mathbb R^4}) = - \int_{\mathbb R^4} \mathcal B_\beta [\mathcal A_\alpha,\mathcal F_{\alpha\beta}]d (\bf {vol}_{\mathbb R^4}) $$

Where I want to get:
$$\begin{aligned}\delta (\mathcal {YM}) (\mathcal A) = - \int_{\mathbb R^4} \mathcal B_\beta (\partial_\alpha \mathcal F_{\alpha\beta} + [\mathcal A_\alpha, \mathcal F_{\alpha \beta} ]) d(\bf vol_{\mathbb R^4})\end{aligned}$$ then the stationary points satisfy

$$\sum_{\alpha = 1}^4 (\partial_\alpha \mathcal F_{\alpha\beta} + [\mathcal A_\alpha, \mathcal F_{\alpha\beta}]) = 0$$

Best Answer

Let me discuss the mathematically precise derivation of the Yang-Mills equation in full generality. For this, let us first fix some notation:

  1. Let $P$ be a principal bundle over some (compact, oriented) pseudo-Riemannian manifold $(\mathcal{M},g)$ (spacetime) with structure group given by a compact (and finite-dimensional) Lie group $G$. The corresponding Lie algebra is denoted by $\mathfrak{g}$.
  2. Furthermore, let $\langle\cdot,\cdot\rangle_{\mathfrak{g}}$ be an $\mathrm{Ad}$-invariant inner product on $\mathfrak{g}$, or more generally, a non-degenerate, $\mathrm{Ad}$-invariant and symmetric bilinear form. For example, if $G$ is simple, then this is usually nothing other than a negative multiple of the Killing form.
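
As a concrete illustration (a sketch only, assuming $G=SU(2)$ and the trace form as the chosen normalization, neither of which is fixed by the general discussion), one can take $$\langle X,Y\rangle_{\mathfrak{g}}:=-\mathrm{tr}(XY),\qquad X,Y\in\mathfrak{su}(2),$$ which is $-\frac{1}{4}$ times the Killing form of $\mathfrak{su}(2)$. It is positive definite because $X^{\dagger}=-X$ gives $-\mathrm{tr}(X^{2})=\mathrm{tr}(X^{\dagger}X)\geq 0$, and it is $\mathrm{Ad}$-invariant by cyclicity of the trace: $$\langle gXg^{-1},gYg^{-1}\rangle_{\mathfrak{g}}=-\mathrm{tr}(gXYg^{-1})=\langle X,Y\rangle_{\mathfrak{g}}\qquad\text{for all }g\in SU(2).$$ Differentiating at the identity gives the infinitesimal version $$\langle[Z,X],Y\rangle_{\mathfrak{g}}+\langle X,[Z,Y]\rangle_{\mathfrak{g}}=0\qquad\text{for all }X,Y,Z\in\mathfrak{su}(2),$$ which is the form of invariance that enters when commutator terms are moved from one slot of the inner product to the other.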

Now, in order to formulate the action of Yang-Mills theory, we first take a connection $1$-form $A\in\Omega^{1}(P,\mathfrak{g})$, which corresponds to a gauge field in physics terminology. The corresponding curvature $F^{A}\in\Omega^{2}(P,\mathfrak{g})$ is defined by $$F^{A}=\mathrm{d}A+\frac{1}{2}[A\wedge A].$$ Since we want to define an integral over the spacetime $\mathcal{M}$ and not over $P$, we need to translate the curvature into a field defined on $\mathcal{M}$. This can be done as follows: there is a canonical isomorphism $$\Omega^{k}_{\mathrm{hor}}(P,\mathfrak{g})^{\mathrm{Ad}}\cong\Omega^{k}(\mathcal{M},\mathrm{Ad}(P)),$$ where $\Omega^{k}_{\mathrm{hor}}(P,\mathfrak{g})^{\mathrm{Ad}}$ is the subspace of $\Omega^{k}(P,\mathfrak{g})$ consisting of horizontal and $\mathrm{Ad}$-equivariant forms (properties which are fulfilled by $F^{A}$) and where $\mathrm{Ad}(P)=P\times_{\mathrm{Ad}}\mathfrak{g}$ denotes the "adjoint bundle", which is a vector bundle defined as a certain quotient of $P\times\mathfrak{g}$. (In mathematics, this is a particular case of the so-called "associated vector bundles", which can be defined for every principal bundle and every representation $(\rho,V)$ of $G$.)

As a second ingredient, we have to define an inner product $$\langle\cdot,\cdot\rangle_{\mathrm{Ad}(P)}:\Omega^{k}(\mathcal{M},\mathrm{Ad}(P))\times\Omega^{k}(\mathcal{M},\mathrm{Ad}(P))\to C^{\infty}(\mathcal{M}).$$ This can be done in the obvious way: Take $\alpha,\beta\in\Omega^{k}(\mathcal{M},\mathrm{Ad}(P))$. If we take some local frame $\{e_{a}\}_{a=1}^{\mathrm{dim}(G)}\subset\Gamma(U,\mathrm{Ad}(P))$, that is, a family of local sections of $\mathrm{Ad}(P)$ which form a basis at each point, then we can write $$\alpha\vert_{U}=\sum_{a=1}^{\mathrm{dim}(G)}\alpha^{a}\otimes e_{a}\hspace{1cm}\text{and}\hspace{1cm}\beta\vert_{U}=\sum_{a=1}^{\mathrm{dim}(G)}\beta^{a}\otimes e_{a},$$ where $\alpha^{a},\beta^{a}\in\Omega^{k}(U)$ are real-valued forms. Using this, we define the inner product by $$\langle\alpha,\beta\rangle_{\mathrm{Ad}(P)}\vert_{U}:=\sum_{a,b=1}^{\mathrm{dim}(G)}\langle\alpha^{a},\beta^{b}\rangle\langle e_{a},e_{b}\rangle_{\mathfrak{g}},$$ where $\langle\alpha^{a},\beta^{b}\rangle$ denotes just the usual inner product of real-valued forms and where $\langle e_{a},e_{b}\rangle_{\mathfrak{g}}$ has to be understood point-wise. After all these preliminaries, we define the Yang-Mills action via

$$\mathcal{S}_{\mathrm{YM}}[A]:=\int_{\mathcal{M}}\Vert F^{A}_{\mathcal{M}}\Vert^{2}_{\mathrm{Ad}(P)}\,\mathrm{d}\mathrm{vol}_{g}$$

where $F_{\mathcal{M}}^{A}$ is the curvature viewed as an element of $\Omega^{2}(\mathcal{M},\mathrm{Ad}(P))$ via the above isomorphism and where $\mathrm{d}\mathrm{vol}_{g}$ denotes the usual measure on pseudo-Riemannian manifolds. In order to derive the equations of motion, we first of all observe that

$$F^{A+t\alpha}=F^{A}+t(\mathrm{d}\alpha+[A\wedge\alpha])+\frac{t^{2}}{2}[\alpha\wedge\alpha]$$ for all $\alpha\in\Omega^{1}_{\mathrm{hor}}(P,\mathfrak{g})^{\mathrm{Ad}}$. As a consequence, we have that $$F^{A+t\alpha}_{\mathcal{M}}=F_{\mathcal{M}}^{A}+t(\mathrm{d}_{A}\alpha_{\mathcal{M}})+\mathcal{O}(t^{2}),$$ where $\alpha_{\mathcal{M}}\in\Omega^{1}(\mathcal{M},\mathrm{Ad}(P))$ corresponds to the form $\alpha$ via the isomorphism explained above.

As a last ingredient, we need the following fact: $$\int_{\mathcal{M}}\langle\mathrm{d}_{A}\alpha,\beta\rangle_{\mathrm{Ad}(P)}\,\mathrm{d}\mathrm{vol}_{g}=\int_{\mathcal{M}}\langle\alpha,\mathrm{d}_{A}^{\ast}\beta\rangle_{\mathrm{Ad}(P)}\,\mathrm{d}\mathrm{vol}_{g}$$ for all $\alpha\in\Omega^{k}(\mathcal{M},\mathrm{Ad}(P))$ and for all $\beta\in\Omega^{k+1}(\mathcal{M},\mathrm{Ad}(P))$, where $$\mathrm{d}^{\ast}_{A}:\Omega^{k+1}(\mathcal{M},\mathrm{Ad}(P))\to\Omega^{k}(\mathcal{M},\mathrm{Ad}(P))$$ denotes the codifferential. This formula extends to bundle-valued differential forms the well-known theorem for real-valued forms, which states that the codifferential is the formal adjoint of the exterior derivative with respect to a suitable $L^{2}$-inner product.

Using this, we get that $$\frac{\mathrm{d}}{\mathrm{d}t}\Big\vert_{t=0}\mathcal{S}_{\mathrm{YM}}[A+t\alpha]=2\int_{\mathcal{M}}\,\langle\mathrm{d}_{A}\alpha_{\mathcal{M}},F_{\mathcal{M}}^{A}\rangle_{\mathrm{Ad}(P)}\,\mathrm{d}\mathrm{vol}_{g}=2\int_{\mathcal{M}}\,\langle\alpha_{\mathcal{M}},\mathrm{d}_{A}^{\ast}F_{\mathcal{M}}^{A}\rangle_{\mathrm{Ad}(P)}\,\mathrm{d}\mathrm{vol}_{g}\stackrel{!}{=}0.$$ Hence, since $\alpha$ is arbitrary and the inner product on $\mathfrak{g}$ is non-degenerate (a property inherited by $\langle\cdot,\cdot\rangle_{\mathrm{Ad}(P)}$), we get that $$\mathrm{d}^{\ast}_{A}F_{\mathcal{M}}^{A}=0.$$ Using the definition of the codifferential, this is equivalent to saying that $$\mathrm{d}_{A}\ast F_{\mathcal{M}}^{A}=0,$$ which is the well-known Yang-Mills equation.
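
To connect this with a coordinate computation, here is a sketch under extra assumptions that are not needed for the general argument: $\mathcal{M}=\mathbb{R}^{4}$ with the Euclidean metric, $P$ trivial, and $\alpha$ compactly supported. Write $A=A_{\mu}\,\mathrm{d}x^{\mu}$, $\alpha=\alpha_{\mu}\,\mathrm{d}x^{\mu}$ and $F=\frac{1}{2}F_{\mu\nu}\,\mathrm{d}x^{\mu}\wedge\mathrm{d}x^{\nu}$, so that $(\mathrm{d}_{A}\alpha)_{\mu\nu}=\partial_{\mu}\alpha_{\nu}-\partial_{\nu}\alpha_{\mu}+[A_{\mu},\alpha_{\nu}]-[A_{\nu},\alpha_{\mu}]$. Using the antisymmetry of $F_{\mu\nu}$, integration by parts, and the infinitesimal invariance $\langle[Z,X],Y\rangle_{\mathfrak{g}}=-\langle X,[Z,Y]\rangle_{\mathfrak{g}}$ (which follows from $\mathrm{Ad}$-invariance), one finds

$$\begin{aligned}\int_{\mathbb{R}^{4}}\langle\mathrm{d}_{A}\alpha,F\rangle\,\mathrm{d}^{4}x&=\frac{1}{2}\int_{\mathbb{R}^{4}}\sum_{\mu,\nu}\big\langle\partial_{\mu}\alpha_{\nu}-\partial_{\nu}\alpha_{\mu}+[A_{\mu},\alpha_{\nu}]-[A_{\nu},\alpha_{\mu}],F_{\mu\nu}\big\rangle_{\mathfrak{g}}\,\mathrm{d}^{4}x\\&=\int_{\mathbb{R}^{4}}\sum_{\mu,\nu}\big\langle\partial_{\mu}\alpha_{\nu}+[A_{\mu},\alpha_{\nu}],F_{\mu\nu}\big\rangle_{\mathfrak{g}}\,\mathrm{d}^{4}x\\&=-\int_{\mathbb{R}^{4}}\sum_{\mu,\nu}\big\langle\alpha_{\nu},\partial_{\mu}F_{\mu\nu}+[A_{\mu},F_{\mu\nu}]\big\rangle_{\mathfrak{g}}\,\mathrm{d}^{4}x.\end{aligned}$$

In this setting the codifferential therefore has components $(\mathrm{d}_{A}^{\ast}F)_{\nu}=-\sum_{\mu}(\partial_{\mu}F_{\mu\nu}+[A_{\mu},F_{\mu\nu}])$, and the Yang-Mills equation reads $\sum_{\mu}(\partial_{\mu}F_{\mu\nu}+[A_{\mu},F_{\mu\nu}])=0$ for each $\nu$, which is the component form stated in the question.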


As a short comment: It is a general fact that every connection $1$-form $A$ satisfies $$\mathrm{d}_{A}F_{\mathcal{M}}^{A}=0.$$ This is a particular form of the "Bianchi identity". Together with the Yang-Mills equation, it can be viewed as a generalization of Maxwell's equations of electrodynamics.
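
To make the comparison explicit, consider as an illustrative special case the abelian group $G=U(1)$ (an assumption made only for this example). Then all Lie brackets vanish, the adjoint bundle is trivial, and $\mathrm{d}_{A}$ reduces to the ordinary exterior derivative $\mathrm{d}$ on $\mathrm{Ad}(P)$-valued forms, so the curvature, the Bianchi identity and the Yang-Mills equation become $$F^{A}_{\mathcal{M}}=\mathrm{d}A,\qquad\mathrm{d}F^{A}_{\mathcal{M}}=0,\qquad\mathrm{d}\ast F^{A}_{\mathcal{M}}=0.$$ These are the source-free Maxwell equations written in terms of the field strength $2$-form: the first of the two equations contains the homogeneous pair, the second the inhomogeneous pair with vanishing charge and current.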
