Mathematical Physics – Real K-G Equation as EOM for Given Action

functional-analysis, mathematical physics, proof-explanation, quantum-field-theory, tensors

This question concerns a real scalar field.

Show that the real Klein-Gordon equation, $(\Box + m^2)\phi=0$, is the equation of motion, $\delta S[\phi(x)]/\delta\phi(x)=0$, for the action

$$S=\frac12\int d^4x\left(\partial^\mu{\phi}\partial_\mu{\phi}-m^2\phi^2\right)$$
by performing a functional variation. Here $\mu = 0,1,2,3$ ($3+1$ space-time dimensions).


The Lagrangian density is
$$\mathcal{L}=\mathcal{L}\left(\phi,\ \partial^\mu{\phi},\ \partial_\mu{\phi}\right)=\partial^\mu{\phi}\partial_\mu{\phi}-m^2\phi^2\tag{A}$$

Now, since the functional derivative vanishes, $$\frac{\delta S[\phi(x)]}{\delta\phi(x)}=\lim_{\delta \phi\to 0}\left(\frac{S(\phi + \delta\phi)-S(\phi)}{\delta\phi}\right)=0,$$
it follows that $S(\phi + \delta\phi)-S(\phi)$ must vanish to first order in $\delta\phi$ for any $\delta\phi$.

The variation in the action $S$ is therefore
$$\delta{S[\phi]}=\frac12\int d^4x\left[\mathcal{L}\left(\phi+\delta\phi,\ \partial^\mu{\phi}+\delta\left(\partial^\mu{\phi}\right),\ \partial_\mu{\phi}+\delta(\partial_\mu{\phi})\right)-\mathcal{L}\left(\phi,\ \partial^\mu{\phi},\ \partial_\mu{\phi}\right)\right]$$ which, after Taylor expanding to first order in $\delta\phi$, becomes
$$\delta{S[\phi(x)]}$$
$$=\int \frac{d^4x}{2} \left[\mathcal{L}\left(\phi, \partial^\mu{\phi}, \partial_\mu{\phi}\right)+\frac{\partial\mathcal{L}}{\partial \phi}\delta\phi+\frac{\partial\mathcal{L}}{\partial\left(\partial^\mu\phi\right)}\delta\left(\partial^\mu\phi\right)+\frac{\partial\mathcal{L}}{\partial\left(\partial_\mu\phi\right)}\delta\left(\partial_\mu\phi\right)-\mathcal{L}\left(\phi, \partial^\mu{\phi}, \partial_\mu{\phi}\right)\right]$$

$$=\frac12\int d^4x\left[-2m^2\phi\delta\phi+\partial_\mu{\phi}\delta\left(\partial^\mu\phi\right)+\partial^\mu{\phi}\delta\left(\partial_\mu\phi\right)\right]\tag{B}$$


This is as far as I can get and matches the first line of the solution. I will typeset this solution in exactly the same way as the author did to illustrate my confusion:

Take the action

$$S=\frac12\int d^4x\left(\partial^\mu{\phi}\partial_\mu{\phi}-m^2\phi^2\right)$$

Now we vary it;
$$\delta{S} = \frac12\int d^4x\left[\left(\delta\partial^\mu\phi\right)\partial_\mu{\phi}+\partial^\mu{\phi}\delta\left(\partial_\mu\phi\right)-2m^2\phi\delta\phi\right]\tag{1}$$
$$= \frac12\int d^4x\left[\color{#085}{\left(\partial^\mu\delta\phi\right)\partial_\mu{\phi}}+\partial^\mu{\phi}\left(\partial_\mu\delta\phi\right)-2m^2\phi\delta\phi\right]\tag{2}$$
$$= \int d^4x\left[\partial^\mu\phi\partial_\mu{\delta\phi}-m^2\phi\delta\phi\right]\tag{3}$$
$$=\int d^4x \left[-\partial_\mu (\partial^\mu\phi)\delta\phi-m^2\phi\delta\phi\right]+\int d^4x \partial_{\mu}\left(\delta\phi\partial^\mu\phi\right)\tag{4}$$
and we drop the last term, the total divergence due to boundary conditions, so,
$$\delta{S}=-\int d^4x \left(\partial^2\phi+m^2\phi\right)\delta\phi\tag{5}$$


Was it correct to have 3 arguments for the Lagrangian density in $(\mathrm{A})$, namely to distinguish between the contravariant and covariant derivatives?

What is the justification for commuting the $\delta$ past the derivative in going from $(1)$ to $(2)$?

How did the author get from $(2)$ to $(3)$? It's almost as if the first term in the integrand of $(2)$ (marked green) has been forgotten about.

To go from eqn. $(3)$ to $(4)$ I think integration by parts has been used to factor the $\delta\phi$ out, so integrating the first term of $(3)$ by parts gives $$\int\partial^\mu{\phi}\left(\partial_\mu\delta\phi\right)d^4x=\color{blue}{\left[\partial^\mu{\phi}\int\partial_\mu\left(\delta\phi\right) d^4x \right]}-\int\bigg(\partial_\mu(\partial^\mu\phi)\delta\phi\bigg) d^4x$$
This explains the negative sign for the first term in $(4)$, but why does the boundary term (marked blue) not match the final term of equation $(4)$?

I don't really understand why the final term of $(4)$ can be dropped, but a more pressing question I have is how $(5)$ was deduced from $(4)$.

Put another way, it was my understanding that the covariant derivative is such that $$\partial_\mu\equiv\frac{\partial}{\partial x^\mu}\tag{C}$$
and the contravariant derivative is defined as
$$\partial^\mu\equiv\frac{\partial}{\partial x_\mu}\tag{D}$$
but the way it's written in going from $(4)$ to $(5)$ suggests that
$$\partial_\mu(\partial^\mu\phi)\stackrel{\color{red}{\mathrm{?}}}{=}\partial^2\phi$$
But how can this possibly be true? The $\mu$ index is summed over in accordance with the Einstein summation convention, since it is a repeated (dummy) index. But by virtue of $(\mathrm{C})$ and $(\mathrm{D})$, without the $\mu$ index $\partial^2\phi$ does not tell me which variable the field $\phi$ is being differentiated with respect to.


Update:

I've been given a good answer that addresses most of my questions nicely. The only part that still puzzles me is how the author was able to immediately write down eqn. $(1)$ in the solution. Is it blatantly obvious that the functional variation is $\delta{S} = \frac12\int d^4x\left[\left(\delta\partial^\mu\phi\right)\partial_\mu{\phi}+\partial^\mu{\phi}\delta\left(\partial_\mu\phi\right)-2m^2\phi\delta\phi\right]$? I had to go through several lines of logical reasoning to justify that equation, including a Taylor expansion.


Closing remarks

The bounty period is nearly over, and I have since left further comments below one of the answers, so I know there may not be time to address them before the bounty ends. I'll award the bounty regardless of whether I get a reply to those comments.

I would just like to say a massive thanks to all those who took the time and effort to write such great answers. I know I haven't been very good at keeping on top of this question; I just wish I had more time.

Best Answer

We will use the following definition of functional differentiation: \begin{equation} \frac{\delta S}{\delta\phi(y)}=\lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(S[\phi+\epsilon\delta^{(4)}(x-y)]-S[\phi]\right), \end{equation} where $\delta^{(4)}(x-y)$ is the Dirac delta function.

Since
\begin{equation} S[\phi]=\frac{1}{2}\int d^4x\left(\partial^\mu{\phi}\partial_\mu{\phi}-m^2\phi^2\right), \end{equation}
we have
\begin{align} S[\phi+\epsilon\delta^{(4)}(x-y)]&=\frac{1}{2}\int d^4x[\partial^\mu\left(\phi+\epsilon\delta^{(4)}(x-y)\right)\partial_\mu\left(\phi+\epsilon\delta^{(4)}(x-y)\right)\\ &-m^2\left(\phi+\epsilon\delta^{(4)}(x-y)\right)^2]\\ &=\frac{1}{2}\int d^4x[\left(\partial^\mu\phi+\epsilon\partial^\mu\delta^{(4)}(x-y)\right)\left(\partial_\mu\phi+\epsilon\partial_\mu\delta^{(4)}(x-y)\right)\\ &-m^2\left(\phi+\epsilon\delta^{(4)}(x-y)\right)^2]\\ &=S[\phi]+\frac{\epsilon}{2}\int d^4x[(\partial^\mu\phi)\partial_\mu\delta^{(4)}(x-y)+(\partial^\mu\delta^{(4)}(x-y))\partial_\mu\phi\\ &-2m^2\phi\delta^{(4)}(x-y)]+\mathcal{O}(\epsilon^2) \end{align}
Since $X^\mu Y_\mu=Y^\mu X_\mu$, we have $(\partial^\mu\delta^{(4)}(x-y))\partial_\mu\phi=(\partial^\mu\phi)\partial_\mu\delta^{(4)}(x-y)$. Therefore,
\begin{align} S[\phi+\epsilon\delta^{(4)}(x-y)]&=S[\phi]+\epsilon\int d^4x[(\partial^\mu\phi)\partial_\mu\delta^{(4)}(x-y)-m^2\phi\delta^{(4)}(x-y)]+\mathcal{O}(\epsilon^2). \end{align}
Integrating by parts, we find
\begin{align} \int d^4x(\partial^\mu\phi)\partial_\mu\delta^{(4)}(x-y)=\int d^4x\partial_\mu((\partial^\mu\phi)\delta^{(4)}(x-y))-\int d^4x(\Box\phi)\delta^{(4)}(x-y), \end{align}
where we have used the definition $\partial_\mu\partial^\mu=\Box$.
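As a hedged aside, the two steps just used can be spelled out. This is a sketch that assumes a constant flat-space metric $\eta_{\mu\nu}$ for raising and lowering indices (the answer does not state this explicitly), and $f$ is merely shorthand introduced here for $\delta^{(4)}(x-y)$. The index symmetry follows from
$$X^\mu Y_\mu=\eta_{\mu\nu}X^\mu Y^\nu=X_\nu Y^\nu=Y^\mu X_\mu,$$
and the integration by parts is just the product rule rearranged:
$$\partial_\mu\big((\partial^\mu\phi)f\big)=(\partial_\mu\partial^\mu\phi)f+(\partial^\mu\phi)\partial_\mu f\quad\Longrightarrow\quad(\partial^\mu\phi)\partial_\mu f=\partial_\mu\big((\partial^\mu\phi)f\big)-(\Box\phi)f.$$
Integrating the last identity over $d^4x$ reproduces the equation above term by term.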

Discarding boundary terms, we find
\begin{align} S[\phi+\epsilon\delta^{(4)}(x-y)]&=S[\phi]-\epsilon\int d^4x(\Box\phi+m^2\phi)\delta^{(4)}(x-y)+\mathcal{O}(\epsilon^2)\\ &=S[\phi]-\epsilon(\Box\phi(y)+m^2\phi(y))+\mathcal{O}(\epsilon^2). \end{align}
Substituting this into the definition of functional differentiation, we get
\begin{equation} \frac{\delta S}{\delta\phi(y)}=\lim_{\epsilon\to 0}\left(-(\Box\phi(y)+m^2\phi(y))+\mathcal{O}(\epsilon)\right)=-(\Box\phi(y)+m^2\phi(y)). \end{equation}
Finally, by using the condition
\begin{equation} \frac{\delta S}{\delta\phi(y)}=0, \end{equation}
we obtain the following equation of motion
\begin{equation} \Box\phi+m^2\phi=0. \end{equation}
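To make the discarded boundary term and the operator $\Box$ more concrete, here is a sketch under stated assumptions: $\Omega$ is a hypothetical integration region containing $y$ in its interior, $d\Sigma_\mu$ is its surface element, and the metric signature is taken to be $(+,-,-,-)$ with $c=1$ (an assumption, chosen because it matches the signs in the action). By the divergence theorem,
$$\int_\Omega d^4x\,\partial_\mu\big((\partial^\mu\phi)\delta^{(4)}(x-y)\big)=\oint_{\partial\Omega}d\Sigma_\mu\,(\partial^\mu\phi)\,\delta^{(4)}(x-y)=0,$$
since $\delta^{(4)}(x-y)$ vanishes everywhere on $\partial\Omega$ when $y$ is an interior point. With the assumed signature, $\partial_\mu=(\partial_t,\nabla)$ and $\partial^\mu=(\partial_t,-\nabla)$, so the summed index gives
$$\Box\phi=\partial_\mu\partial^\mu\phi=\frac{\partial^2\phi}{\partial t^2}-\nabla^2\phi,\qquad\text{and the equation of motion reads}\qquad\frac{\partial^2\phi}{\partial t^2}-\nabla^2\phi+m^2\phi=0.$$
Written this way, $\Box\phi$ (often denoted $\partial^2\phi$) carries no free index because $\mu$ has been summed over all four coordinates.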

Addressing the question in the update of your post.

My answer is no, it is not blatantly obvious. For this reason, let me make the following identification [see Section 2.3, Functional Derivatives, in Greiner & Reinhardt, Field Quantization]:
$$\delta\phi(x)=\epsilon\delta^{(4)}(x-y).$$
Having done that, note that
$$\delta\left(\partial^\mu\phi(x)\right)=\partial^\mu\phi^\prime(x)-\partial^\mu\phi(x)=\partial^\mu\left(\phi^\prime(x)-\phi(x)\right)=\partial^\mu\left(\delta\phi(x)\right).$$
Therefore,
$$\delta\left(\partial^\mu\phi(x)\right)=\epsilon\partial^\mu\delta^{(4)}(x-y).$$
Now we have everything necessary for the following demonstration:
$$\delta S[\phi]=S[\phi+\delta\phi]-S[\phi]=S[\phi+\epsilon\delta^{(4)}(x-y)]-S[\phi].$$
We have already calculated $S[\phi(x)+\epsilon\delta^{(4)}(x-y)]-S[\phi]$ above. Therefore, we get
\begin{align} \delta S[\phi]&=\frac{\epsilon}{2}\int d^4x[(\partial^\mu\phi)\partial_\mu\delta^{(4)}(x-y)+(\partial^\mu\delta^{(4)}(x-y))\partial_\mu\phi\\ &-2m^2\phi\delta^{(4)}(x-y)]+\mathcal{O}(\epsilon^2)\\ \Rightarrow \delta S[\phi]&=\frac{1}{2}\int d^4x[(\partial^\mu\phi)\delta(\partial_\mu\phi)+\delta(\partial^\mu\phi)\partial_\mu\phi\\ &-2m^2\phi\delta\phi]+\mathcal{O}\left((\delta\phi)^2\right), \end{align}
where we have used $\delta\phi(x)=\epsilon\delta^{(4)}(x-y)$ and $\delta\left(\partial^\mu\phi(x)\right)=\epsilon\partial^\mu\delta^{(4)}(x-y)$.
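As a cross-check that is not part of the derivation above, the same equation of motion should follow from the standard Euler-Lagrange equation for a Lagrangian density depending on $\phi$ and its first derivatives. This sketch assumes the factor $\frac12$ is absorbed into $\mathcal{L}$ (unlike eqn. $(\mathrm{A})$ in the question) and treats $\partial_\mu\phi$ as the only independent derivative variable, writing $\partial^\mu\phi=\eta^{\mu\nu}\partial_\nu\phi$ with a constant metric:
$$\mathcal{L}=\frac12\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi-\frac12m^2\phi^2,\qquad\frac{\partial\mathcal{L}}{\partial\phi}=-m^2\phi,\qquad\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}=\partial^\mu\phi,$$
$$\frac{\partial\mathcal{L}}{\partial\phi}-\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)=-m^2\phi-\Box\phi=0.$$
This agrees with $\delta S/\delta\phi(y)=-(\Box\phi(y)+m^2\phi(y))$ found above, and it suggests that listing $\partial^\mu\phi$ and $\partial_\mu\phi$ as separate arguments is harmless but not required, since one is fixed by the other through the metric.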