Given $F_\mu=\partial_\mu\phi$, I need to find the functional derivative $\frac{\delta F}{\delta \phi}$. I am not familiar with the treatment of functional derivatives outside the context of finding Euler-Lagrange equations from the action and I'm struggling to apply various definitions of the functional derivative to this situation. What is the best way to proceed?
Field Theory – Calculating the Functional Derivative of $F_\mu=\partial_\mu\phi$ with Respect to $\phi$
field-theory, functional-derivatives, homework-and-exercises, variational-calculus
Related Solutions
Contrary to your claim near the end of your question, I claim that the time-derivative of the field is being treated as an "independent" argument of the Lagrangian. I'll try to convince you of this by showing you how this independence leads to everything working out the way you think it should. Some of the key points are at the end, so please read all the way through before you succumb to skepticism.
For the sake of simplicity, let's assume from the start that we are considering a classical theory of fields $\phi:\mathbb R^2\to\mathbb R$. Let $\mathcal F$ denote the set of admissible fields in this theory. We denote the first field argument with $t$ and the second argument with $x$, so we write $\phi(t,x)$ as usual.
Now let's turn to the Lagrangian. To describe it correctly, imagine holding the $x$ argument of a field in our theory fixed; this yields a real-valued function of a single real variable, $\phi(\cdot, x):\mathbb R\to\mathbb R$. Let $\mathcal G$ denote the set of such functions. Then the Lagrangian can be defined as a functional $L:\mathcal F\times\mathcal F\to\mathcal G$. In other words, it takes in two functions that map $\mathbb R^2\to\mathbb R$ and outputs a function that maps $\mathbb R\to\mathbb R$. We label the first argument suggestively by $\phi$ and the second argument suggestively by $\dot \phi$, but in principle one can evaluate $L$ on whatever fields $\phi$ and $\psi$ one chooses and write, for example, $L[\phi, \psi]$. I claim that the definitions of the relevant functional derivatives are as follows: \begin{align} \frac{\delta L}{\delta \phi(t,x)}[\phi,\dot\phi](t) &= \lim_{\epsilon\to 0}\frac{L[\phi+\epsilon\Delta_x,\dot\phi](t)-L[\phi,\dot\phi](t)}{\epsilon} \\ \frac{\delta L}{\delta \dot\phi(t,x)}[\phi,\dot\phi](t) &= \lim_{\epsilon\to 0}\frac{L[\phi,\dot\phi+\epsilon\Delta_x](t)-L[\phi,\dot\phi](t)}{\epsilon} \end{align} where I am using the notation \begin{align} \Delta_{x}(t,x') = \delta(x'-x) \end{align} Notice that this is essentially like taking partial derivatives, because we vary the arguments of $L$ independently.
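As a quick illustration of these definitions, consider a toy functional chosen here purely for simplicity (it is not taken from the question): $L[\phi,\psi](t) = \int dx'\,\phi(t,x')\,\psi(t,x')$. Varying each slot separately with the perturbation $\epsilon\Delta_x$ gives \begin{align} \frac{\delta L}{\delta \phi(t,x)}[\phi,\psi](t) &= \lim_{\epsilon\to 0}\frac{1}{\epsilon}\int dx'\,\epsilon\,\delta(x'-x)\,\psi(t,x') = \psi(t,x), \\ \frac{\delta L}{\delta \psi(t,x)}[\phi,\psi](t) &= \lim_{\epsilon\to 0}\frac{1}{\epsilon}\int dx'\,\phi(t,x')\,\epsilon\,\delta(x'-x) = \phi(t,x). \end{align} Neither result requires any relation between $\phi$ and $\psi$, which is the sense in which the two arguments of $L$ are independent.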
Now, suppose that we have a theory described by a Lagrangian density that is a local function of the field and its first derivatives. Then the Lagrangian density is defined as a function $\mathscr L:\mathbb R^3\to\mathbb R$, and, because we anticipate that we will be putting the values of the field and its derivatives into the arguments of the Lagrangian density, we label its three arguments with the symbols $\phi, \dot \phi, \phi'$. The symbols $\dot\phi$ and $\phi'$ are supposed to suggestively indicate that the arguments of the Lagrangian density are meant to be evaluated on the values of a field and its time and space derivative. This is, of course, a bit of an abuse of notation since $\phi$ is usually reserved as a symbol for the field, a function $\mathbb R^2\to\mathbb R$, not for the values of the field. But as long as we keep this abuse of notation in mind, we shouldn't get confused. Then we have \begin{align} L[\phi, \dot\phi](t) = \int dx' \,\mathscr L(\phi(t,x'), \dot\phi(t,x'), \phi'(t,x')) \end{align} Now let's apply the definitions of the functional derivatives above and see what we get. For one thing, we have \begin{align} \frac{\delta L}{\delta \dot\phi(t,x)}[\phi,\dot\phi](t) &= \lim_{\epsilon\to0}\frac{\int dx'\,\mathscr{L}(\phi(t,x'),\dot{\phi}(t,x')+\epsilon\delta(x'-x),\partial_{x'}\phi(t,x'))-\int dx'\,\mathscr{L}}{\epsilon} \\ &= \int dx'\,\frac{\partial\mathscr L}{\partial \dot\phi}(\phi(t,x'), \dot\phi(t,x'), \phi'(t,x'))\,\delta(x'-x) \\ &= \frac{\partial\mathscr L}{\partial \dot\phi}(\phi(t,x),\dot\phi(t,x),\phi'(t,x)) \end{align} which is exactly what you said you should get in your question. 
Similarly, I'll leave it to you to show that the definition above yields \begin{align} \frac{\delta L}{\delta \phi(t,x)}[\phi,\dot\phi](t) &= \frac{\partial\mathscr L}{\partial \phi}(\phi(t,x),\dot\phi(t,x),\phi'(t,x)) \\ &\hspace{2cm}-\frac{\partial}{\partial x}\left[\frac{\partial\mathscr L}{\partial \phi'}(\phi(t,x),\dot\phi(t,x),\phi'(t,x))\right] \end{align} or, if we relax the notation a bit since we know what we're doing now, we can summarize this as \begin{align} \frac{\delta L}{\delta \dot\phi} = \frac{\partial\mathscr L}{\partial\dot\phi}, \qquad \frac{\delta L}{\delta \phi} = \frac{\partial \mathscr L}{\partial \phi} - \frac{\partial}{\partial x}\frac{\partial \mathscr L}{\partial \phi'} \end{align} Now, suppose that we want to obtain the Euler-Lagrange equations. For this, we define the action for our theory as a function $S:\mathcal F\to\mathbb R$ as follows: \begin{align} S[\phi]=\int dt \,L[\phi, \dot\phi](t) \end{align} Notice that here, the symbol $\dot\phi$ does denote the partial time-derivative of the field $\phi$, namely $\dot\phi = \partial_t\phi$. The key point here is that even though the arguments of the Lagrangian are independent, we always have the freedom to evaluate the arguments on a field and its derivative which are certainly not independent. In particular, this means that if we vary the action, then in the integral on the right hand side, we can perform the sort of integration by parts that you were worried we wouldn't be able to do. 
In fact, if you vary the action, then you'll find that \begin{align} \delta S[\phi] &= \int dt\,dx\,\left[\frac{\delta L}{\delta\phi} - \frac{\partial}{\partial t}\frac{\delta L}{\delta\dot\phi}\right]\delta\phi \end{align} so requiring the variation to vanish for arbitrary $\delta\phi$, and substituting the results derived above from the claimed definitions of the partial functional derivatives, we obtain the standard Euler-Lagrange equation \begin{align} \frac{\partial \mathscr L}{\partial \phi} -\frac{\partial}{\partial t}\frac{\partial \mathscr L}{\partial \dot\phi} - \frac{\partial}{\partial x}\frac{\partial \mathscr L}{\partial \phi'}=0. \end{align}
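To see these formulas at work, here is a short worked example. The Lagrangian density below, a free scalar field in one space dimension with a mass parameter $m$, is an assumption introduced for illustration and does not appear in the question: \begin{align} \mathscr L(\phi,\dot\phi,\phi') = \tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}\phi'^2 - \tfrac{1}{2}m^2\phi^2. \end{align} The partial derivatives are $\partial\mathscr L/\partial\dot\phi = \dot\phi$, $\partial\mathscr L/\partial\phi' = -\phi'$ and $\partial\mathscr L/\partial\phi = -m^2\phi$, so the functional derivatives of $L$ are \begin{align} \frac{\delta L}{\delta\dot\phi} = \dot\phi, \qquad \frac{\delta L}{\delta\phi} = -m^2\phi + \frac{\partial^2\phi}{\partial x^2}. \end{align} Inserting these into the Euler-Lagrange equation gives \begin{align} -m^2\phi - \frac{\partial^2\phi}{\partial t^2} + \frac{\partial^2\phi}{\partial x^2} = 0, \end{align} which is the Klein-Gordon equation in $1+1$ dimensions, as one would expect for this choice of $\mathscr L$.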
Yes, OP is right. In the field-theoretic case, the partial derivatives in OP's first formula (1) should be replaced with functional derivatives
$$ \delta S~=~\int_{t_1}^{t_2}\!\mathrm{d}t\left(\frac{\delta L}{\delta q}~\delta q+\left. \frac{\delta L}{\delta v}\right|_{v=\dot{q}}~\delta \dot{q}\right),\tag{1'}$$
where the Lagrangian
$$L[q(\cdot,t),v(\cdot,t);t]~=\int \! \mathrm d^3x~ {\cal L}(q(x,t),v(x,t), ~\partial_x q(x,t), \partial_x v(x,t),~\ldots , t) $$
is a functional. The ellipsis $\ldots$ indicates possible dependence on higher-order derivatives. See my other Phys.SE answers for further details.
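As a sketch of how (1') leads to field equations, assume that the spatial integration is implicit in (1') and that $\delta q$ vanishes at $t_1$ and $t_2$ (both are assumptions made here, not statements from the answer). Writing $\delta\dot q = \partial_t\,\delta q$ and integrating by parts in $t$ gives
$$ \delta S~=~\int_{t_1}^{t_2}\!\mathrm{d}t\int \! \mathrm d^3x\left(\frac{\delta L}{\delta q(x,t)}-\frac{\mathrm d}{\mathrm dt}\left. \frac{\delta L}{\delta v(x,t)}\right|_{v=\dot{q}}\right)\delta q(x,t)~+~\left[\int \! \mathrm d^3x \left. \frac{\delta L}{\delta v(x,t)}\right|_{v=\dot{q}}\delta q(x,t)\right]_{t_1}^{t_2}.$$
The boundary term vanishes under the stated assumption, so stationarity of $S$ for arbitrary $\delta q$ yields
$$ \frac{\delta L}{\delta q(x,t)}-\frac{\mathrm d}{\mathrm dt}\left. \frac{\delta L}{\delta v(x,t)}\right|_{v=\dot{q}}~=~0.$$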
Best Answer
I trust you know that, in $d$ dimensions, $$\frac{\delta \phi(x)}{\delta \phi(y)} = \delta^{(d)}(x-y).$$
This comes from the fact that given a function $f(x)$, we can treat it as if it were a functional by noticing that $$f(x) = \int f(y) \delta^{(d)}(x-y) \ \mathrm{d}^dy \tag{1}$$ and proceeding with the usual rules.
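As a consistency check, the same identity follows from the usual limit definition of the functional derivative, in which the field is perturbed by $\epsilon$ times a delta function centred at $y$ (this definition is assumed here; it is not spelled out in the answer): $$\frac{\delta \phi(x)}{\delta \phi(y)} = \lim_{\epsilon\to 0}\frac{\left[\phi(x)+\epsilon\,\delta^{(d)}(x-y)\right]-\phi(x)}{\epsilon} = \delta^{(d)}(x-y).$$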
Since most of the remaining work has already been done by OP in the comments, I'll complete the answer. Notice we can write $$F_\mu(x) = \int \partial^y_\mu \phi(y) \delta^{(d)}(x-y) \ \mathrm{d}^d y,$$ where $\partial^y_\mu$ denotes the derivative with respect to $y$. Integrating by parts, we see that \begin{align} F_\mu(x) &= \int \partial^y_\mu \phi(y) \delta^{(d)}(x-y) \ \mathrm{d}^d y \\ &= \oint_{\text{surface}} \phi(y) \delta^{(d)}(x-y) \ \mathrm{d}S_\mu - \int \phi(y) \partial^y_\mu\delta^{(d)}(x-y) \ \mathrm{d}^d y, \end{align} where we wrote the volume integral of the gradient as a surface integral. Since the original integral was over all of space, the surface over which we are integrating is at infinity, away from $y=x$. Since the delta vanishes everywhere apart from $y=x$, the surface term drops out and we get
\begin{align} F_\mu(x) &= - \int \phi(y) \partial^y_\mu\delta^{(d)}(x-y) \ \mathrm{d}^d y \\ &= \int \phi(y) \partial^x_\mu\delta^{(d)}(x-y) \ \mathrm{d}^d y, \end{align} where we used the fact that $\delta^{(d)}(x-y)$ depends only on the difference $x-y$, so that by the chain rule $\partial^x_\mu\delta^{(d)}(x-y) = - \partial^y_\mu\delta^{(d)}(x-y)$. Now we see that \begin{align} \frac{\delta F_\mu(x)}{\delta \phi(z)} &= \int \frac{\delta \phi(y)}{\delta \phi(z)} \partial^x_\mu\delta^{(d)}(x-y) \ \mathrm{d}^d y \\ &= \int \delta^{(d)}(y - z) \partial^x_\mu\delta^{(d)}(x-y) \ \mathrm{d}^d y \\ &= \partial^x_\mu\delta^{(d)}(x-z). \end{align}
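One way to check this result is to smear it against an arbitrary variation $\eta(z)$ of the field (the symbol $\eta$ is introduced here for illustration, and is assumed to be smooth and to fall off at infinity). The induced first-order change in $F_\mu$ is \begin{align} \delta F_\mu(x) &= \int \frac{\delta F_\mu(x)}{\delta \phi(z)}\, \eta(z) \ \mathrm{d}^d z \\ &= \int \partial^x_\mu\delta^{(d)}(x-z)\, \eta(z) \ \mathrm{d}^d z \\ &= \partial^x_\mu \int \delta^{(d)}(x-z)\, \eta(z) \ \mathrm{d}^d z = \partial_\mu \eta(x), \end{align} which agrees with what one gets by substituting $\phi \to \phi + \eta$ directly into $F_\mu = \partial_\mu\phi$.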
Using different arguments $x$ and $z$ in the functional derivative keeps the result general: setting $x=z$ would be like considering $\delta^{(d)}(0)$ instead of the more general $\delta^{(d)}(x-z)$. That special case is less general and can lead to divergences when the functional derivative turns out to be a distribution, as is often the case.