General Relativity – Derivative of the Lagrangian with Respect to the Metric Tensor

Tags: field-theory, general-relativity, lagrangian-formalism, metric-tensor, stress-energy-momentum-tensor

I'm trying to calculate the derivative of the Lagrangian $$\mathcal{L}=\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi-\frac{1}{2}m^2\phi^2$$ with respect to the metric tensor $g_{\mu\nu}$, with the convention $(+, -, -, -)$, in order to obtain $$T^{\mu\nu}=-g^{\mu\nu}\mathcal{L}-2\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}.$$ First I tried to do it as follows:

$$\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=\frac{1}{2}\frac{\delta\left(g_{\alpha\beta}\partial^\alpha\phi\partial^\beta\phi\right)}{\delta g_{\mu\nu}}=\frac{1}{2}\frac{\delta g_{\alpha\beta}}{\delta g_{\mu\nu}}\partial^\alpha\phi\partial^\beta\phi=\frac{1}{4}\left(\delta_\alpha^\mu\delta_\beta^\nu+\delta_\alpha^\nu\delta_\beta^\mu\right)\partial^\alpha\phi\partial^\beta\phi=\frac{1}{2}\partial^\mu\phi\partial^\nu\phi.$$

Here I have used the formula

$$\frac{\delta g_{\alpha\beta}}{\delta g_{\mu\nu}}=\frac{1}{2}\left(\delta_\alpha^\mu\delta_\beta^\nu+\delta_\alpha^\nu\delta_\beta^\mu\right)$$

that my professor derived in class.

But, if I write $\partial_\mu\phi\partial^\mu\phi=g^{\alpha\beta}\partial_\alpha\phi\partial_\beta\phi$, I would use the expression for the derivative of the inverse metric:

$$\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}=-\frac{1}{2}\left(g^{\alpha\mu}g^{\beta\nu}+g^{\alpha\nu}g^{\beta\mu}\right)$$

This formula comes from the fact that $g^{\alpha\beta}g_{\beta\rho}=\delta^{\alpha}_{\rho}$, so

$$\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}g_{\beta\rho}+g^{\alpha\beta}\frac{\delta g_{\beta\rho}}{\delta g_{\mu\nu}}=0$$

which leads to

$$\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}g_{\beta\rho}=-\frac{1}{2}g^{\alpha\beta}\left(\delta^\mu_\beta\delta^\nu_\rho+\delta^\mu_\rho\delta^\nu_\beta\right)=-\frac{1}{2}\left(g^{\alpha\mu}\delta^{\nu}_{\rho}+g^{\alpha\nu}\delta^{\mu}_\rho\right)$$

multiplying by $g^{\rho\varphi}$, we get

$$\frac{\delta g^{\alpha\varphi}}{\delta g_{\mu\nu}}=-\frac{1}{2}\left(g^{\alpha\mu}g^{\varphi\nu}+g^{\alpha\nu}g^{\varphi\mu}\right)$$

and then, applying that, we would get

$$\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=-\frac{1}{2}\partial^\mu\phi\partial^\nu\phi$$

I would like to understand what's causing this problem. I suppose this question may be duplicated, but it's still not clear to me how to fix that sign error from the answers I've read. This is my first time studying GR and I would be grateful for any responses.

Best Answer

This is just elaborating a little more on the 'behind the scenes', since OP's confusions seem to be resolved in the comments already.


Consider the following simplified situation. Let $f_1,f_2:\Bbb{R}^2\to\Bbb{R}$ be two functions defined as $f_1(x,y)=x^2y^3$ and $f_2(x,y)=xy^2$. These are clearly two different functions. Consider now two curves, $\gamma_1,\gamma_2:\Bbb{R}\to\Bbb{R}^2$ defined as $\gamma_1(t)=(t,t)$ and $\gamma_2(t)=(t,t^2)$. Then, you can easily verify that the composed maps are equal: for all $t\in\Bbb{R}$, we have $(f_1\circ\gamma_1)(t)=(f_2\circ\gamma_2)(t)=t^5$. On the other hand, let us calculate their partial derivative: \begin{align} \frac{\partial f_1}{\partial x}\bigg|_{\gamma_1(t)}=2t^4,\quad\text{but}\quad\frac{\partial f_2}{\partial x}\bigg|_{\gamma_2(t)}=t^4. \end{align} This shouldn't be surprising: we started off with two different functions $f_1,f_2$, and we just happened to find two curves $\gamma_1,\gamma_2$ such that $f_1\circ\gamma_1=f_2\circ\gamma_2$. There's no reason to expect that this implies $\frac{\partial f_1}{\partial x}\circ \gamma_1= \frac{\partial f_2}{\partial x}\circ \gamma_2$, and in fact as shown above, this equality is false.
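As a quick sanity check of this toy example, here is a minimal Python sketch using exact rational arithmetic. The helper name `partial_x` and the sample values of $t$ are illustrative choices, not part of the original argument; the central difference is exact here only because both functions are polynomials of degree at most two in $x$.

```python
from fractions import Fraction

def f1(x, y):
    return x**2 * y**3

def f2(x, y):
    return x * y**2

def gamma1(t):
    return (t, t)

def gamma2(t):
    return (t, t**2)

def partial_x(f, point, h=Fraction(1, 1000)):
    """Central difference in x (hypothetical helper).

    Exact for functions that are polynomials of degree <= 2 in x,
    which covers both f1 and f2.
    """
    x, y = point
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

sample_ts = [Fraction(n, 2) for n in range(-4, 5)]

for t in sample_ts:
    # The compositions agree along the two curves ...
    assert f1(*gamma1(t)) == f2(*gamma2(t)) == t**5
    # ... but the partial derivatives evaluated on the curves do not.
    assert partial_x(f1, gamma1(t)) == 2 * t**4
    assert partial_x(f2, gamma2(t)) == t**4
```

The loop passes silently: equality of the compositions and inequality of the partial derivatives (for $t\neq 0$) hold simultaneously.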


How does this relate to the Lagrangian? Fix any smooth manifold $M$, and consider the mappings

  • $\mathscr{L}_1: \Gamma(T^0_2(M))\times \Gamma(T^1_0(M))\to C^{\infty}(M)$ defined as $\mathscr{L}_1(H,\xi)=H(\xi,\xi)=H_{ab}\xi^a\xi^b.$
  • $\mathscr{L}_2: \Gamma(T^2_0(M))\times \Gamma(T^0_1(M))\to C^{\infty}(M)$ defined as $\mathscr{L}_2(K,\omega)=K(\omega,\omega)=K^{ab}\omega_a\omega_b.$

In words, $\mathscr{L}_1$ eats a $(0,2)$-tensor field in its first slot, and a vector field (a $(1,0)$ tensor field) in its second slot, and it outputs a smooth function by contracting the tensor field and vector field completely; $\mathscr{L}_2$ does a similar thing (contraction) except it has a different domain. Now, without any doubt, $\mathscr{L}_1$ and $\mathscr{L}_2$ are completely different maps.

Now, let us fix a scalar field $\phi$ on $M$. We now get two induced mappings via composition, denoted $\mathcal{L}_1$ and $\mathcal{L}_2$, defined on the space of metric tensors and taking values in $C^{\infty}(M)$, such that

  • $\mathcal{L}_1[g]:= \mathscr{L}_1(g,\text{grad}_g(\phi)):=\mathscr{L}_1(g,g^{\sharp}(d\phi))=g_{ab}\partial^a\phi\partial^b\phi$.
  • $\mathcal{L}_2[g]:= \mathscr{L}_2(g^{``-1"}, d\phi)= g^{ab}\partial_a\phi\partial_b\phi$.

Here, $g^{\sharp}$ denotes the musical isomorphism which converts covector fields into vector fields (the index-raising operation), and $g^{``-1"}$ denotes the 'inverse' metric tensor (I put 'inverse' in quotation marks since a $(0,2)$ tensor field strictly speaking doesn't have an inverse; rather we refer to a corresponding $(2,0)$ tensor).

So you see, the composed functions $\mathcal{L}_1$ and $\mathcal{L}_2$ are equal. However, the variations $\frac{\delta \mathscr{L}_1}{\delta H}\bigg|_{(g,\text{grad}_g\phi)}$ and $\frac{\delta\mathscr{L}_2 }{\delta K}\bigg|_{(g^{``-1"}, d\phi)}$ are not equal.

Hopefully the analogy with the above simple case is clear: $\mathscr{L}_i$ is like $f_i$, the map $g\mapsto (g,\text{grad}_g(\phi))$ is like the curve $\gamma_1$, and the map $g\mapsto (g^{``-1"},d\phi)$ is like the curve $\gamma_2$, and it turns out their compositions are equal: $\mathcal{L}_1=\mathcal{L}_2$. But that doesn't mean the original maps are equal, nor does it imply that the compositions of the derivatives along these 'curves' are equal (remember that in variational calculus, we always perform the variation first, and only afterwards evaluate).
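The same point can be illustrated in finite dimensions. The sketch below is only an analogy under stated assumptions: a symmetric $2\times 2$ matrix `g` stands in for the metric at a single point, the pair `p` stands in for the components of $d\phi$, and `h` is an arbitrary symmetric perturbation $\delta g_{ab}$. All names and numerical values are hypothetical.

```python
from fractions import Fraction as F

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def apply(m, v):
    """Matrix-vector product (used here to raise an index with g^{-1})."""
    return tuple(sum(m[i][j] * v[j] for j in range(2)) for i in range(2))

def quad(m, v):
    """Full contraction m_ij v_i v_j; plays the role of both L_1 and L_2."""
    return sum(m[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

def shifted(m, h, eps):
    """Return m + eps * h."""
    return tuple(tuple(m[i][j] + eps * h[i][j] for j in range(2)) for i in range(2))

# Illustrative data (assumptions): an indefinite symmetric "metric",
# a covector standing in for d(phi), and a symmetric perturbation.
g = ((F(2), F(1)), (F(1), F(-1)))
p = (F(3), F(1))
h = ((F(1), F(2)), (F(2), F(5)))

def composed_1(metric):
    # L_1(g, grad_g phi): contract g_ab with the raised components.
    return quad(metric, apply(inv2(metric), p))

def composed_2(metric):
    # L_2(g^{-1}, d phi): contract g^ab with the fixed covector.
    return quad(inv2(metric), p)

# The two compositions agree as functions of the metric.
assert composed_1(g) == composed_2(g)

xi = apply(inv2(g), p)  # raised components, the analogue of partial^a phi

# Partial derivative of L_1 in its first slot, with xi frozen.
frozen_slot_derivative = quad(h, xi)

# Derivative of the composed map along h (central difference, O(eps^2) error).
eps = F(1, 10**6)
full_derivative = (composed_2(shifted(g, h, eps))
                   - composed_2(shifted(g, h, -eps))) / (2 * eps)

# The two differ by a sign, up to the finite-difference error.
assert abs(full_derivative + frozen_slot_derivative) < F(1, 10**6)
```

Freezing the raised components reproduces the $+$ sign of the first calculation in the question, while the derivative of the composed map, in which only the covector is held fixed, carries the $-$ sign of the second.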

So roughly speaking, your first calculation corresponds to $\mathscr{L}_1$, where we view $\partial^a\phi$ as the independent variables, whereas in the latter case we view $\partial_a\phi$ as the independent variables. In going from one to the other, factors of the metric appear. Lastly, in physics, we view the second situation as more 'fundamental', i.e. $\partial_a\phi$ is the basic quantity (after all, the exterior derivative $d\phi$ can be defined without any metric).
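To make the relation between the two calculations explicit, here is a sketch of the variation of the kinetic term written in the first form, but with $\partial_a\phi$ (rather than $\partial^a\phi$) held fixed, so that $\partial^\beta\phi=g^{\beta\gamma}\partial_\gamma\phi$ itself varies with the metric. It uses only the inverse-metric formula quoted in the question, contracted with a symmetric $\delta g_{\mu\nu}$:

$$\delta(\partial^\beta\phi)=\delta g^{\beta\gamma}\,\partial_\gamma\phi=-g^{\beta\mu}g^{\gamma\nu}\,\delta g_{\mu\nu}\,\partial_\gamma\phi=-g^{\beta\mu}\,\partial^\nu\phi\,\delta g_{\mu\nu},$$

$$\delta\left(g_{\alpha\beta}\partial^\alpha\phi\,\partial^\beta\phi\right)=\delta g_{\alpha\beta}\,\partial^\alpha\phi\,\partial^\beta\phi+2g_{\alpha\beta}\,\partial^\alpha\phi\,\delta(\partial^\beta\phi)=(1-2)\,\partial^\mu\phi\,\partial^\nu\phi\,\delta g_{\mu\nu}.$$

Including the overall factor of $\frac{1}{2}$ (the mass term has no metric dependence), this gives $\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=-\frac{1}{2}\partial^\mu\phi\,\partial^\nu\phi$, in agreement with the second calculation. The $+\frac{1}{2}\partial^\mu\phi\,\partial^\nu\phi$ of the first calculation is the first term alone, i.e. what remains when the metric dependence hidden in $\partial^\beta\phi$ is dropped.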
