Differential Geometry – Varying a Tensor with Respect to the Metric

Tags: differential-geometry, differentiation, metric-tensor, tensor-calculus, variational-calculus

Upon learning about the Lagrangian formulation of GR, where varying an action with respect to the metric (in order to, for instance, arrive at the Einstein field equations) is common, I can't help but wonder how this operation is even well defined. The reason is that you can write down a tensor in a number of different ways by raising and lowering its indices (or contracting them) with the metric itself, so that the same tensor can be written as any number of metric tensors multiplying another tensor with the appropriate combination of upstairs/downstairs indices. This means that varying a tensor with respect to the metric can give different answers depending on how you write down your tensor (in particular, on how many factors of the metric have been pulled out explicitly in the expression for the tensor).

Let me give a concrete example. Consider a rank (0,4) tensor $C_{\mu\nu\rho\sigma}$. We assume it satisfies the relation $$C_{\mu\nu\rho\sigma}=C_{\nu\mu\sigma\rho}\tag{1}$$ We can contract the first and third indices: $$C_{\nu\sigma}=g^{\mu\rho}C_{\mu\nu\rho\sigma}\tag{2}.$$ We can also form a scalar by contracting once more: $$C=g^{\nu\sigma}C_{\nu\sigma}\tag{3}.$$ Hence we also have $$C=g^{\nu\sigma}g^{\mu\rho}C_{\mu\nu\rho\sigma}\tag{4}$$

Now say we wish to calculate the expression $\frac{{\delta}C}{{\delta}g^{\alpha\beta}}$. If we substitute eq. 3 for $C$ and vary with respect to $g^{\alpha\beta}$ while keeping $C_{\nu\sigma}$ fixed, we get $$\frac{{\delta}C}{{\delta}g^{\alpha\beta}}=C_{(\alpha\beta)}\tag{5},$$ where parentheses denote that the tensor is symmetrized. However, consider what happens if we instead substitute eq. 4 for $C$ and vary it while keeping $C_{\mu\nu\rho\sigma}$ fixed. Then using the product rule (since we now have two factors of the metric, $g^{\nu\sigma}$ and $g^{\mu\rho}$), exploiting eq. 1, and contracting using eq. 2, we now get $$\frac{{\delta}C}{{\delta}g^{\alpha\beta}}=2C_{(\alpha\beta)}\tag{6}.$$ Clearly, eqs. 5 and 6 contradict each other. What's going on? How are you supposed to properly vary a tensor with respect to the metric?

The key would seem to be to figure out what stays fixed during the variation. But the tensor I chose is a very general tensor that may not have any explicit dependence on the metric at all (you may have noticed that the Riemann tensor could easily be a candidate for $C_{\mu\nu\rho\sigma}$, but since the Riemann tensor in GR typically has an explicit dependence on the metric due to our choice of affine connection, I'll ignore that easy option). If such an explicit dependence existed, then we could simply use it to vary everything directly, like we do when computing the variation of the Riemann tensor. But in general, how do you determine what should be held fixed during the variation of a tensor, even when the tensor has no explicit dependence on the metric?

Best Answer

The fact that you get a different result from a variation when you hold different quantities fixed is well known (as pointed out by @fqq in the comments). This is a feature of regular old Math-200 partial derivatives as well.
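To make this concrete for the example in the question, here is a sketch of both variations written out. It uses the question's eqs. 1 to 4 and assumes only that $\delta g^{\alpha\beta}$ is symmetric. Holding $C_{\nu\sigma}$ fixed in the question's eq. 3 gives

$$ \delta C = C_{\nu\sigma}\,\delta g^{\nu\sigma}, $$

whereas holding $C_{\mu\nu\rho\sigma}$ fixed in the question's eq. 4 gives

$$ \begin{aligned} \delta C &= g^{\mu\rho} C_{\mu\nu\rho\sigma}\,\delta g^{\nu\sigma} + g^{\nu\sigma} C_{\mu\nu\rho\sigma}\,\delta g^{\mu\rho} \\ &= C_{\nu\sigma}\,\delta g^{\nu\sigma} + g^{\nu\sigma} C_{\nu\mu\sigma\rho}\,\delta g^{\mu\rho} \\ &= C_{\nu\sigma}\,\delta g^{\nu\sigma} + C_{\mu\rho}\,\delta g^{\mu\rho} = 2\,C_{\alpha\beta}\,\delta g^{\alpha\beta}. \end{aligned} $$

The second line uses the question's eq. 1 and the third uses eq. 2 with relabelled indices. The extra term is just $g^{\nu\sigma}\,\delta C_{\nu\sigma}$ with $\delta C_{\nu\sigma} = C_{\mu\nu\rho\sigma}\,\delta g^{\mu\rho}$: holding the four-index tensor fixed forces its trace $C_{\nu\sigma}$ to vary with the metric. So the two derivatives answer different questions rather than contradicting each other.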

But as far as the calculus of variations goes, the resulting equations will be equivalent regardless of what we view as the "fundamental variables" of the system, and you can choose any set of "fundamental variables" you like, so long as they completely specify the configuration of the system. For example, suppose we have a Lagrangian involving the metric and a vector field $A^\mu$. We can construct the Euler-Lagrange equations as $$ \left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A^\mu} = 0 \qquad \left( \frac{\delta \mathcal{L}}{\delta A^\mu} \right)_{g_{\mu \nu}} = 0 \tag{1} $$ Alternatively, we could rewrite the Lagrangian in terms of a one-form field $A_\mu = g_{\mu \nu} A^\nu$ and view the metric $g_{\mu \nu}$ and $A_\mu$ as the "fundamental fields", in which case the Euler-Lagrange equations would be $$ \left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A_\mu} = 0 \qquad \left( \frac{\delta \mathcal{L}}{\delta A_\mu} \right)_{g_{\mu \nu}} = 0 \tag{2} $$ The resulting Euler-Lagrange equations (2) will not be identical to those found in (1), but they will be equivalent; in general, they can be written as linear combinations of each other. (Regular old Math-200 partial derivatives also have this feature.)
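As a sketch of why (1) and (2) are equivalent, introduce the abbreviations (used here only for illustration) $E^{\mu\nu} \equiv \left( \delta \mathcal{L} / \delta g_{\mu\nu} \right)_{A_\mu}$ and $F^{\rho} \equiv \left( \delta \mathcal{L} / \delta A_\rho \right)_{g_{\mu\nu}}$, and assume the change of variables $A_\rho = g_{\rho\sigma} A^\sigma$ is applied pointwise. Then

$$ \begin{aligned} \delta \mathcal{L} &= E^{\mu\nu}\,\delta g_{\mu\nu} + F^{\rho}\,\delta A_\rho, \qquad \delta A_\rho = A^\sigma\,\delta g_{\rho\sigma} + g_{\rho\sigma}\,\delta A^\sigma, \\ \delta \mathcal{L} &= \left[ E^{\mu\nu} + \tfrac{1}{2}\left( F^{\mu} A^{\nu} + F^{\nu} A^{\mu} \right) \right] \delta g_{\mu\nu} + g_{\rho\sigma} F^{\rho}\,\delta A^\sigma. \end{aligned} $$

Reading off the coefficients,

$$ \left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A^\mu} = E^{\mu\nu} + \tfrac{1}{2}\left( F^{\mu} A^{\nu} + F^{\nu} A^{\mu} \right), \qquad \left( \frac{\delta \mathcal{L}}{\delta A^\sigma} \right)_{g_{\mu \nu}} = g_{\rho\sigma} F^{\rho}. $$

Since $g_{\rho\sigma}$ is invertible, the equations in (1) hold exactly when $E^{\mu\nu} = 0$ and $F^\rho = 0$, which are the equations in (2).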

In the case you propose, however, viewing the singly-traced tensor $C_{\mu \nu}$ as a fundamental field would not yield a set of equations equivalent to those obtained by viewing $C_{\mu \nu \rho \sigma}$ as fundamental; the equations obtained by viewing $C_{\mu \nu}$ as fundamental would be an incomplete set. This is because there are multiple possible choices of $C_{\mu \nu \rho \sigma}$ that correspond to a single value of $C_{\mu \nu}$ (for a fixed metric). So specifying $C_{\mu \nu}$ does not specify the fields completely, meaning that the resulting Euler-Lagrange equations do not enforce stationarity of the action in some of those "directions in field space". It'd be like varying the quantity $$ A^2 = A_\mu A^\mu $$ with respect to the vector norm $|A| = \sqrt{A^2}$ instead of the vector itself. If you varied this quantity with respect to the norm, you'd find $2|A| = 0$, so that you had to have $A^2 = 0$; but if you varied it with respect to $A_\mu$ (holding the metric fixed), you'd find $2A^\mu = 0$, so that you had to have $A^\mu = 0$, a more restrictive condition.