Details of using flat metric to raise/lower indices in linearized GR. I’m getting first order discrepancies

Tags: covariance, general-relativity, linearized-theory, metric-tensor, perturbation-theory

This question is about the use of the unperturbed (Minkowski) metric $\eta_{\mu\nu}$ (and its inverse $\eta^{\mu\nu}$) to raise and lower indices in linearized gravity. There are already several questions on Physics SE about this (here is one), but none that go into the specific details I am wondering about.

Background

We write the metric $g_{\mu\nu}$ as a flat Minkowski metric $\eta_{\mu\nu}$ plus a small (in some sense) perturbation $h_{\mu\nu}$:

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}\tag{1}$$

To first order in $h_{\mu\nu}$ we find that this has inverse

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu},\tag{2}$$

where $$h^{\mu\nu} \equiv \eta^{\mu\rho} \eta^{\nu\sigma} h_{\rho\sigma}.\tag{3}$$
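
As a quick check that (2) inverts (1) to first order, here is a sketch that uses only definition (3) and $\eta^{\mu\rho}\eta_{\rho\nu} = \delta^\mu_\nu$ (the dummy index names are chosen here for illustration):

\begin{align}
g^{\mu\rho} g_{\rho\nu} &= (\eta^{\mu\rho} - h^{\mu\rho})(\eta_{\rho\nu} + h_{\rho\nu})\\
&= \delta^\mu_\nu + \eta^{\mu\rho} h_{\rho\nu} - h^{\mu\rho} \eta_{\rho\nu} - h^{\mu\rho} h_{\rho\nu}\\
&= \delta^\mu_\nu - h^{\mu\rho} h_{\rho\nu}.
\end{align}

The two first-order terms cancel because, by (3), $h^{\mu\rho} \eta_{\rho\nu} = \eta^{\mu\alpha} \eta^{\rho\beta} h_{\alpha\beta} \eta_{\rho\nu} = \eta^{\mu\alpha} h_{\alpha\nu}$. What remains beyond the identity is quadratic in $h_{\mu\nu}$.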

Then it is asserted that we can use the Minkowski metric instead of the full perturbed metric to raise and lower indices of tensors. The standard argument (see for example Carroll) seems to be that the discrepancy between using the Minkowski and the full metric for this purpose will be at least second order in $h_{\mu\nu}$. But when I try to make this statement precise I get first order discrepancies.

My attempt

Some sources make the smallness of the perturbation a bit more precise by introducing a perturbation parameter $\lambda$ and writing

$$g_{\mu\nu} = \eta_{\mu\nu} + \lambda h_{\mu\nu},\tag{4}$$

so let us do that as well. Suppose we have a vector quantity $A^\mu$ (though it could equally well be a general tensor quantity) that somehow depends on the metric, and which in flat space takes the form $A_{(0)}^\mu$. In a perturbation series we will then write $A^\mu$ as

$$A^\mu = A_{(0)}^\mu + \lambda A_{(1)}^\mu + \lambda^2 A_{(2)}^\mu + \dotsb.\tag{5}$$

Using the full metric to lower the index, one obtains

\begin{align}
g_{\mu\nu} A^\nu &= (\eta_{\mu\nu} + \lambda h_{\mu\nu}) \left(A_{(0)}^\nu + \lambda A_{(1)}^\nu + O(\lambda^2)\right)\\
&= \eta_{\mu\nu} A_{(0)}^\nu + \lambda \left(\eta_{\mu\nu} A_{(1)}^\nu + h_{\mu\nu} A_{(0)}^\nu \right) + O(\lambda^2)\tag{6}
\end{align}

while if one instead uses the flat metric the result is

$$\eta_{\mu\nu} A^\nu = \eta_{\mu\nu} \left(A_{(0)}^\nu + \lambda A_{(1)}^\nu + O(\lambda^2)\right) = \eta_{\mu\nu} A_{(0)}^\nu + \lambda \eta_{\mu\nu} A_{(1)}^\nu + O(\lambda^2).\tag{7}$$

Hence it seems we have a first order discrepancy of $h_{\mu\nu} A_{(0)}^\nu$ between the two approaches.

So what am I missing? Am I doing the perturbation series wrong? Or am I misunderstanding the basic argument that I am trying to make precise?

Comment

I know there is a different approach to this (see for example Wald) where you define lowered and raised indices as denoting a contraction with the flat metric (or its inverse), but this is not what I am trying to understand. I am trying to understand the argument given above (under Background), that discrepancies should somehow be at least second order in the perturbation.

Best Answer

With the help of user octonion in the comments, this seems to have been resolved. Carroll writes:

In fact, we can use the Minkowski metric to raise and lower indices on an object of any definite order in $h$, since the corrections would only contribute at higher orders.

I did not quite know how to interpret that statement initially. But if we introduce the perturbation as

$$g_{\mu\nu} = \eta_{\mu\nu} + \lambda h_{\mu\nu},$$

and take "an object of any definite order in $h$" to mean something that is directly proportional to a positive integer power of $\lambda$, then everything makes sense. It is just saying that if, say, $A^\mu = \lambda B^\mu$, then

$$A_\mu = g_{\mu\nu} A^\nu = (\eta_{\mu\nu} + \lambda h_{\mu\nu}) \lambda B^\nu = \eta_{\mu\nu} \lambda B^\nu + O(\lambda^2) = \eta_{\mu\nu} A^\nu + O(\lambda^2),$$

and so on.
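
The general pattern can be sketched as follows, where the integer $n \ge 0$ and the assumption that $B^\mu$ is of order $\lambda^0$ are introduced here for illustration and are not part of Carroll's statement. For $A^\mu = \lambda^n B^\mu$,

$$g_{\mu\nu} A^\nu - \eta_{\mu\nu} A^\nu = \lambda h_{\mu\nu} A^\nu = \lambda^{n+1} h_{\mu\nu} B^\nu.$$

The discrepancy is always one order higher than the leading order of $A^\mu$ itself. For $n = 1$ it is second order and can be dropped in the linearized theory. For $n = 0$, i.e. an object with a nonvanishing flat-space part $A_{(0)}^\mu$, it is precisely the first-order term $\lambda h_{\mu\nu} A_{(0)}^\nu$ obtained in the question by comparing (6) and (7).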

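For example, taking $\partial^\mu \equiv g^{\mu\rho} \partial_\rho$ and working in coordinates where the components of $\eta_{\nu\sigma}$ are constant,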

$$\partial^\mu g_{\nu\sigma} = \partial^\mu (\eta_{\nu\sigma} + \lambda h_{\nu\sigma}) = \lambda \partial^\mu h_{\nu\sigma} = \lambda \eta^{\mu\rho} \partial_\rho h_{\nu\sigma} + O(\lambda^2),$$

which explains how the Christoffel symbols get their form in linearized gravity, and also manifestly shows that they are first order in the perturbation. Note that this last computation is much less obvious if one introduces the perturbation without the use of $\lambda$ (i.e. by letting $h_{\mu\nu}$ itself be "small"), because then there is no parameter that can simply be moved out of the derivative.
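
To spell out the remark about the Christoffel symbols, here is a sketch. It assumes the Levi-Civita connection and coordinates in which the components of $\eta_{\mu\nu}$ are constant, neither of which is stated explicitly above, and it uses the inverse metric (2) with $h^{\mu\nu}$ replaced by $\lambda h^{\mu\nu}$:

\begin{align}
\Gamma^\rho_{\mu\nu} &= \tfrac{1}{2} g^{\rho\sigma} \left(\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu}\right)\\
&= \tfrac{\lambda}{2} \left(\eta^{\rho\sigma} - \lambda h^{\rho\sigma} + O(\lambda^2)\right) \left(\partial_\mu h_{\nu\sigma} + \partial_\nu h_{\sigma\mu} - \partial_\sigma h_{\mu\nu}\right)\\
&= \tfrac{\lambda}{2} \eta^{\rho\sigma} \left(\partial_\mu h_{\nu\sigma} + \partial_\nu h_{\sigma\mu} - \partial_\sigma h_{\mu\nu}\right) + O(\lambda^2).
\end{align}

The bracket of derivatives is already proportional to $\lambda$, so replacing $g^{\rho\sigma}$ by $\eta^{\rho\sigma}$ only changes the result at order $\lambda^2$.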
