General Relativity – Transformation Law for the Covariant Derivative

coordinate-systems, covariance, differential-geometry, general-relativity

In Carroll's GR book, he pretty much defines the covariant derivative of a vector $V^{\nu}$ as:
$$\nabla_\mu V^{\nu}=\partial_\mu V^{\nu}+\Gamma_{\mu\lambda}^{\nu}V^\lambda$$
where $\Gamma_{\mu\lambda}^{\nu}$ is the Christoffel symbol. For the transformation law, we'd of course have:
$$\nabla_{\mu'} V^{\nu'}=\partial_{\mu'} V^{\nu'}+\Gamma_{\mu'\lambda'}^{\nu'}V^{\lambda'}.$$
As previously developed, the coordinate basis for the tangent space is $\hat{e}_{(\mu)}=\partial_\mu$, so that $V=V^\mu\partial_\mu$ for every vector in $T_p$. The transformation laws for the basis and the vector components are:
$$\partial_{\mu'}=\dfrac{\partial x^{\mu}}{\partial x^{\mu'}}\partial_\mu\ ; \ V^{\mu'}=\dfrac{\partial x^{\mu'}}{\partial x^\mu}V^\mu$$
where the law for the basis comes from the chain rule, and the law for the components follows from demanding that the vector $V=V^\mu\partial_\mu$ itself be invariant under a change of basis.

So he proceeds to transform the parts of the second equation that have already been developed, which at this point is everything except the Christoffel symbol. But, contrary to what I would expect, he gets:
$$\nabla_{\mu'}V^{\nu'}=\dfrac{\partial x^{\mu}}{\partial x^{\mu'}}\dfrac{\partial x^{\nu'}}{\partial x^{\nu}}\partial_\mu V^\nu+\dfrac{\partial x^{\mu}}{\partial x^{\mu'}}V^\nu\dfrac{\partial}{\partial x^\mu}\dfrac{\partial x^{\nu'}}{\partial x^{\nu}}+\Gamma_{\mu'\lambda'}^{\nu'}\dfrac{\partial x^{\lambda'}}{\partial x^{\lambda}}V^{\lambda}$$

The first and last terms are trivial. It's the middle one that I cannot get around: what does it even mean for it to be there? Firstly, for some reason, he uses $\dfrac{\partial}{\partial x^\mu}$, which is the same as $\partial_\mu$, the notation he uses throughout the book. I can't see how that middle term is any different from the first one, and I don't know what role it plays in the transformation law. It's as if he simply exchanged the order of things and repeated the first term.

It's very unlikely that that's the case, so I must be missing something quite simple here. So the question is: why do we have this middle term, instead of only the first and last ones?

Any help will be greatly appreciated.

Best Answer

The middle term doesn't have to make sense as a tensor on its own: it is tied to the way the Christoffel symbols transform, and those do not transform as a tensor.

The reason he uses $\frac{\partial}{\partial x^\mu}$ instead of $\partial_\mu$ is a matter of taste: he decided that the 'full' notation was clearer in this derivation. It makes expressions like these slightly easier to manipulate, especially if you are not yet used to combining the shorthand notation with transformation laws.

To see how the middle term differs from the first term, an explicit example may help. Associate $\mu,\nu$ with the coordinates $(x,y)$ and $\mu',\nu'$ with $(r,\theta)$. We have
\begin{align} \cases{x(r,\theta)=r\cos\theta\\ y(r,\theta)=r\sin\theta} \end{align}
and
\begin{align} \cases{r(x,y)=\sqrt{x^2+y^2}\\ \theta(x,y)=\operatorname{arctan2}(y,x)&$\theta\in(-\pi,\pi]$ } \end{align}
As an arbitrary example, pick the term in the sums with $(\mu,\nu,\mu',\nu')=(x,y,\theta,r)$. The coefficient of $\partial_\mu V^\nu$ in the first term becomes

\begin{align} \frac{\partial x^{\mu}}{\partial x^{\mu'}}\frac{\partial x^{\nu'}}{\partial x^{\nu}}=(\partial_\theta x)(\partial_yr)=(-r\sin\theta)\left(\frac{y}{\sqrt{x^2+y^2}}\right) \end{align}
The coefficient of $V^\nu$ in the middle term becomes
\begin{align} \frac{\partial x^{\mu}}{\partial x^{\mu'}}\frac{\partial }{\partial x^{\mu}}\frac{\partial x^{\nu'}}{\partial x^{\nu}}&=(\partial_\theta x)(\partial_x\partial_y r)\\ &=(-r\sin\theta)\left(\frac{-xy}{(x^2+y^2)^{3/2}}\right) \end{align}
The first involves only first derivatives of the coordinate change, while the middle one involves a second derivative, so they are different functions. I hope this explicit example makes it clearer what is happening.
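If you want to convince yourself numerically, the two coefficients from the polar example can be checked with a few lines of Python. This is only a sketch: the function names (`coefficients_analytic`, `coefficients_numeric`, `central_diff`) and the step size `h` are made up for this illustration, and the derivatives are approximated with central finite differences rather than computed symbolically.

```python
import math


def coefficients_analytic(r, theta):
    """Closed-form coefficients for (mu, nu, mu', nu') = (x, y, theta, r)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    d_theta_x = -r * math.sin(theta)               # dx/dtheta
    d_y_r = y / math.hypot(x, y)                   # dr/dy
    d_x_d_y_r = -x * y / (x * x + y * y) ** 1.5    # d/dx (dr/dy)
    return d_theta_x * d_y_r, d_theta_x * d_x_d_y_r


def central_diff(f, t, h):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def coefficients_numeric(r, theta, h=1e-4):
    """Same two coefficients, using only x(r, theta) and r(x, y)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    d_theta_x = central_diff(lambda t: r * math.cos(t), theta, h)

    def d_y_r(xx):
        # dr/dy evaluated at (xx, y), as a function of xx
        return central_diff(lambda yy: math.hypot(xx, yy), y, h)

    first = d_theta_x * d_y_r(x)                    # (d_theta x)(d_y r)
    middle = d_theta_x * central_diff(d_y_r, x, h)  # (d_theta x)(d_x d_y r)
    return first, middle


if __name__ == "__main__":
    print(coefficients_analytic(2.0, math.pi / 3))
    print(coefficients_numeric(2.0, math.pi / 3))
```

In polar variables the two expressions simplify to $-r\sin^2\theta$ and $\sin^2\theta\cos\theta$, so at $r=2$, $\theta=\pi/3$ they come out to about $-1.5$ and $0.375$: clearly not the same quantity.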

The middle term appears because, when you transform $\partial_{\mu'} V^{\nu'}$, the derivative acts on the whole of $V^{\nu'}$. When you plug in $V^{\nu'}=\frac{\partial x^{\nu'}}{\partial x^{\nu}}V^{\nu}$, the derivative acts on both factors, and since the transformation matrix $\frac{\partial x^{\nu'}}{\partial x^{\nu}}$ generally depends on position, the product rule (Leibniz rule) gives you two terms.
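Written out, the step looks roughly like this. It is a sketch that uses only the two transformation laws quoted in the question:
\begin{align} \partial_{\mu'}V^{\nu'}&=\frac{\partial x^{\mu}}{\partial x^{\mu'}}\,\partial_\mu\!\left(\frac{\partial x^{\nu'}}{\partial x^{\nu}}V^{\nu}\right)\\ &=\frac{\partial x^{\mu}}{\partial x^{\mu'}}\frac{\partial x^{\nu'}}{\partial x^{\nu}}\,\partial_\mu V^{\nu}+\frac{\partial x^{\mu}}{\partial x^{\mu'}}V^{\nu}\frac{\partial}{\partial x^{\mu}}\frac{\partial x^{\nu'}}{\partial x^{\nu}} \end{align}
The second piece is exactly the middle term in the question. Demanding that $\nabla_{\mu'}V^{\nu'}$ transform as a tensor then forces the Christoffel symbols to pick up a compensating inhomogeneous piece. The standard result, quoted here without derivation, is
\begin{align} \Gamma^{\nu'}_{\mu'\lambda'}=\frac{\partial x^{\mu}}{\partial x^{\mu'}}\frac{\partial x^{\lambda}}{\partial x^{\lambda'}}\frac{\partial x^{\nu'}}{\partial x^{\nu}}\Gamma^{\nu}_{\mu\lambda}-\frac{\partial x^{\mu}}{\partial x^{\mu'}}\frac{\partial x^{\lambda}}{\partial x^{\lambda'}}\frac{\partial^2 x^{\nu'}}{\partial x^{\mu}\partial x^{\lambda}} \end{align}
and the second term here cancels the middle term once it is contracted with $\frac{\partial x^{\lambda'}}{\partial x^{\lambda}}V^{\lambda}$.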