[Math] Covariant Derivative of Basis Vectors

calculus, connections, differential-geometry, riemannian-geometry, tensors

Consider the standard covariant derivative of Riemannian Geometry (torsion free with metric compatibility) in the $\frac{\partial}{\partial x^i}$ direction. Application to a vector field will be denoted $\nabla_i \vec{v} $ . For the purposes of this question, I will restrict myself to flat space (namely the plane).

Many introductory sources initially define the Christoffel Symbols by the relationship

$$\frac{\partial \vec{\mathbf{e_i}}}{\partial x^j}=\Gamma^k_{ij}\vec{\mathbf{e_k}}$$

where $\vec{\mathbf{e_i}} = \frac{\partial}{\partial x^i}$ . The covariant derivative is then derived quite simply for contravariant and covariant vector fields as being

$$\nabla_i \vec{v} =\bigg( \frac{\partial v^j}{ \partial x^i } + \Gamma^j_{ik} v^k\bigg) \frac{\partial}{\partial x^j}$$

$$\mbox{and}$$ $$\nabla_i \alpha =\bigg( \frac{\partial \alpha_j}{ \partial x^i } - \Gamma^k_{ij} \alpha_k\bigg) dx^j$$ respectively. Now let's consider the covariant derivative of the covariant basis vector. Observe

$$\nabla_i \vec{\mathbf{e_j}} = \frac{\partial \vec{\mathbf{e_j}}}{ \partial x^i } - \Gamma^k_{ij} \vec{\mathbf{e_k}}$$ $$\mbox{and by our definition of the Christoffel Symbols with symmetric lower indices}$$

$$ \nabla_i \vec{\mathbf{e_j}} = \Gamma^k_{ij} \vec{\mathbf{e_k}} - \Gamma^k_{ij} \vec{\mathbf{e_k}} = \vec{0} \mbox{ .}$$

When I was in my first course on tensors and Riemannian Geometry, we did not arrive at the same result in the plane. We did not use the above definition of the Christoffel Symbols but rather defined them by the geodesic equation (which we arrived at by using the Gâteaux variation). I am aware that the intrinsic definition of the Christoffel Symbols

$$\Gamma^k_{ij} = \frac{1}{2}g^{k\ell}\bigg[\frac{\partial g_{i\ell}}{\partial x^j} + \frac{\partial g_{j\ell}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^\ell}\bigg]$$

is equivalent to the first definition I provided. In our class, we used the following argument for the derivative of the covariant basis.

Let all the components $v^j$ be $0$ except for the $k^{\text{th}}$ component, which is $1$. It is then clear that $\vec{v} = v^j \frac{\partial}{\partial x^j} = \frac{\partial}{\partial x^k}$ is the invariant (rank $0$) form of the basis vector. The covariant derivative would then be

$$\nabla_i \vec{v} = \nabla_i\bigg( \frac{\partial}{\partial x^k} \bigg) = \bigg( \frac{\partial v^j}{ \partial x^i } + \Gamma^j_{ik} v^k\bigg) \frac{\partial}{\partial x^j} = \bigg( 0 + \Gamma^j_{ik} \bigg) \frac{\partial}{\partial x^j} = \Gamma^j_{ik} \frac{\partial}{\partial x^j}$$ which is clearly not identically $\vec{0}$ .

It is at this point where I turned to the physical/geometric interpretation of the covariant derivative: parallel transport. I walked myself through many examples including the following example of the polar coordinate system in the plane (a nice, flat space).

Consider a vector field $V$ in the polar coordinate system along with the two nearby points $p$ at $(r,\theta)$ and $p'$ at $(r, \theta + \Delta \theta)$. The covariant derivative (w.r.t. the theta covariant basis vector) is said to be the result of parallel transporting the vector $v' = V(p')$ along a short curve to the point $p$ and then forming the difference $v'_{||}-v$, where $v = V(p)$ and $v'_{||}$ is the transported vector $v'$ at the point $p$. Note that I realize there is also a division by a path-length parameter and a limit in the definition, but this notion should work for argument's sake.

At this point I drew a circle and considered the covariant derivative $\nabla_\theta \bigg( \frac{\partial}{\partial \theta} \bigg)$. This derivative should (if I understand correctly) track the rate of change of the $\frac{\partial}{\partial \theta}$ basis vector using parallel transport along the circle upon which $p$ and $p'$ both reside. If one of the formulations above is correct, it will come out to be either $\vec0$ or $-r\frac{\partial}{\partial r}$ . Drawing this out, it is quite obvious that the vector $v'_{||}$ points slightly inward on this circle. The vectors should be the same length since they were both generated by the vector field $\frac{\partial}{\partial \theta}$ at the same radius, and thus the vector subtraction $v'_{||} - v$ points directly inward in the $-\frac{\partial}{\partial r}$ direction. This is looking good for the second formulation! It is also intuitively easy to see in this example that as the radius grows, the length of the $\frac{\partial}{\partial \theta}$ vector also grows, and thus the projection of $v'_{||}$ onto $-\frac{\partial}{\partial r}$ would also increase.
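The drawing can also be checked numerically. The following is only a minimal sketch, under the assumption that $\frac{\partial}{\partial \theta}$ is written in Cartesian components as $(-r\sin\theta, r\cos\theta)$ and that, in the flat plane, parallel transport leaves Cartesian components unchanged, so the covariant derivative reduces to an ordinary derivative of those components. The function names and the sample point are illustrative choices.

```python
import math

def e_theta(r, theta):
    """Cartesian components of the coordinate basis vector d/d(theta) at (r, theta)."""
    return (-r * math.sin(theta), r * math.cos(theta))

def e_r(r, theta):
    """Cartesian components of the coordinate basis vector d/dr at (r, theta)."""
    return (math.cos(theta), math.sin(theta))

def d_e_theta_d_theta(r, theta, h=1e-6):
    """Central finite difference of e_theta along the circle of radius r.

    Assumption: in Cartesian components parallel transport in the plane is
    trivial, so this difference quotient approximates nabla_theta e_theta.
    """
    ax, ay = e_theta(r, theta + h)
    bx, by = e_theta(r, theta - h)
    return ((ax - bx) / (2 * h), (ay - by) / (2 * h))

# Illustrative sample point (any r > 0 and theta would do).
r, theta = 2.0, 0.7
derivative = d_e_theta_d_theta(r, theta)

# Candidate answer from the second formulation: -r * e_r.
expected = tuple(-r * c for c in e_r(r, theta))

print("finite difference:", derivative)
print("-r * e_r         :", expected)
```

The two printed vectors should agree to within the finite-difference error, and neither is close to the zero vector for $r > 0$.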

PRESTO! The physical intuition matches the second formulation. It may be helpful to note that I am aware that the first definition I provided for the Christoffel Symbols does not extend well to the intrinsic geometry of embedded surfaces. That being said, the plane is nice and flat and exactly what this definition seems to be made for (not to mention the fact that the definition appears in derivations of the covariant derivative for every differentiable object as far as I know).

How can I reconcile these seemingly contradictory notions of differentiating basis vectors in flat space or in general?

If anyone is interested, this is where I first saw the covariant derivative sending basis vectors to zero in flat space.

Best Answer

$$\newcommand\ee{\vec{e}} \newcommand\vv{\vec{v}} \newcommand\XX{\vec{X}} $$

Let me start by clarifying your example.

Let $S$ be the unit circle embedded in the plane with the usual parameterization $\phi(\theta)$, and let $\XX$ be a vector field on $S$. This means that each $\XX(\theta)$ is in the tangent space of $S$ at $\phi(\theta)$, which is the one-dimensional space spanned by $(-\sin\theta, \cos\theta) = \vec{\phi'}(\theta)$.

Let's first motivate what we mean by covariant derivative. So far, $\XX$ is a nice map from $[0,2\pi)$ to the tangent lines of $S$. In general, its ordinary derivative $\XX'$ is going to be in $\mathbb{R}^2$. We can write $\XX(\theta) = f(\theta)\vec{\phi'}(\theta)$, where $f$ is a nice real-valued function. Then $\XX'(\theta) = f'(\theta)\vec{\phi'}(\theta)+f(\theta)\vec{\phi''}(\theta)$, and we note that $\mathbb{R}^2$ is spanned by $\vec{\phi'}(\theta)$ and $\vec{\phi''}(\theta)$. When we want to study $S$ intrinsically, this differentiation is not good enough, because it can give us information which lies outside the tangent lines. So we instead take its projection. Let $\pi(\theta)$ be the orthogonal projection onto the subspace spanned by $\vec{\phi'}(\theta)$, and instead consider $\pi(\theta) \circ \XX'(\theta)$. We call this new function $\nabla_\theta \XX$; it tells us only about the part of the derivative of $\XX$ which lies along $S$.
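The projection step can be illustrated numerically. This is a minimal sketch, not part of the original argument: the component function `f` is a hypothetical choice, and the derivative $\XX'$ is approximated by a central finite difference. Since $\vec{\phi'}$ has unit length on the unit circle, the projection coefficient is just a dot product.

```python
import math

def phi_prime(theta):
    """Unit tangent to the unit circle at phi(theta)."""
    return (-math.sin(theta), math.cos(theta))

def f(theta):
    """Hypothetical component function of the vector field (an assumption)."""
    return math.sin(2.0 * theta) + 3.0

def f_prime(theta):
    """Exact derivative of the hypothetical f, used only for comparison."""
    return 2.0 * math.cos(2.0 * theta)

def X(theta):
    """Vector field X(theta) = f(theta) * phi'(theta), as a vector in R^2."""
    c = f(theta)
    tx, ty = phi_prime(theta)
    return (c * tx, c * ty)

def ordinary_derivative(theta, h=1e-6):
    """Central finite difference of X as a map into R^2."""
    ax, ay = X(theta + h)
    bx, by = X(theta - h)
    return ((ax - bx) / (2 * h), (ay - by) / (2 * h))

def split_derivative(theta):
    """Return (tangential, normal) coefficients of X'(theta).

    The tangential coefficient is the covariant derivative along phi';
    the normal coefficient (along the outward radial direction) is the
    extrinsic part that the projection discards.
    """
    dx, dy = ordinary_derivative(theta)
    tx, ty = phi_prime(theta)
    nx, ny = math.cos(theta), math.sin(theta)  # outward unit normal
    return (dx * tx + dy * ty, dx * nx + dy * ny)

theta = 0.4
tangential, normal = split_derivative(theta)
print("tangential part:", tangential, " expected f'(theta):", f_prime(theta))
print("normal part    :", normal, " expected -f(theta):", -f(theta))
```

The tangential coefficient reproduces $f'(\theta)$, while the discarded normal coefficient is $-f(\theta)$, since $\vec{\phi''}(\theta) = -(\cos\theta, \sin\theta)$.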

We now turn to the problem of parallel transport. Suppose now that $\XX(0)$ is a tangent vector at $\phi(0)$. We want to roll it along $S$ to get a tangent vector $\XX(\theta)$ at $\phi(\theta)$ that is in some sense equivalent. Now clearly we can take $\XX(\theta) = |\XX(0)|\vec{\phi'}(\theta)$, but it is worth elaborating on the machinery behind this intuitive operation. The key here is that our rolling is in some sense maximally intrinsic. At every step, the intrinsic part of the vector does not change. To formalize this, we say that $\nabla_\theta \XX(\theta) = 0$. This defines the parallel transport of $\XX(0)$.

What if instead of starting with an intrinsic derivative, we started with a notion of parallel transport? Can we recover an intrinsic derivative? Suppose that $\psi(\theta)$ is the map that gives the parallel transport of tangent vectors at $\phi(0)$ to the tangent space at $\phi(\theta)$. This is in fact linear. Let $\XX(\theta)$ be a vector field on $S$. We want to recover the intrinsic component of the infinitesimal change of $\XX$ at $\theta=0$. To do that, let $\delta>0$ be some small change in $\theta$. Can we recover the intrinsic change of $\XX(\delta)$ from $\XX(0)$? Well, we know what $\XX(\delta)$ should look like if there is no intrinsic change at all: it is just the parallel transport $\psi(\delta)(\XX(0))$. So we recover the intrinsic change of $\XX(\delta)$ as the difference between $\XX(\delta)$ and the parallel transport of $\XX(0)$. That is, we recover the covariant derivative as $$\nabla_\theta \XX(0) = \lim_{\delta \to 0} \frac{\XX(\delta)-\psi(\delta)(\XX(0))}{\delta}.$$
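The limit can be approximated with a small but finite $\delta$. The sketch below assumes that parallel transport on the unit circle carries $c\,\vec{\phi'}(0)$ to $c\,\vec{\phi'}(\delta)$, keeping the signed coefficient $c$, and it uses a hypothetical component function `f`. It is an illustration rather than a general implementation of parallel transport.

```python
import math

def phi_prime(theta):
    """Unit tangent to the unit circle at phi(theta)."""
    return (-math.sin(theta), math.cos(theta))

def f(theta):
    """Hypothetical component function (an assumption); f'(0) = 2."""
    return math.sin(2.0 * theta) + 3.0

def X(theta):
    """Vector field X(theta) = f(theta) * phi'(theta)."""
    c = f(theta)
    tx, ty = phi_prime(theta)
    return (c * tx, c * ty)

def transport_from_zero(vector_at_zero, delta):
    """psi(delta): move a tangent vector at phi(0) to phi(delta).

    Assumption: the signed coefficient along phi' is preserved.
    """
    t0x, t0y = phi_prime(0.0)
    coeff = vector_at_zero[0] * t0x + vector_at_zero[1] * t0y
    tx, ty = phi_prime(delta)
    return (coeff * tx, coeff * ty)

def difference_quotient(delta):
    """(X(delta) - psi(delta)(X(0))) / delta for a finite delta."""
    xd = X(delta)
    moved = transport_from_zero(X(0.0), delta)
    return ((xd[0] - moved[0]) / delta, (xd[1] - moved[1]) / delta)

quotient = difference_quotient(1e-5)
limit = (0.0, 2.0)  # f'(0) * phi'(0), with f'(0) = 2 and phi'(0) = (0, 1)
print("difference quotient:", quotient)
print("expected limit     :", limit)
```

The quotient equals $\frac{f(\delta)-f(0)}{\delta}\vec{\phi'}(\delta)$ exactly, so it stays tangent to the circle and tends to $f'(0)\vec{\phi'}(0)$ as $\delta \to 0$.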

Note here that it does not make any sense for the covariant derivative on the circle to be in the $\frac{\partial}{\partial r}$ direction, since that is extrinsic to the circle, and the covariant derivative gives intrinsic information only.

Now we move on to the general case. Let $M$ be a Riemannian manifold, $g$ its metric, and $\nabla$ its connection. The torsion-free condition specifies that for any vector fields $X$ and $Y$ on $M$, $\nabla_X Y - \nabla_Y X = [X,Y]$. Here $[X,Y]$ is the Lie bracket of vector fields.

To work in coordinates, fix a point $p$, and an open neighborhood $U$ of $p$ with coordinate functions $x^i$. We denote by $\ee_i$ the $i$th tangent vector with respect to these coordinates. The first thing we note is that $[\ee_i,\ee_j] = 0$ simply by the commutativity of the ordinary partial derivative. We don't need any Christoffel symbol machinery to then derive that $\nabla_{\ee_i}\ee_j = \nabla_{\ee_j}\ee_i$; it is a straightforward consequence of the torsion-free condition.

Now we define the symbols $\gamma^k_{ij}$ such that $\nabla_{\ee_i}\ee_j = \gamma^k_{ij}\ee_k.$ Note here that the Christoffel symbols are the coefficients of the covariant derivative, not the ordinary derivative. Be careful with notation; compare the definition quoted in your question:

$$\frac{\partial \vec{\mathbf{e_i}}}{\partial x^j}=\Gamma^k_{ij}\vec{\mathbf{e_k}}$$

Let $\vv$ be a vector field, which is given in components as $v^i\ee_i$. Then we have that $$\nabla_{\ee_i}\vv = \nabla_{\ee_i}(v^j\ee_j) = (\nabla_{\ee_i}v^j)\ee_j + v^j(\nabla_{\ee_i}\ee_j) = \frac{\partial v^j}{\partial x^i}\ee_j + v^j\gamma^k_{ij}\ee_k.$$

Now suppose that $\vv$ is equal to $\ee_l$, so $v^l = 1$ and $v^i=0$ otherwise. We then get that $\frac{\partial v^i}{\partial x^j} = 0$, and thus that $$\nabla_{\ee_i}\vv = \gamma^k_{il}\ee_k = \nabla_{\ee_i}\ee_l.$$

This is a tautology; we have recovered no new information.
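As a concrete instance, here is a sketch under the assumption of polar coordinates $(r,\theta)$ on the plane with metric $dr^2 + r^2\,d\theta^2$ (the general argument above does not itself fix coordinates). The only nonzero symbols of the torsion-free, metric-compatible connection are then $\gamma^r_{\theta\theta} = -r$ and $\gamma^\theta_{r\theta} = \gamma^\theta_{\theta r} = \frac{1}{r}$. Taking $\vv = \ee_\theta$ gives $$\nabla_{\ee_\theta}\ee_\theta = \gamma^k_{\theta\theta}\ee_k = -r\,\ee_r, \qquad \nabla_{\ee_r}\ee_\theta = \gamma^k_{r\theta}\ee_k = \frac{1}{r}\,\ee_\theta,$$ and the first of these agrees with the $-r\frac{\partial}{\partial r}$ obtained from the parallel-transport picture in the question.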

Here is where you make the mistake in your derivation. You wrote:

"Now let's consider the covariant derivative of the covariant basis vector. Observe"

$$\nabla_i \vec{\mathbf{e_j}} = \frac{\partial \vec{\mathbf{e_j}}}{ \partial x^i } - \Gamma^k_{ij} \vec{\mathbf{e_k}}$$

You have put here a minus instead of a plus in the right-hand side, which should read: $$= \frac{\partial \vec{\mathbf{e_j}}}{ \partial x^i } + \Gamma^k_{ij} \vec{\mathbf{e_k}}$$

Fixing this in the following steps, and using the corrected definition of Christoffel symbols, you would get:

$$ \nabla_i \vec{\mathbf{e_j}} = \frac{\partial \vec{\mathbf{e_j}}}{ \partial x^i } + \Gamma^k_{ij} \vec{\mathbf{e_k}} = \Gamma^k_{ij} \vec{\mathbf{e_k}},$$ which implies the correct result that $$\frac{\partial \vec{\mathbf{e_j}}}{ \partial x^i } = 0.$$

In general, it is true that the partial derivatives of $\ee_i$ vanish, but the covariant derivatives do not. The Christoffel symbols measure precisely by how much these differ.
