Differential Geometry – Abstract Index Notation Inconsistency

connections, differential-geometry, index-notation, mathematical physics, tangent-bundle

I am reading this great work here and I am trying to make sense of a specific derivation around the middle of the page. In particular, it seems they are claiming that:

$(\nabla_v w) (f) = (v^{\alpha} \nabla_{\alpha} w^{\beta})\nabla_{\beta}f$

where we work on a smooth manifold $M$, $v, w$ are smooth vector fields (i.e. sections of the tangent bundle $\mathcal{T}(M)$), $f$ is a scalar function and $\nabla$ is the connection on $\mathcal{T}(M)$.

My understanding is that they are using abstract index notation. But when I consider, for simplicity, a local set of coordinates with a local basis $e_i := \frac{\partial}{\partial x^i}$ I get a seemingly different answer. Specifically, if we write $v = v^i e_i$ and $w = w^j e_j$ one gets:

$(\nabla_v w) (f) = \left(\left(v^i \nabla_{e_i} w^k + v^iw^j \Gamma_{ij}^k \right)e_k\right)(f) = \left(v^i \nabla_{e_i} w^k + v^iw^j \Gamma_{ij}^k \right)\nabla_{e_k} f :=(\nabla_v w)^k \nabla_{e_k} f$

where $(\nabla_v w)^k$ means the $k$-th component of the vector $\nabla_v w$ in local coordinates. So my answer seems to be:

$(\nabla_v w) (f) = \left(v^i \nabla_{e_i} w^k\right) \nabla_{e_k} f + \left(v^iw^j \Gamma_{ij}^k \right)\nabla_{e_k} f$

But if their derivation is correct and I am interpreting abstract notation properly, it seems like I should instead be getting:

$ (\nabla_v w) (f) = \left(v^i \nabla_{e_i} w^k\right) \nabla_{e_k} f$

What am I missing?

Edit: I wish I could accept multiple answers. Huge Credit to @peek-a-boo and @Jackozee Hakkiuz for adding incredible insight to the problem. If anyone is reading this in the future, I highly recommend going over both answers.

Edit2: Also for those future readers, I highly suggest this high level debate here. One thing it illustrates clearly is that quantities like $\nabla_{\alpha} w^{\beta}$ are a priori ill-defined which can lead to understandable confusion. As a result, certain notational conventions are required, ones which are often author dependent. This makes it especially challenging for anyone who approaches the topic using different sources, since finding notational inconsistencies is almost inevitable.

Edit3: For my own clarity I would also like to give a formal answer that bridges the gap between the two notations, so here it goes:

$\underline{\textbf{(A posteriori) ANSWER:}}$

Let $\nabla_{\alpha} w^{\beta}:= (\nabla w)_{\alpha}^{\beta}$ be the placeholder (abstract) notation for the $(1,1)$-tensor field $\nabla w$ (said differently, in local coordinates $(\nabla w)_{\alpha}^{\beta}$ corresponds to $(\nabla w)_{k}^{i} (e_i \otimes \epsilon^k)$ or $(\nabla w)_{k}^{i} e_i \epsilon^k$ for short). We will see that this notation is consistent with the numerical one (i.e. the one in local coordinates).

The setup is the same as above but we also add a local co-basis $\epsilon^j := dx^j$. We observe the following:

  • $v = v^k e_k$ (as a $(1,0)$-tensor field, aka a vector field)
  • $\nabla w = (\nabla w)^i_j e_i \epsilon^j$ (as a $(1,1)$-tensor field)
  • $e_i (f) = \nabla_{e_i} f$ (by definition)
  • $\nabla_v w = (\nabla w)(v)$ (by definition)

One can then perform the following calculations:

$$(\nabla_v w)(f) = ((\nabla w)(v))(f) = ((\nabla w)^i_j e_i \epsilon^j v^k e_k)(f) = (v^k(\nabla w)^i_k) e_i(f) = (v^k(\nabla w)^i_k) \nabla_{e_i}f$$

which is "consistent" with the abstract index notation of:

$$(\nabla_v w)(f) = (v^{\alpha} \nabla_{\alpha} w^{\beta})\nabla_{\beta}f = (v^{\alpha} (\nabla w)_{\alpha}^{\beta}) \nabla_{\beta}f$$

$\underline{\textbf{The Subtleties:}}$

There are some subtleties here that I feel need to be addressed:

  1. The first is that, in some sense, we got "lucky" that the two expressions (abstract and numerical) match each other symbol for symbol. This is because both results above are scalars, so no implicit basis vectors or covectors are left over from the abstract notation. To see this explicitly, observe the following correspondences between the two notations:

$v^{\alpha} \leftrightarrow v^k e_k$

$(\nabla w)_{\alpha}^{\beta} \leftrightarrow (\nabla w)_{l}^{i}e_i \epsilon^l$

$\nabla_{\beta} f \leftrightarrow (\nabla_{e_m} f) \epsilon^m $

This means that when we put everything together we get:

$$ (v^{\alpha} (\nabla w)_{\alpha}^{\beta}) \nabla_{\beta}f \leftrightarrow (v^k e_k (\nabla w)_{l}^{i}e_i \epsilon^l) ((\nabla_{e_m} f) \epsilon^m) = (v^k (\nabla w)_{l}^{i} \nabla_{e_m} f) e_k e_i \epsilon^l \epsilon^m = (v^k (\nabla w)_{k}^{i}) \nabla_{e_i} f $$

Thus, it is only because we have the perfect amount of contractions (i.e. the final result is a $(0,0)$-tensor (aka. a scalar)) that the two notations "match" each other. In the general case, the two notations will not match one another as (in some sense) the abstract notation "implies" the basis vectors while the regular one does not. (For example, even though $v^{\alpha} \leftrightarrow v^k e_k$, the two expressions are not in a "perfect" one to one notational correspondence, symbol-wise).

  2. Ambiguity can be eliminated. If one uses correspondences analogous to the ones seen above, ambiguities can be eliminated. This is best seen by performing calculations on different parentheses placements. For example, we know that in local coordinates:

$$\nabla_v w = v^i \nabla_{e_i} (w^k e_k) = v^i (\nabla_{e_i} w)^k e_k $$ where $ \nabla_{e_i} w = \left(\frac{\partial w^k}{\partial x^i} + w^j \Gamma_{ij}^k \right) e_k$ so that $ (\nabla_{e_i} w)^k = \frac{\partial w^k}{\partial x^i} + w^j \Gamma_{ij}^k$. But then, one can calculate $v^{\alpha}\nabla_{\alpha} w^{\beta}$ in three different ways:

$$\boxed{v^{\alpha} (\nabla_{\alpha} w)^{\beta} \leftrightarrow v^i (\nabla_{e_i} w)^k e_k = \nabla_v w}$$

$$\boxed{ v^{\alpha} (\nabla w)_{\alpha}^{\beta} \leftrightarrow v^i (\nabla w)^k_i e_k = \nabla_v w }$$

$$\boxed{ v^{\alpha} (\nabla_{\alpha}w^{\beta}) \leftrightarrow v^i (\nabla_{e_i}( w^k e_k))= v^i(\nabla_{e_i} w)^k e_k = \nabla_v w }$$

This means that the notation $v^{\alpha}\nabla_{\alpha} w^{\beta}$ is (a posteriori) unambiguous.

  3. Some ambiguity can still arise (if one is not careful). For example, take the notation $\nabla_{\alpha} w^{\beta}$ with one covariant and one contravariant index. In local coordinates, does it correspond to $(\nabla_{e_i} (w^k e_k)) \epsilon^i$ or $(\nabla_{e_i} w^k) \epsilon^i e_k$? In other words, where should the "implied" basis vectors go? Unless we create some notion of priority of operation, we will have an unbridgeable ambiguity, since the two expressions above are different. However, loosely speaking, the operator $\nabla$ acts on the vector $w$ after it is made "whole" (i.e. after we make the correspondence $w^{\beta} \leftrightarrow w^k e_k$), so we need to put the implied basis vectors $e_k$ first:

$$\nabla_{\alpha} w^{\beta} \leftrightarrow (\nabla_{e_i} (w^k e_k)) \epsilon^i = (\nabla_{e_i} w)^k e_k \epsilon^i$$

Similarly, the hessian of a smooth scalar function has a unique correspondence in local coordinates, given by:

$$\color{red}{\nabla_{\alpha} \nabla_{\beta} f \leftrightarrow (\nabla_{e_i} [(\nabla_{e_j} f) \epsilon^j])\epsilon^i} = \dots = [(\partial_i \partial_j f) - (\partial_k f)\Gamma_{ij}^k]\epsilon^j\epsilon^i $$

From there, we can derive the expression of the Torsion (defined by $T(f) = (\nabla_{\alpha} \nabla_{\beta} - \nabla_{\beta} \nabla_{\alpha}) (f)$) when acting on the scalar $f$:

$$T(f) = [(\Gamma_{ij}^k - \Gamma_{ji}^k) - [e_i, e_j]^k](\partial_k f)\epsilon^i \epsilon^j $$

And also the expression for Torsion itself as a $(1,2)$-tensor field:

$$\boxed{T = [(\Gamma_{ij}^k - \Gamma_{ji}^k) - [e_i, e_j]^k]\epsilon^i \epsilon^j e_k} $$

Best Answer

What you’re missing is an understanding of where the implicit brackets lie.

It is indeed correct that (by definition of what it means for a vector field to act on a function) $(\nabla_vw)(f)=(\nabla_vw)^k\nabla_{e_k}f$. But this is also what the abstract index notation is saying. Let’s stick to greek for abstract indices. Then:

  • $\nabla_{\beta}f$ stands for the $(0,1)$ tensor field $\nabla f\equiv df$.
  • $\nabla_{\alpha}w^{\beta}$ stands for the $(1,1)$ tensor field $\nabla w$ (recall that if $T$ is an $(r,s)$ tensor field then $\nabla T$ is an $(r,s+1)$ tensor field).
  • Now for contractions: $v^{\alpha}\nabla_{\alpha}w^{\beta}$ stands for the contraction of the $(1,1)$ tensor field $\nabla w$ with the $(1,0)$ tensor field (i.e. the vector field) $v$ to get $(\nabla w)(v)=\nabla_vw$. Finally, $v^{\alpha}\nabla_{\alpha}w^{\beta}\nabla_{\beta}f$ stands for the contraction of $\nabla_vw$ with $\nabla f=df$, which by definition is also what it means for a vector field to act on a smooth function, i.e. $(\nabla_vw)(f)$.

Let us use latin indices for ‘actual’ indices. An expression like $\nabla_iw^j$ should NOT be interpreted as $\nabla_{e_i}(w^j)$, instead it should be interpreted as $(\nabla_{e_i}w)^j$. You’ll often see brackets omitted, so when in doubt, do the covariant derivatives first, and only then extract the components.
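To make the difference between the two readings concrete, here is a small numerical sketch. It is not from the linked text. It assumes the flat plane in polar coordinates $(r,\theta)$ with the Levi-Civita connection (nonzero symbols $\Gamma^r_{\theta\theta}=-r$ and $\Gamma^\theta_{r\theta}=\Gamma^\theta_{\theta r}=1/r$), and takes $w=\partial/\partial x$, which is parallel, so $(\nabla_{e_i}w)^k$ should vanish while $\nabla_{e_i}(w^k)=\partial_i w^k$ does not. The function names and the sample point are illustrative choices.

```python
import math

def w(r, th):
    """Polar components (w^r, w^theta) of the Cartesian field d/dx."""
    return (math.cos(th), -math.sin(th) / r)

def christoffel(r, th):
    """gamma[k][i][j] = Gamma^k_{ij} for the flat plane in polar coordinates."""
    return [
        [[0.0, 0.0], [0.0, -r]],            # k = r
        [[0.0, 1.0 / r], [1.0 / r, 0.0]],   # k = theta
    ]

def partial_w(i, k, r, th, h=1e-6):
    """Central-difference estimate of d(w^k)/d(x^i): the naive reading nabla_{e_i}(w^k)."""
    if i == 0:
        return (w(r + h, th)[k] - w(r - h, th)[k]) / (2 * h)
    return (w(r, th + h)[k] - w(r, th - h)[k]) / (2 * h)

def cov_w(i, k, r, th):
    """(nabla_{e_i} w)^k = d_i w^k + Gamma^k_{ij} w^j: differentiate w first, then take a component."""
    gamma = christoffel(r, th)
    comps = w(r, th)
    return partial_w(i, k, r, th) + sum(gamma[k][i][j] * comps[j] for j in range(2))

# Arbitrary sample point (assumption for illustration only)
r0, th0 = 2.0, 0.7
naive = [[partial_w(i, k, r0, th0) for k in range(2)] for i in range(2)]
covariant = [[cov_w(i, k, r0, th0) for k in range(2)] for i in range(2)]
print("d_i w^k          :", naive)
print("(nabla_{e_i} w)^k:", covariant)
```

Up to finite-difference error, every entry of `covariant` is zero, whereas `naive` is not. The Christoffel term is exactly what the reading $\nabla_{e_i}(w^j)$ drops.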


For instance, you may see the definition of the hessian of a smooth function written as $\nabla_{\alpha}\nabla_{\beta}f$. This stands for the $(0,2)$ tensor field $\nabla(\nabla f)$. The $(i,j)$ components of this tensor field are NOT $\nabla_{e_i}(\nabla_{e_j}(f))$. They’re more complicated.
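As a sketch of what "more complicated" means here, assume a coordinate basis, the convention $\nabla_{e_i}e_j=\Gamma_{ij}^k e_k$, and that the first slot of $\nabla(\nabla f)$ is the direction of differentiation. Applying the Leibniz rule to the $(0,1)$ tensor field $\nabla f$ then gives

$$(\nabla\nabla f)(e_i,e_j) = e_i\big((\nabla f)(e_j)\big) - (\nabla f)(\nabla_{e_i}e_j) = \partial_i\partial_j f - \Gamma_{ij}^k\,\partial_k f,$$

so the components differ from $\nabla_{e_i}(\nabla_{e_j}(f))=\partial_i\partial_j f$ by a Christoffel term.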
