Leibniz rule for covariant derivative

differential-geometry, tensors

I've been learning about the covariant derivative and I have some doubts.

This answer suggests that $\nabla_{\mathbf{u}} T = \nabla T (\mathbf{u})$, where $T$ is a tensor. The tensor $\nabla T$ appears to be acting on the vector $\mathbf{u}$ in the same way a covector acts on a vector to give a scalar.

The answer then proceeds to derive the identity $\nabla^2_{\mathbf{u}, \mathbf{v}} \mathbf{w} = \nabla_{\mathbf{u}} \nabla_{\mathbf{v}} \mathbf{w} - \nabla_{\nabla_{\mathbf{u}} \mathbf{v}} \mathbf{w}$, where $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are vectors.

According to my interpretation, $$\nabla_{\mathbf{u}} \nabla_{\mathbf{v}} \mathbf{w} = \nabla_{\mathbf{u}} (\nabla \mathbf{w} (\mathbf{v})) \\ = \underbrace{(\nabla_{\mathbf{u}} (\nabla \mathbf{w}))}_{\text{a (1,1) tensor}} (\mathbf{v}) + \nabla \mathbf{w} (\nabla_{\mathbf{u}} \mathbf{v}) \\ = \underbrace{\nabla \nabla \mathbf{w}}_{\text{a (1,2) tensor}}(\mathbf{u}, \mathbf{v}) + \nabla_{\nabla_{\mathbf{u}} \mathbf{v}} \mathbf{w} \\ = \nabla^2_{\mathbf{u}, \mathbf{v}} \mathbf{w} + \nabla_{\nabla_{\mathbf{u}} \mathbf{v}} \mathbf{w} \\ \therefore \nabla^2_{\mathbf{u}, \mathbf{v}} \mathbf{w} = \nabla_{\mathbf{u}} \nabla_{\mathbf{v}} \mathbf{w} - \nabla_{\nabla_{\mathbf{u}} \mathbf{v}} \mathbf{w}.$$
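As a sanity check on this derivation (my own, not from the answer), here is a sympy sketch on flat $\mathbb R^2$ with the standard connection, where the Christoffel symbols vanish, so $\nabla_{\mathbf{u}}$ acts componentwise as $u^i \partial_i$ and $(\nabla\nabla\mathbf{w})(\mathbf{u},\mathbf{v})$ has components $u^i v^j \partial_i \partial_j w^k$. The particular fields `u`, `v`, `w` are arbitrary choices for illustration.

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

# Arbitrary example fields on flat R^2 (all Christoffel symbols vanish,
# so nabla_u acts on components as u^i d_i).
u = [sp.sin(x*y), x**2]
v = [y, sp.exp(x)]
w = [x*y**2, sp.cos(x) + y]

def directional(vec, field):
    """nabla_vec field in flat coordinates: components vec^i d_i field^k."""
    return [sum(vec[i] * sp.diff(field[k], coords[i]) for i in range(2))
            for k in range(2)]

# Left side of the derived identity: nabla_u nabla_v w - nabla_{nabla_u v} w
lhs = [sp.simplify(directional(u, directional(v, w))[k]
                   - directional(directional(u, v), w)[k]) for k in range(2)]

# Right side: the (1,2) tensor nabla nabla w fed with (u, v); in flat
# coordinates its components are u^i v^j d_i d_j w^k.
rhs = [sp.simplify(sum(u[i] * v[j] * sp.diff(w[k], coords[i], coords[j])
                       for i in range(2) for j in range(2))) for k in range(2)]

print([sp.simplify(lhs[k] - rhs[k]) for k in range(2)])  # → [0, 0]
```

The subtraction of $\nabla_{\nabla_{\mathbf{u}} \mathbf{v}} \mathbf{w}$ is exactly what cancels the term where the derivative hits $\mathbf{v}$ instead of $\mathbf{w}$.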

My confusion arises here. Let $T$ and $S$ be tensors. The above derivation makes use of some version of the Leibniz rule that appears to be of the form $\nabla_{\mathbf{u}}(T(S)) = (\nabla_{\mathbf{u}} T)(S) + T(\nabla_{\mathbf{u}} S)$. Is my interpretation correct?

Yet according to this answer, the rule $\nabla (T\otimes S) = \nabla T \otimes S + T\otimes \nabla S$ does not hold, but once you add a direction $\mathbf{u}$, the rule $\nabla_{\mathbf{u}} (S\otimes T) = \nabla_\mathbf{u} S \otimes T + S \otimes \nabla_\mathbf{u} T$ suddenly becomes true. Why?

I'm quite confused by these various versions of the Leibniz rule, and by the "total covariant derivative" $\nabla$ versus the directional covariant derivative $\nabla_{\mathbf{u}}$. I would appreciate it if someone could clear this up for me a little.

Best Answer

You should think about the two "covariant derivatives" $\nabla T$ and $\nabla_u T$ the same way you think about differentials and directional derivatives of scalar functions:

If $f : M \to \mathbb R,$ then the covector field $df$ is defined in terms of the directional derivatives $uf$ by $df(u) = uf.$ In vector calculus, we thought about the gradient instead, and would have written this something like $\nabla f \cdot u = D_u f.$
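To make this analogy concrete, here is a small sympy sketch (my own illustration, with an arbitrary function, point, and direction): the directional derivative $uf$ computed as a derivative along a curve agrees with the pairing $df(u) = u^i \partial_i f$.

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# Arbitrary scalar function on R^2, base point p, and direction u.
f = x**2 * sp.sin(y)
p = (1, 2)
u = (3, -1)

# Directional derivative as a curve derivative: (d/dt) f(p + t u) at t = 0.
curve = f.subs({x: p[0] + t*u[0], y: p[1] + t*u[1]})
uf = sp.diff(curve, t).subs(t, 0)

# The covector df bundles up all these derivatives: df(u) = u^i d_i f at p.
df_u = (u[0]*sp.diff(f, x) + u[1]*sp.diff(f, y)).subs({x: p[0], y: p[1]})

print(sp.simplify(uf - df_u))  # → 0
```

The same bundling is what $\nabla T$ does for a tensor field: one object whose slot-evaluation at $u$ reproduces every $\nabla_u T$.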

In exactly the same way, we simply define $\nabla T (u)= \nabla_u T,$ and (after checking that this is indeed tensorial in $u$) we have "bundled up" all the derivatives of the tensor field $T$ into a tensor field of one degree higher.

Your calculation for the second covariant derivative (and the Leibniz rule $$\nabla_u(S \otimes T) = \nabla_u S \otimes T + S \otimes \nabla_u T \tag 1$$ that you used in it) is perfectly correct.

The only reason the rule $$\nabla (T\otimes S) = \nabla T \otimes S + T\otimes \nabla S \tag 2$$ is incorrect is the order of the slots/indices. To make this concrete, let's suppose $S$ and $T$ are covector fields for simplicity. In index notation, the correct Leibniz rule is $$\nabla_i(T\otimes S)_{jk} = (\nabla_i T)_j S_k + T_j (\nabla_i S)_k.$$ Note that the direction of differentiation is always $\partial_i$. On the other hand, the incorrect rule $(2)$ would translate into index notation as $$\nabla_i(T \otimes S)_{jk}=(\nabla_iT)_jS_k+T_i (\nabla_jS)_k.$$

Thus $(2)$ has to be corrected by some transposition of indices, something like $$\nabla(T \otimes S) = \nabla T \otimes S + \operatorname{swap}_{12} (T \otimes \nabla S).$$ I've had to invent this "swap" notation for slot transposition, since (as far as I know) there is no conventional way to write this operation when using index-free notation in DG. Usually, authors take one of the following approaches:

  • Use an index-based notation where transposition (and contraction) of higher-order tensors is simple and intuitive to notate.
  • "Plug in" enough vectors/covectors (treated as free variables) that the transposition becomes unnecessary, as in $(1).$
  • In some cases, just abuse notation and write $(2)$, even though it is technically incorrect. In situations where you're not likely to get the various slots mixed up, it's very neat and conceptually clear.
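The index bookkeeping above can be checked mechanically. Here is a sympy sketch (my own, with arbitrary covector fields) on flat $\mathbb R^2$, where $\nabla_i$ is just $\partial_i$: the slot-aware rule holds for every index combination, while a literal reading of $(2)$, i.e. $(\nabla T)_{ij} S_k + T_i (\nabla S)_{jk}$, fails.

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

# Arbitrary covector fields on flat R^2, where nabla_i is just d_i.
T = [x*y, sp.sin(x)]
S = [y**2, sp.exp(x*y)]

d = lambda f, i: sp.diff(f, coords[i])   # nabla_i in flat coordinates

ok_correct, ok_naive = True, True
for i in range(2):
    for j in range(2):
        for k in range(2):
            lhs = d(T[j]*S[k], i)                        # nabla_i (T ⊗ S)_{jk}
            correct = d(T[j], i)*S[k] + T[j]*d(S[k], i)  # slot-aware rule
            naive   = d(T[j], i)*S[k] + T[i]*d(S[k], j)  # literal reading of (2)
            ok_correct &= sp.simplify(lhs - correct) == 0
            ok_naive   &= sp.simplify(lhs - naive) == 0

print(ok_correct, ok_naive)  # → True False
```

Note that `correct` is exactly the $\operatorname{swap}_{12}$-corrected version: the second term pairs $T_j$ with $(\nabla S)_{ik}$ rather than $T_i$ with $(\nabla S)_{jk}$.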