Given a Riemannian manifold $(M,g)$ one often defines the curvature of the Levi-Civita connection $\nabla$ as $$R(X,Y)=\nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]}$$ for vector fields $X$ and $Y$. However, some authors define the curvature as $$R=\nabla \circ \nabla.$$ These two definitions should coincide, but I have some difficulties proving this. Does anyone here know a resource that goes over this?
Differential Geometry – Defining Curvature in Two Ways
differential-geometry, riemannian-geometry, smooth-manifolds
Related Solutions
To show that the flow $\Psi_t$ of $V$ is an isometry for each $t$, that is, \begin{equation*} g_{\Psi_t(x)}(D_x\Psi_t X, D_x\Psi_t Y)=g_x(X,Y)\quad\forall\,\,X,Y\in T_x M, \end{equation*} we shall instead show \begin{equation*} \frac{\partial}{\partial t}( X_i, X_j) =0, \end{equation*} which is the same as showing \begin{equation*} \frac{\partial}{\partial t}(D_x\Psi_t X, D_x\Psi_t Y)=\text{const}\quad\dagger \end{equation*} Proof: Since $V$ is a Killing field, we have, for every $p\in M$ and $X,Y\in T_p M$, \begin{equation*} g(\nabla_X V,Y)+g(X,\nabla_Y V)=0. \end{equation*} Write $Y=V_i=\frac{\partial}{\partial x_i}$,\quad $X= V_j=\frac{\partial}{\partial x_j}$. Since partial derivatives commute in $\mathbb{R}^n$, we have \begin{equation*} [V_i,V_j]=0\quad\forall\,\,i,j\,\,\text{and}\,\,[V, V_i]=[V_n,V_i]=0. \end{equation*} It suffices to show that $V$ is a Killing field if and only if $L_V g(V_i,V_j)=0$ for all $i,j$. First note that local flows for any given time $t$ always give local diffeomorphisms, since their inverse is provided by the local flow of the vector field $-V$. Thus, for $V$ to be a Killing field near $x$ is equivalent to having the property $(u,v)=(D_x\Psi_t u, D_x\Psi_t v)$ for all $u,v\in T_p M$, and all $p$ near $x$. We claim moreover that for fixed $x_1,\cdots,x_{n-1}$, as $t$ varies the coefficients of $D_x\Psi_t u$ and $D_x\Psi_t v$ in terms of the $V_i$ are constant.
Then \begin{align*} &\ g(\nabla_{V_j}V,V_i)+g(\nabla_{V_i}V,V_j)\\ = &\ g([V_j,V]-\nabla_V V_j, V_i)+g([V_i,V]-\nabla_V V_i,V_j)\\ = &\ ([V_j,V],V_i)-(\nabla_V V_j,V_i)+([V_i,V],V_j)-(\nabla_V V_i,V_j)\\ = &\ -g([V,V_j],V_i)-g([V,V_i],V_j)-{\color{blue}\big(g(\nabla_V V_j,V_i)+g(\nabla_V V_i,V_j)\big)}\\ = &\ -([V,V_j],V_i)-([V,V_i],V_j)-{\color{blue}g(\nabla_V V_j\cdot V_i+ V_j\nabla_V V_i)}\\ = &\ -([V,V_j],V_i)-([V,V_i],V_j)-V(V_j,V_i)\\ = &\ -\frac{\partial}{\partial t}g(V_j,V_i)\\ = &\ -\frac{\partial}{\partial t}g(D_x\Psi_t V_j, D_x\Psi_t V_i) \end{align*} The last two lines indicate that the flow $\Psi_t$ of $V$ is an isometry of $(M,g)$ for each $t$.
I always feel the minus sign looks suspicious…
Any correction is appreciated.
This is an excellent question. As indicated by the MathOverflow link in the comments, there are many ways to think about torsion and torsion-freeness. At the risk of being repetitive, allow me to summarize some of these, adding my own thoughts.
Throughout, we let $M$ be a smooth manifold, $\nabla$ a connection on $TM$, and $$T^\nabla(X,Y) = \nabla_XY - \nabla_YX - [X,Y]$$ its torsion tensor field. We let $X$, $Y$ denote vector fields.
Initial Observations
(1) Parallel coordinates
Torsion (at a point) can be seen as the obstruction to the existence of parallel coordinates (at that point):
Fact: Let $p \in M$. Then $T^\nabla|_p = 0$ if and only if there exists a coordinate system $(x^i)$ centered at $p$ such that $\nabla \partial_i |_p = 0$.
The point here is that if $T^\nabla = 0$, then any parallel frame is commuting (i.e.: $\nabla E_i = 0$ $\forall i$ $\implies$ $[E_i, E_j] = 0$ $\forall i,j$), hence is a coordinate frame (by the "Flowbox Coordinate Theorem").
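To spell out the step in parentheses (a short check that uses only the definition of $T^\nabla$ given above): if the frame $(E_i)$ is parallel, so that $\nabla E_i = 0$ for all $i$, then $$[E_i,E_j] = \nabla_{E_i}E_j - \nabla_{E_j}E_i - T^\nabla(E_i,E_j) = -\,T^\nabla(E_i,E_j),$$ and the right-hand side vanishes as soon as $T^\nabla = 0$.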
(2) Commuting of second partials
The following two facts indicate that torsion can be thought of as the obstruction to (certain types of) second partial derivatives commuting.
For a smooth function $f \colon M \to \mathbb{R}$, recall that its covariant Hessian (or second covariant derivative) is the covariant $2$-tensor field defined by $$\text{Hess}(f) := \nabla \nabla f = \nabla df.$$ Explicitly, $\text{Hess}(f)(X,Y) = (\nabla_X df)(Y) = X(Yf) - (\nabla_XY)(f)$.
Fact [Lee]: The following are equivalent:
(i) $T^\nabla = 0$
(ii) The Christoffel symbols of $\nabla$ with respect to any coordinate system are symmetric: $$\Gamma^k_{ij} = \Gamma^k_{ji}$$
(iii) The covariant Hessian of any smooth function $f$ is symmetric: $$\text{Hess}(f)(X,Y) = \text{Hess}(f)(Y,X)$$
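The equivalence of (i) and (iii) comes down to a short computation. As a sketch, using the explicit formula for $\text{Hess}(f)$ above: \begin{align*} \text{Hess}(f)(X,Y) - \text{Hess}(f)(Y,X) &= X(Yf) - Y(Xf) - (\nabla_XY - \nabla_YX)(f)\\ &= [X,Y](f) - \big(T^\nabla(X,Y) + [X,Y]\big)(f)\\ &= -\,T^\nabla(X,Y)(f). \end{align*} So the Hessian of every $f$ is symmetric exactly when the vector field $T^\nabla(X,Y)$ annihilates every smooth function, that is, when $T^\nabla = 0$. For (ii), taking $X = \partial_i$ and $Y = \partial_j$ in the definition of the torsion gives $T^\nabla(\partial_i,\partial_j) = (\Gamma^k_{ij} - \Gamma^k_{ji})\,\partial_k$; here the index convention $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k$ is an assumption on my part.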
Torsion-freeness also implies another kind of symmetry of second partials:
Symmetry Lemma [Lee]: If $T^\nabla = 0$, then for every smooth family of curves $\Gamma \colon (-\epsilon, \epsilon) \times [a,b] \to M$, we have $$\frac{D}{ds} \frac{d}{dt} \Gamma(s,t) = \frac{D}{dt} \frac{d}{ds} \Gamma(s,t).$$
I don't know for certain whether the converse to the Symmetry Lemma is true, but I imagine it is.
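A coordinate computation suggests that the converse does hold. This is only a sketch, and the notation is an assumption: write $\Gamma(s,t) = (x^1(s,t),\dots,x^n(s,t))$ in local coordinates and use the convention $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k$ for the Christoffel symbols. Then \begin{align*} \frac{D}{ds}\frac{\partial \Gamma}{\partial t} &= \left(\frac{\partial^2 x^k}{\partial s\,\partial t} + \Gamma^k_{ij}\,\frac{\partial x^i}{\partial s}\frac{\partial x^j}{\partial t}\right)\partial_k,\\ \frac{D}{dt}\frac{\partial \Gamma}{\partial s} &= \left(\frac{\partial^2 x^k}{\partial t\,\partial s} + \Gamma^k_{ji}\,\frac{\partial x^i}{\partial s}\frac{\partial x^j}{\partial t}\right)\partial_k, \end{align*} so that $$\frac{D}{ds}\frac{\partial \Gamma}{\partial t} - \frac{D}{dt}\frac{\partial \Gamma}{\partial s} = \big(\Gamma^k_{ij} - \Gamma^k_{ji}\big)\frac{\partial x^i}{\partial s}\frac{\partial x^j}{\partial t}\,\partial_k = T^\nabla\!\left(\frac{\partial \Gamma}{\partial s},\frac{\partial \Gamma}{\partial t}\right).$$ Any pair $v,w \in T_pM$ can be realized as $\partial_s\Gamma$, $\partial_t\Gamma$ at a point of a suitable family (for instance $x(s,t) = sv + tw$ in coordinates centered at $p$), so if the two sides of the Symmetry Lemma agree for every family, then $T^\nabla(v,w) = 0$ for all $v,w$.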
Some Heuristic Interpretations
(i) "Twisting" of parallel vector fields along geodesics
Suppose we have a connection $\nabla$ on $\mathbb{R}^n$ whose geodesics are lines, but that has torsion. One could then imagine that parallel translating a vector along a line results in the vector "spinning" along the line, as if one were holding each end of a string and rolling it between one's fingers.
An explicit example of such a connection is in the MathOverflow answer linked in the comments.
The justification for why this interpretation should be believed in general will be discussed below in (B).
On the MO thread, Igor Belegradek points out two related facts:
Fact [Spivak]:
(1) Two connections $\nabla^1$, $\nabla^2$ on $TM$ are equal if and only if they have the same geodesics and torsion tensors.
(2) For every connection on $TM$, there is a unique torsion-free connection with the same geodesics.
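For (2), the torsion-free connection can be written down explicitly. The following is a sketch, and the symbol $\widetilde{\nabla}$ is my own notation rather than Spivak's: set $$\widetilde{\nabla}_XY := \nabla_XY - \tfrac{1}{2}\,T^\nabla(X,Y).$$ Since $T^\nabla$ is a tensor field, $\widetilde{\nabla}$ is again a connection. Its torsion is \begin{align*} T^{\widetilde{\nabla}}(X,Y) &= \nabla_XY - \nabla_YX - [X,Y] - \tfrac{1}{2}\,T^\nabla(X,Y) + \tfrac{1}{2}\,T^\nabla(Y,X)\\ &= T^\nabla(X,Y) - T^\nabla(X,Y) = 0, \end{align*} and along any curve $\gamma$ we have $\widetilde{\nabla}_{\dot\gamma}\dot\gamma = \nabla_{\dot\gamma}\dot\gamma - \tfrac{1}{2}\,T^\nabla(\dot\gamma,\dot\gamma) = \nabla_{\dot\gamma}\dot\gamma$ by skew-symmetry of $T^\nabla$, so the geodesics are unchanged. Uniqueness then follows from (1).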
(ii) Closing of geodesic parallelograms (to second order)
Let $v, w \in T_pM$ be tangent vectors. Let $\gamma_v$ and $\gamma_w$ be the geodesics whose initial tangent vectors are $v$, $w$, respectively. Consider parallel translating the vector $w$ along $\gamma_v$, and also the vector $v$ along $\gamma_w$. Then the tips of the resulting two vectors agree to second order if and only if $T^\nabla|_p = 0$.
Heuristic reasons for this (and a picture!) are given in this excellent answer by Sepideh Bakhoda.
A precise proof of this fact is outlined by Robert Bryant at the end of this MO answer of his.
More Reasons We Like $T^\nabla = 0$
(A) Submanifolds of $\mathbb{R}^N$ come with torsion-free connections
Suppose $(M,g)$ is isometrically immersed into $\mathbb{R}^N$.
As hinted in the comments, the Euclidean connection $\overline{\nabla}$ on $\mathbb{R}^N$ is torsion-free. It is a fact that the tangential component of $\overline{\nabla} = \nabla^\top + \nabla^\perp$ defines an induced connection on $M \subset \mathbb{R}^N$. This induced connection on $M$ will then also be torsion-free (and compatible with the induced metric).
Point: If $(M,g) \subset \mathbb{R}^N$ is an isometrically immersed submanifold, then its induced connection is torsion-free.
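The torsion-freeness can be checked in one line. As a sketch, let $X$, $Y$ be vector fields tangent to $M$, extended arbitrarily to a neighborhood in $\mathbb{R}^N$, and write the induced connection as $\nabla^\top_XY = (\overline{\nabla}_XY)^\top$. Then $$\nabla^\top_XY - \nabla^\top_YX = \big(\overline{\nabla}_XY - \overline{\nabla}_YX\big)^\top = [X,Y]^\top = [X,Y],$$ where the middle equality uses that $\overline{\nabla}$ is torsion-free and the last uses that the bracket of two vector fields tangent to $M$ is again tangent to $M$.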
This example is more general than it seems: by the Nash Embedding Theorem, every Riemannian manifold $(M,g)$ can be isometrically embedded in some $\mathbb{R}^N$.
(B) $T = d^\nabla(\text{Id})$
[I'll add this another time.]
(C) Simplification of identities
Finally, I should mention that $T^\nabla = 0$ greatly simplifies many identities.
First, we have the Ricci Formula $$\nabla^2_{X,Y}Z - \nabla^2_{Y,X}Z = R(X,Y)Z - \nabla_{T^\nabla(X,Y)}Z.$$ Thus, in the case where $T^\nabla = 0$, we can interpret the curvature $R(X,Y)$ as the obstruction to commuting second covariant derivatives of vector fields.
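For completeness, here is how the Ricci Formula comes out, assuming the usual definition $\nabla^2_{X,Y}Z := \nabla_X\nabla_YZ - \nabla_{\nabla_XY}Z$ of the second covariant derivative and the convention $R(X,Y)Z = \nabla_X\nabla_YZ - \nabla_Y\nabla_XZ - \nabla_{[X,Y]}Z$: \begin{align*} \nabla^2_{X,Y}Z - \nabla^2_{Y,X}Z &= \nabla_X\nabla_YZ - \nabla_Y\nabla_XZ - \nabla_{\nabla_XY - \nabla_YX}Z\\ &= \nabla_X\nabla_YZ - \nabla_Y\nabla_XZ - \nabla_{[X,Y]}Z - \nabla_{T^\nabla(X,Y)}Z\\ &= R(X,Y)Z - \nabla_{T^\nabla(X,Y)}Z, \end{align*} where the second line substitutes $\nabla_XY - \nabla_YX = T^\nabla(X,Y) + [X,Y]$.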
In the presence of torsion, the First and Second Bianchi Identities read, respectively, $$\mathfrak{S}(R(X,Y)Z) = \mathfrak{S}[ T(T(X,Y),Z) + (\nabla_XT)(Y,Z)],$$ $$\mathfrak{S}[(\nabla_XR)(Y,Z) + R(T(X,Y),Z)] = 0,$$ where $\mathfrak{S}$ denotes the cyclic sum over $X,Y,Z$.
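Setting $T = 0$ in these two formulas makes every torsion term drop out, and they reduce to the familiar forms \begin{align*} R(X,Y)Z + R(Y,Z)X + R(Z,X)Y &= 0,\\ (\nabla_XR)(Y,Z) + (\nabla_YR)(Z,X) + (\nabla_ZR)(X,Y) &= 0, \end{align*} where the cyclic sums have simply been written out term by term.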
References
[Lee] "Riemannian Manifolds: An Introduction to Curvature"
[Spivak] "A Comprehensive Introduction to Differential Geometry: Volume II"
Best Answer
Using the notation $\nabla$ for the second definition is something I absolutely abhor, because they’re really talking about the exterior covariant derivative.
Let $(E,\pi,M)$ be a vector bundle equipped with a linear connection $\nabla$. So, by definition $\nabla$ eats a section $\psi$ of $E$ and produces $\nabla_{(\cdot)}\psi$, which has an open slot where it can eat a tangent vector $h_x\in T_xM$ to output a vector $\nabla_{h_x}\psi\in E_x$; so there is a $1$-form aspect because it takes a tangent vector $h_x$ as input, and there’s a bundle-valued aspect. So, $\nabla$ is a certain map $\Omega^0(M;E)\to\Omega^1(M;E)$. We can generalize this to give a bunch of operators $d_{\nabla}$ which map, for each $k\geq 0$, $\Omega^k(M;E)\to \Omega^{k+1}(M;E)$, i.e $E$-valued $k$-forms on $M$ to $E$-valued $(k+1)$-forms on $M$. The operators $d_{\nabla}$ are called the exterior covariant derivatives (on $E$, relative to $\nabla$). The idea behind this is similar to how from the usual differential of a function, we get exterior derivatives of forms of all orders.
You don’t need to know the specifics of this yet; all you need to know is that it obeys a similar type of product rule to the usual exterior derivative. As a result, something very cool happens at the level of second exterior covariant derivatives. Let $\psi\in\Omega^0(M;E)$ be any smooth section, and let $f\in C^{\infty}(M)$ be any smooth function. Then, by the product rule \begin{align} d_{\nabla}(f\psi)&=df\wedge \psi+ (-1)^0f\cdot d_{\nabla}\psi= df\wedge \psi+f\cdot d_{\nabla}\psi \end{align} (actually since we’re at $k=0$, we can write this equality as $\nabla(f\psi)=df\cdot \psi+ f\cdot\nabla\psi$… or for the first term people often write $df\otimes \psi$ as well). But now let us differentiate again; this time we really need to use $d_{\nabla}$, we cannot use $\nabla$ anymore: \begin{align} d_{\nabla}^2(f\psi)&=d_{\nabla}(d_{\nabla}(f\psi))\\ &=d_{\nabla}(df\wedge \psi+ f\cdot d_{\nabla}\psi)\\ &=[d^2f\wedge \psi + (-1)^1df\wedge d_{\nabla}\psi]+ [df\wedge d_{\nabla}\psi+ (-1)^0f\cdot d_{\nabla}^2\psi]\\ &=f\cdot d_{\nabla}^2\psi, \end{align} where we have used that the first term involving $d^2f$ is $0$ because $d^2=0$ for usual forms. The second and third terms cancel out, and so only the last term remains. Therefore, we have shown that the operator $d_{\nabla}^2:\Omega^0(M;E)\to\Omega^2(M;E)$ is $C^{\infty}(M)$-linear, i.e the value of $d_{\nabla}^2\psi$ at a point $x$ depends only on the value of $\psi$ exactly at $x$, rather than on its values in a neighbourhood of $x$. Thus, by the “tensor characterization lemma” this is equivalent to a bundle morphism $R:\bigwedge^2(TM)\to \text{End}(E)$, or equivalently, to an $\text{End}(E)$-valued $2$-form on $M$. So, more explicitly, for any section $\psi\in \Omega^0(M;E)$, any $x\in M$, and any $h_x,k_x\in T_xM$, we have that $(d_{\nabla}^2\psi)_x(h_x\wedge k_x)\in E_x$ is equal to the value of $R(h_x\wedge k_x)\in \text{End}(E_x)$ on $\psi(x)\in E_x$: \begin{align} (d_{\nabla}^2\psi)_x(h_x\wedge k_x)&=R(h_x\wedge k_x)[\psi(x)]\in E_x.
\end{align} So, this is the more detailed explanation of the equation $R=d_{\nabla}^2$.
Relating to the first formula.
Above, I simply said quickly that “we can generalize this to give a bunch of operators $d_{\nabla}$”, but I didn’t tell you how exactly. Well, one can define $d_{\nabla}$ to be the unique collection of operators which are linear, such that the correct product rule holds, and such that $d_{\nabla}\psi=\nabla\psi$ for all sections $\psi$ of $E$ (i.e for $E$-valued $0$-forms on $M$). This would be analogous to how almost everyone defines the usual exterior derivatives (it’s the unique operator such that it is linear, and $df$ is defined as usual for functions $f$, and such that $d^2=0$, and such that a suitable product rule holds).
Alternatively, one can equivalently define $d_{\nabla}$ by an explicit formula analogous to the Palais formula for usual differential forms; we simply replace the Lie derivative in the first summation by the covariant derivative $\nabla$. So, super explicitly, for each $\omega\in \Omega^k(M;E)$, we define $d_{\nabla}\omega$ by declaring it eats $(k+1)$ many vector fields $X_0,\dots, X_k$ on $M$, such that its value on these vector fields is \begin{align} (d_{\nabla}\omega)(X_0,\cdots, X_k)&:=\sum_{j=0}^k(-1)^j\nabla_{X_j}\bigg(\omega(X_0\wedge\cdots\wedge \widehat{X_j}\wedge\cdots\wedge X_k)\bigg)\\ &+\sum_{0\leq i<j\leq k}(-1)^{i+j}\omega([X_i,X_j]\wedge X_0\wedge\cdots\wedge\widehat{X_i}\wedge\cdots\wedge\widehat{X_j}\wedge\cdots\wedge X_k).\tag{$*$} \end{align} One has to prove a bunch of things about this formula, the first of which is that the RHS is alternating in the $X_j$’s, and that it is $C^{\infty}(M)$-linear with respect to each of the $X_j$’s. Once this check is made, we can be sure that the object $d_{\nabla}\omega$ really does lie in $\Omega^{k+1}(M;E)$, and so we have a well-defined map $d_{\nabla}:\Omega^k(M;E)\to \Omega^{k+1}(M;E)$.
To show the equivalence of these two approaches, it suffices to show that this explicit formula is also linear in $\omega$, and that it satisfies a correct product rule (and obviously when $k=0$, the RHS only has one term, namely $\nabla_{X_0}\omega$, which means $d_{\nabla}=\nabla$ for $k=0$). Checking these details is a little annoying to do, and I’m not going to do it here, but I’m sure they’re done in detail for the case of the usual exterior derivative $d$ in Spivak’s A Comprehensive Introduction to Differential Geometry, Vol I. For this vector-bundle case, you mimic the same algebraic steps, by appropriately replacing all instances of the Lie derivative $L_{X_j}\left(\omega(X_0\wedge\cdots\wedge \widehat{X_j}\wedge\cdots\wedge X_k)\right)\equiv X_j\left( \omega(X_0\wedge\cdots\wedge \widehat{X_j}\wedge\cdots\wedge X_k)\right)$ (i.e the vector field $X_j$ acting on a function) with the covariant derivative $\nabla_{X_j}$ acting on a section.
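So, with the equivalence of these approaches in mind, we can proceed easily. Specializing $(*)$ to the two lowest degrees gives:
- for $k=0$ and $\psi\in\Omega^0(M;E)$: $(d_{\nabla}\psi)(X)=\nabla_X\psi$;
- for $k=1$ and $\omega\in\Omega^1(M;E)$: $(d_{\nabla}\omega)(X\wedge Y)=\nabla_X\big(\omega(Y)\big)-\nabla_Y\big(\omega(X)\big)-\omega([X,Y])$.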
So, if we want to calculate $d_{\nabla}^2\psi$ for $\psi\in \Omega^0(M;E)$, then applying these two formulas immediately gives us \begin{align} (d_{\nabla}^2\psi)(X\wedge Y)&=\nabla_X((d_{\nabla}\psi)(Y))-\nabla_Y((d_{\nabla}\psi)(X))-(d_{\nabla}\psi)([X,Y])\\ &=\nabla_X\nabla_Y\psi-\nabla_Y\nabla_X\psi-\nabla_{[X,Y]}\psi. \end{align} Since the LHS is by our second definition equal to $R(X\wedge Y)\cdot \psi$, we have thus proven the equivalence of the first and second definitions.
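As a sanity check, here is the simplest example I can think of; the setup is an illustrative assumption, not part of the question. Take the trivial line bundle $E=M\times\mathbb{R}$, so that sections are just functions $\psi$, and let $A$ be any $1$-form on $M$. Then $\nabla_X\psi := X(\psi)+A(X)\,\psi$ defines a connection, and \begin{align} \nabla_X\nabla_Y\psi&=X(Y\psi)+X\big(A(Y)\big)\psi+\Big[A(Y)\,X(\psi)+A(X)\,Y(\psi)+A(X)A(Y)\,\psi\Big]. \end{align} The bracketed terms are symmetric in $X$ and $Y$, so they cancel upon antisymmetrizing, and we get \begin{align} \nabla_X\nabla_Y\psi-\nabla_Y\nabla_X\psi-\nabla_{[X,Y]}\psi&=[X,Y](\psi)+\Big(X\big(A(Y)\big)-Y\big(A(X)\big)\Big)\psi-[X,Y](\psi)-A([X,Y])\,\psi\\ &=\Big(X\big(A(Y)\big)-Y\big(A(X)\big)-A([X,Y])\Big)\psi\\ &=dA(X,Y)\cdot\psi. \end{align} So in this case the curvature is multiplication by the ordinary $2$-form $dA$. No derivatives of $\psi$ survive, which is the $C^{\infty}(M)$-linearity from before made visible.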
Extra Remarks.
Now, specialize to a connection $\nabla$ in the tangent bundle $TM$. Here, we have a very special map, namely the identity map $I:TM\to TM$, which can be viewed as a $TM$-valued $1$-form on $M$. As such, we can compute its exterior covariant derivative $d_{\nabla}I$. This will be a $TM$-valued $2$-form on $M$. Its value on a pair of vector fields $X,Y$ on $M$ is (use the second bullet point formula for $k=1$ from above): \begin{align} (d_{\nabla}I)(X\wedge Y)&=\nabla_X\bigg(I(Y)\bigg)-\nabla_Y\bigg(I(X)\bigg)-I([X,Y])\\ &=\nabla_XY-\nabla_YX-[X,Y]. \end{align} You may recognize this as precisely being the value of the torsion $T(X\wedge Y)$. So, this proves that the torsion of a connection in the tangent bundle is the exterior-covariant derivative of the identity morphism: $T=d_{\nabla}I$, and it is thus a $TM$-valued $2$-form on $M$ (in components, $T^{a}_{\,bc}$, this is the reason for the skew-symmetry in the bottom two indices). Or of course, we can reverse perspective, and define the torsion to be $T:=d_{\nabla}I$ (which is actually what I prefer), and then deduce the above formula for its values on a pair of vector fields.
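To see concretely why this expression really is a $TM$-valued $2$-form (i.e $C^{\infty}(M)$-linear in each slot, as was claimed for the general formula $(*)$), one can check it by hand. A sketch of the computation for the first slot, with $f\in C^{\infty}(M)$: \begin{align} (d_{\nabla}I)(fX\wedge Y)&=\nabla_{fX}Y-\nabla_Y(fX)-[fX,Y]\\ &=f\nabla_XY-\big(Y(f)X+f\nabla_YX\big)-\big(f[X,Y]-Y(f)X\big)\\ &=f\big(\nabla_XY-\nabla_YX-[X,Y]\big). \end{align} The two $Y(f)X$ terms cancel, and the second slot then follows from skew-symmetry.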
Here are some related MSE answers of mine you may find helpful if you’re going down the rabbit hole of bundle-valued forms
The first link contains a summary of all the essential things one needs to know about bundle-valued forms (the proofs are almost line-by-line adaptations of those in the usual scalar-valued case).