# [Physics] Intuition on the covariant derivative

differential-geometry differentiation

I'm having some trouble understanding the covariant derivative as a directional derivative for tensors. The way the covariant derivative was presented to me was by first showing that a vector field can provide a directional derivative for smooth functions on a manifold.

We then asked how we could get a directional derivative for higher-rank tensors. We started by listing a number of properties we wanted our new derivative operator to have. Since we want a directional derivative, we need a vector to provide a direction and a tensor to differentiate, so the covariant derivative would be a map that takes a vector and a tensor of type $(s,k)$ to another tensor of type $(s,k)$.

Since this is a derivative, we wanted linearity, and we also wanted it to obey the Leibniz rule. We then talked about how these rules still leave some freedom in how we pick the derivative, which we later showed can be fixed by a choice of connection coefficients. Finally, we applied the covariant derivative to a vector in some particular chart and got the component expression for the covariant derivative of a vector.

I could follow all of the algebraic manipulations up to this point; however, none of this jibes with my understanding of a derivative as something that measures how something else changes.

We've additionally talked about parallelity and the autoparallel equation $\nabla(V,V) = 0$ (where $\nabla$ is the covariant derivative and $V$ is a vector) but none of this has really helped me understand the covariant derivative as a directional derivative. Could anyone help me out?
Thanks.

Let $M$ be a smooth manifold. Then we can construct a vector space $(C^\infty(M),+,\cdot)$, where $C^\infty(M)$ is the set of all infinitely differentiable (i.e. smooth) maps $f:M\to\mathbb R$, $+$ is pointwise addition and $\cdot$ is pointwise scalar multiplication (s-multiplication). Now let $\gamma:\mathbb R\to M$ be a smooth curve through the point $p\in M$. W.l.o.g. we can take $p=\gamma(0)$, and we define the directional derivative operator at the point $p$ along the curve $\gamma$ as the linear map $$X_{\gamma,p}:C^\infty(M)\to\mathbb R$$ $$f\mapsto(f\circ\gamma)'(0)$$ where $f\circ\gamma:\mathbb R\to\mathbb R$ and $(f\circ\gamma)'(0)\in\mathbb R$.
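As a quick sanity check of this definition, here is a worked example under assumptions that are not part of the setup above: take $M=\mathbb R^2$ with its standard coordinates $(u,v)$, the function $f(u,v)=u^2v$ and the curve $\gamma(\lambda)=(1+\lambda,\,2+3\lambda)$, so that $p=\gamma(0)=(1,2)$. Then $$(f\circ\gamma)(\lambda)=(1+\lambda)^2(2+3\lambda)$$ $$X_{\gamma,p}(f)=(f\circ\gamma)'(0)=\big[2(1+\lambda)(2+3\lambda)+3(1+\lambda)^2\big]_{\lambda=0}=4+3=7$$ The (equally hypothetical) curve $\tilde\gamma(\lambda)=(1+\lambda+\lambda^2,\,2+3\lambda)$ passes through the same $p$ with the same velocity and also gives $X_{\tilde\gamma,p}(f)=7$, which illustrates that the operator depends on the curve only to first order at $p$.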
We further define the set $$T_pM:=\{X_{\gamma,p}\;\mid\;\gamma\;\text{a smooth curve passing through}\; p\in M\}$$ and equip it with the closed operations $\oplus,\odot$: $$\oplus:T_pM\times T_pM\to T_pM$$ $$\odot:\mathbb R\times T_pM\to T_pM$$ which stand for vector addition and s-multiplication on $T_pM$. We do not need the explicit definitions of these two maps for now. This set, together with these two operations, constitutes the tangent vector space, i.e. the space whose elements are the directional derivative operators.
As the elements $X\in T_pM$ are operators (in the above framework), we can understand their behaviour by letting them act on a smooth function $f:M\to\mathbb R$. So let us choose a chart $(U,x)$ with $p\in U$ from the smooth atlas $\mathscr A_M$ on $M$. Let us also consider $\dim(M)$-many curves $\gamma_a$, s.t. $$\gamma_a:\mathrm{preim}_{\gamma_a}(U)\to U$$ with $(x^b\circ\gamma_a)(\lambda)=\delta^b_a\lambda$ for $a=1,\cdots,\dim(M)$. Here $U$ is an open subset of $M$, $\lambda$ is the curve parameter, and $x^b$ is the $b$-th component of $x:U\to x(U)\subseteq\mathbb R^d$ (with $d=\dim(M)$), which is smooth because the atlas is smooth. The index $a$ just labels the curves. For convenience we choose a chart s.t. $x(p)=0_{\mathbb R^d}$, so that every $\gamma_a$ passes through $p$ at $\lambda=0$. As you see, we made very specific choices. We now act on a smooth $f$ with our operator: \begin{align*} X_{\gamma_a,p}(f)&:=(f\circ\gamma_a)'(0)\\ &=(f\circ x^{-1}\circ x\circ \gamma_a)'(0)\\ &=\partial_b(f\circ x^{-1})(x(\gamma_a(0)))\,(x^b\circ\gamma_a)'(0)\\ &=\partial_b(f\circ x^{-1})(x(p))\,\delta^b_a\\ &=\partial_a(f\circ x^{-1})(x(p))=:\left(\dfrac{\partial}{\partial x^a}\right)_p(f) \end{align*} which means that $$\left(\dfrac{\partial}{\partial x^a}\right)_p:=X_{\gamma_a,p}$$ To go from the second line to the third, we used the multivariable chain rule (funny looking that way), with summation over the repeated index $b$ implied and $(\partial_b)_{x(p)}:C^\infty(\mathbb R^d)\to\mathbb R$ the ordinary $b$-th partial derivative at $x(p)$. The key fact here is that in order to express a tangent vector as a partial derivative we need a chart, i.e. we need coordinates, so that the partial derivative has meaning as a derivative with respect to a component of the coordinate map. On a manifold without a chart, we cannot directly have such a thing. One also proves that this definition satisfies the Leibniz rule and that $$\left(\dfrac{\partial}{\partial x^a}\right)_p\qquad,a=1,\cdots,\dim(M)$$ form a linearly independent generating system of the tangent vector space $T_pM$, a.k.a. a basis.
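To see the chart dependence concretely, here is a sketch whose choices are assumptions of this example, not of the derivation above: $M=\mathbb R^2$ with Cartesian coordinates $(u,v)$, the polar chart $x=(r,\varphi)$ on $U=\mathbb R^2\setminus\{(u,0)\mid u\le 0\}$ with $x^{-1}(r,\varphi)=(r\cos\varphi,\,r\sin\varphi)$, and $f(u,v)=uv$. Since $x(p)=(r_0,\varphi_0)\neq 0$ here, the coordinate curves are shifted so that they start at $x(p)$: $(x\circ\gamma_r)(\lambda)=(r_0+\lambda,\,\varphi_0)$ and $(x\circ\gamma_\varphi)(\lambda)=(r_0,\,\varphi_0+\lambda)$. Then $$(f\circ x^{-1})(r,\varphi)=r^2\cos\varphi\sin\varphi=\tfrac12 r^2\sin 2\varphi$$ \begin{align*} X_{\gamma_r,p}(f)&=\dfrac{d}{d\lambda}\Big[\tfrac12(r_0+\lambda)^2\sin 2\varphi_0\Big]_{\lambda=0}=r_0\sin 2\varphi_0=\partial_r(f\circ x^{-1})(x(p))\\ X_{\gamma_\varphi,p}(f)&=\dfrac{d}{d\lambda}\Big[\tfrac12 r_0^2\sin(2\varphi_0+2\lambda)\Big]_{\lambda=0}=r_0^2\cos 2\varphi_0=\partial_\varphi(f\circ x^{-1})(x(p)) \end{align*} In the Cartesian chart the same construction would instead give $\partial_u f=v$ and $\partial_v f=u$ at $p$, so the basis of $T_pM$ obtained this way depends on the chart, even though $T_pM$ itself does not.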
Extending this derivative to act on vector fields requires an understanding of the tangent bundle $TM$, vector fields (as sections of $TM$), modules, rings, etc.