Using $g^{ab}\nabla_a\nabla_b\phi$ show that $\nabla^2 \phi=\left(\partial_r^2+\frac1r\partial_r+\frac{1}{r^2}\partial_\theta^2\right)\phi$

Tags: calculus, general-relativity, laplacian, partial-derivative, solution-verification

I am having a hard time understanding the proof to the following question:

In a two dimensional space with the usual metric $\mathrm{d}s^2 = \mathrm{d}x^2 +\mathrm{d}y^2$ the Laplacian can be written as
$$\nabla^2\phi=g^{ab}\partial_a\partial_b\phi\tag{a}$$
As the components of the metric are constants $\Gamma_{bc}^a=0$. Therefore the Laplacian can be written as $$\nabla^2\phi=g^{ab}\nabla_a\nabla_b\phi\tag{b}$$ which is invariant under any change of coordinates. Use this to show that in polar coordinates $$\nabla^2 \phi=\left(\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}+\frac{1}{r^2}\frac{\partial^2}{\partial \theta^2}\right)\phi$$

Before I typeset the author's solution I need to give some information needed for this question:
The metric for 2d polar coordinates is $\mathrm{d}s^2=\mathrm{d}r^2 +r^2\mathrm{d}\theta^2$ with inverse metric components that will be needed later:
$g^{rr}=1,\,$and $g^{\theta\theta}=\frac{1}{r^2}$.
The $3$ non-zero Christoffel symbols are $\Gamma^{r}_{\theta\theta}=-r,\,$ and $\Gamma^{\theta}_{r\theta}=\Gamma^{\theta}_{\theta r}=\frac1r$. Finally, for a covariant vector field,
$$\nabla_c v_b=\partial_c v_b-\Gamma_{bc}^{a}v_a\tag{1}$$

$$\begin{align} g^{ab}\nabla_a \nabla_b\phi&\stackrel{[1]}{=}g^{ab}\color{red}{\nabla_a}\partial_b\phi\\&\stackrel{[2]}{=}g^{ab}\left(\partial_a\partial_b-\Gamma_{ba}^{d}\partial_d\right)\phi\\&\stackrel{[3]}{=}g^{rr}\left({\partial_r}^2-\Gamma_{rr}^{d}\partial_d\right)\phi+g^{\theta\theta}\left({\partial_{\theta}}^2-\Gamma_{\theta\theta}^{d}\partial_d\right)\phi\\&\stackrel{[4]}{=}g^{rr}\left({\partial_r}^2-0\right)\phi+g^{\theta\theta}\left({\partial_{\theta}}^2-\Gamma_{\theta\theta}^{r}\partial_r\right)\phi\\&\stackrel{[5]}{=}{\partial_r}^2\phi+\frac{1}{r^2}\left({\partial_\theta}^2+r\partial_r\right)\phi\\&=\left(\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}+\frac{1}{r^2}\frac{\partial^2}{\partial \theta^2}\right)\phi\end{align}$$

I have numbered each equality in the proof above and will ask questions and make comments to ensure I have understood each step:

[1] How and why did the covariant derivative $\nabla_b$ on the LHS get changed to a partial derivative on the RHS? Why is the same not true for $\nabla_a$, marked in red? After all, eqns $(\mathrm{a})$ and $(\mathrm{b})$ say that these two expressions are equal, so why change just the $\nabla_b$?

[2] In this case the expression $(1)$ for the covariant derivative seems to be being used for $\nabla_a$, but why did this not take place in the first line ($[1]$)?

[3] In this step it seems that we are setting $a=b=r$ for the first bracket, then $a=b=\theta$ for the second bracket.

[4] I also think I understand the logic in this equality: the Christoffel symbol in the first bracket, $\Gamma_{rr}^{d}$, is zero since it is not one of the three (out of eight) Christoffel symbols mentioned above in this post that are non-zero. In the second bracket, to get a non-zero contribution from the Christoffel symbol we set $d=r$.

[5] This stage I think I understand: the values $g^{rr}=1$ and $g^{\theta\theta}=\frac{1}{r^2}$ have been substituted for the inverse metric components, and $\Gamma^{r}_{\theta\theta}=-r$ has also been substituted in.


I think I have understood almost all the steps and it is just steps 1 and 2 that are causing problems. I think that this is an important proof to understand properly which is why I'm asking about it.

Here are screenshots of the question and solution, just in case I have made any typos:

[Image: Tensor calculus question]

[Image: Tensor calculus solution]

Recently found resource on the covariant derivative:

[Image: Covariant derivative part 1]

[Image: Covariant derivative part 2]

In the image above I have highlighted the expression that this question is solely about: I am trying to understand why $\nabla_b=\partial_b$. For a scalar field $\phi$, $\nabla_b \phi =\partial_b \phi$, but in $(\mathrm{a})$ $\partial_b \phi$ is still a scalar.

So let $\psi=\partial_b \phi$, then by my logic $\nabla_a\psi=\partial_a \psi$, so no reason why $\nabla_a$ cannot be written as $\partial_a$ also. This is the essence of eqns $(\mathrm{a})$ and $(\mathrm{b})$. Yet in the very first equality in the proof, $[1]$, only the $\nabla_b \to \partial_b$, and I just don't understand why.

Best Answer

Parts [1] and [2]

The essence of why this is the case comes down to the fact that polar coordinates form a set of orthogonal coordinates away from the origin. They are closely related to what are called "Geodesic Coordinates" or "Normal Coordinates," which tend to be incredibly useful. This is because, at the origin of these coordinates, the metric looks like the Euclidean Metric.

The central part of your question boils down to how the connection given by $\nabla$ in the tangent bundle, $TM$, acts on $(0,0)$ and $(1,0)$ tensors. Remember that the entire point behind defining a connection on a tangent bundle was to generalize the notion of directional derivatives, and so we will be keeping this in mind as we go. Without getting too much into the details of tensors, we just need to note a couple of things about the connection.

If we assume that our tangent bundle is given a smooth local frame $\{E_i\}$ (for example, in $\mathbb{R}^2$ the local frame is given by the vector fields $\{\partial_x,\partial_y\}$, and in polar coordinates by $\{\partial_r,\partial_\theta\}$), then, when acting on vector fields $X,Y \in T^{(1,0)}TM \cong TM$, the map is defined by

$$\nabla:T^{(1,0)}TM \times T^{(1,0)}TM \rightarrow T^{(1,0)}TM:(Y,X)\mapsto\nabla_X Y = (X(Y^k) + X^iY^j\Gamma^k_{ij})E_k$$
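As a small worked instance of this formula (a sketch, assuming the polar frame $\{\partial_r,\partial_\theta\}$ and the Christoffel symbols quoted in the question), take $X = Y = \partial_\theta$. The components $X^i = \delta^i_\theta$ and $Y^j = \delta^j_\theta$ are constant, so $X(Y^k) = 0$ and only the Christoffel term survives:

$$\nabla_{\partial_\theta}\partial_\theta = \Gamma^k_{\theta\theta}\,\partial_k = \Gamma^r_{\theta\theta}\,\partial_r + \Gamma^\theta_{\theta\theta}\,\partial_\theta = -r\,\partial_r$$

So even though the components of $\partial_\theta$ are constant, its covariant derivative is not zero, because the frame itself rotates as $\theta$ changes.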

However, this is the action on vector fields. When we want to act on functions (which are $(0,0)$-tensors), then we need to change things a little. Again, we want the connection to be our generalization of the directional derivative, so we must have that

$$\nabla:T^{(0,0)}TM \times T^{(1,0)}TM \rightarrow T^{(0,0)}TM:(f,X)\mapsto\nabla_X f = X(f)$$

And if we were to have that $X = E_i = \partial_i$, then we would have that $$\nabla_X f = \nabla_{\partial_i}f = \partial_i[f]$$

As a small note, in the above presentation I have not really made any reference to the metric that we are using on the space, but if you require that a general connection $\nabla$ be both symmetric and compatible with the metric $g$, then you will necessarily have the Levi-Civita connection.
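For the Levi-Civita connection, the Christoffel symbols follow from the metric through the standard formula $\Gamma^a_{bc} = \tfrac12 g^{ad}\left(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}\right)$. Here is a minimal Python sketch of that formula, assuming the polar metric from the question and approximating derivatives by central differences. The function names and the sample point are illustrative choices, not from any library.

```python
# Sketch: Christoffel symbols of the polar metric ds^2 = dr^2 + r^2 dtheta^2,
# from Gamma^a_{bc} = 1/2 g^{ad} (d_b g_{dc} + d_c g_{db} - d_d g_{bc}).
# Coordinates are indexed 0 = r, 1 = theta. All names here are illustrative.

def metric(x):
    """Covariant metric components g_{ab} at the point x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_metric(x):
    """Inverse metric g^{ab}; the metric is diagonal, so invert the diagonal."""
    g = metric(x)
    return [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]

def d_metric(x, c, h=1e-5):
    """Central-difference approximation of d g_{ab} / d x^c."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

def christoffel(x):
    """Return gamma[a][b][c] = Gamma^a_{bc} at the point x."""
    ginv = inverse_metric(x)
    dg = [d_metric(x, c) for c in range(2)]  # dg[c][a][b] = d_c g_{ab}
    return [[[0.5 * sum(ginv[a][d] * (dg[b][d][c] + dg[c][d][b] - dg[d][b][c])
                        for d in range(2))
              for c in range(2)] for b in range(2)] for a in range(2)]

R, TH = 0, 1
gamma = christoffel((2.0, 0.7))   # sample point r = 2, theta = 0.7
print(gamma[R][TH][TH])           # approximately -r = -2.0
print(gamma[TH][R][TH])           # approximately 1/r = 0.5
```

At the sample point this reproduces $\Gamma^r_{\theta\theta}=-r$ and $\Gamma^\theta_{r\theta}=\frac1r$ up to finite-difference error, and every other component comes out as zero.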

This gets to your central question. The two derivatives in $\nabla_a\nabla_b\phi = \nabla_a(\nabla_b\phi)$ act on different things. The innermost derivative acts on the function $\phi$, so it acts according to the directional derivative formula, meaning we may write $\nabla_b\phi = \partial_b\phi$. However, the result $\partial_b\phi$ is not a scalar: it carries a free lower index $b$, so it is the $b$-th component of a covector field (the gradient of $\phi$). The outer derivative therefore acts according to your formula $(1)$, giving $\nabla_a(\partial_b\phi) = \partial_a\partial_b\phi - \Gamma^d_{ba}\partial_d\phi$.
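To make this concrete, here is the $a=b=\theta$ component written out with formula $(1)$ from the question, using the Christoffel symbols quoted there ($\Gamma^r_{\theta\theta}=-r$, $\Gamma^\theta_{\theta\theta}=0$):

$$\nabla_\theta(\partial_\theta\phi) = \partial_\theta\partial_\theta\phi - \Gamma^{r}_{\theta\theta}\,\partial_r\phi - \Gamma^{\theta}_{\theta\theta}\,\partial_\theta\phi = \partial_\theta^2\phi + r\,\partial_r\phi$$

The extra $r\,\partial_r\phi$ term is exactly what becomes the $\frac1r\partial_r\phi$ term of the Laplacian after multiplying by $g^{\theta\theta}=\frac{1}{r^2}$. Treating $\partial_\theta\phi$ as a scalar would lose it.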

Part [3]

I think that you may have gotten a little confused with the notation here. We are using the Einstein summation convention here, so

$$\begin{align} g^{ab}(\partial_a\partial_b - \Gamma^d_{ab}\partial_d) &= g^{rr}\left(\partial_r\partial_r - \sum_d\Gamma^d_{rr}\partial_d\right)\\ &\qquad +g^{r\theta}\left(\partial_r\partial_\theta - \sum_d\Gamma^d_{r\theta}\partial_d\right)\\ &\qquad +g^{\theta r}(\partial_\theta\partial_r - \sum_d\Gamma^d_{\theta r}\partial_d)\\ &\qquad +g^{\theta\theta}\left(\partial_\theta\partial_\theta - \sum_d\Gamma^d_{\theta\theta}\partial_d\right) \end{align}$$

This is where the fact that we are using polar coordinates becomes important. The coordinate directions are orthogonal, so $g^{r\theta} = g^{\theta r} = 0$, and the formula reduces to

$$\begin{align} g^{ab}(\partial_a\partial_b - \Gamma^d_{ab}\partial_d) &= g^{rr}\left(\partial_r\partial_r - \sum_d\Gamma^d_{rr}\partial_d\right) +g^{\theta\theta}\left(\partial_\theta\partial_\theta - \sum_d\Gamma^d_{\theta\theta}\partial_d\right) \end{align}$$

Parts [4] and [5]

These look good!
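As a sanity check on the final result, the polar expression $g^{ab}\left(\partial_a\partial_b - \Gamma^d_{ab}\partial_d\right)\phi$ can be compared numerically with the Cartesian Laplacian. The sketch below assumes an arbitrary smooth test function and uses central differences. The function names, test function, sample point and step size are all illustrative choices.

```python
import math

# Sketch: compare g^{ab}(d_a d_b - Gamma^d_{ab} d_d) phi in polar coordinates
# with the Cartesian Laplacian for one test function.
# The test function and step size are arbitrary illustrative choices.

def phi_cart(x, y):
    return x**3 * y + math.sin(x) * math.exp(y)

def phi_polar(r, th):
    return phi_cart(r * math.cos(th), r * math.sin(th))

def d1(f, u, v, i, h=1e-4):
    """First partial derivative of f(u, v) in argument i (central difference)."""
    if i == 0:
        return (f(u + h, v) - f(u - h, v)) / (2 * h)
    return (f(u, v + h) - f(u, v - h)) / (2 * h)

def d2(f, u, v, i, h=1e-4):
    """Second partial derivative of f(u, v) in argument i (central difference)."""
    if i == 0:
        return (f(u + h, v) - 2 * f(u, v) + f(u - h, v)) / h**2
    return (f(u, v + h) - 2 * f(u, v) + f(u, v - h)) / h**2

def laplacian_polar(f, r, th):
    # g^{rr} = 1, g^{theta theta} = 1/r^2, Gamma^r_{theta theta} = -r
    rr_term = d2(f, r, th, 0)  # all Gamma^d_{rr} vanish
    thth_term = (d2(f, r, th, 1) - (-r) * d1(f, r, th, 0)) / r**2
    return rr_term + thth_term

def laplacian_cart(f, x, y):
    return d2(f, x, y, 0) + d2(f, x, y, 1)

r, th = 1.3, 0.4
x, y = r * math.cos(th), r * math.sin(th)
lhs = laplacian_polar(phi_polar, r, th)
rhs = laplacian_cart(phi_cart, x, y)
print(lhs, rhs)  # the two values should agree to within finite-difference error
```

For this particular test function the exact Laplacian is $6xy$, since $\sin(x)e^y$ is harmonic, so both numbers can also be checked against that closed form.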

Let me know if this helped!


Additional Section Added For Clarity

In the passage that you have pictured in your post (thank you for doing that by the way), they are specifically highlighting the fact that if you move from a connection defined in terms of a contravariant basis (e.g. a basis of vector fields $\{E_i\}$) to the dual, covariant, basis of differential 1-forms $\{\varepsilon_i\}$, the values of the Christoffel symbols just change by a sign. In the event that you haven't really dealt with covariant and contravariant vectors much before, you can view vectors as column vectors from linear algebra, covectors as row vectors, and the transformation as a change of basis operation.

As for when $\nabla_b = \partial_b$, this is only guaranteed to happen when the Christoffel symbols vanish in your coordinates (as in Cartesian coordinates on flat space) or when you have the connection acting on a $(0,0)$-tensor (a function), which you can view as either being neither covariant nor contravariant, so there are no Christoffel symbols, or as being both covariant and contravariant, so that the Christoffel symbols cancel out courtesy of the covariant-contravariant antisymmetry. I will take the former interpretation since it is a little easier to deal with.

Forgive me if I am wrong, but I think that the confusion here is coming from the fact that we are specifically considering the action of the composition $\nabla_a\nabla_b$ on a function and not just an individual $\nabla_k$, so let's take some time to break this composition down a little more.

The symbol $\nabla(f,b) = \nabla_b [f]$ is referring to a map that sends a vector field $b$ paired with a function $f$ to another vector field $\nabla_b$ which will act on the same function $f$. If $f$ were a covariant or contravariant tensor, then we would expect $\nabla_b$ to have Christoffel symbols that would dictate how this new vector field acts. However, since $f$ is neither covariant nor contravariant, we get no such Christoffel symbols, and the action is determined completely by just feeding the function to the vector field, directional-derivative style. So, if we just forget the function $f$ for a moment, we can view $\nabla$ as a map that eats the coordinate vector field $\partial_b$ and spits out the vector field $\nabla_b$.

Now, when we consider the map $\nabla(\nabla_b,a) = \nabla_a(\nabla_b)$, we get something a bit different. In this case, the symbol refers to a map that sends a vector field $\partial_a$ paired with another vector field $\nabla_b$ to a third vector field $\nabla_a\nabla_b = \nabla_a(\nabla_b)$. Since $\nabla_b$ is a vector field, if we were to write things out in terms of the definition, we get that $$\nabla_a(\nabla_b) = \sum_{k\in \{a,b\}}\left[\left([\partial_a][(\nabla_b)^k] - \sum_{i,j\in\{a,b\}}\left[(\partial_a)^i(\nabla_b)^j \Gamma^k_{ij}\right]\right)\partial_k\right]$$ where the superscripts denote the coordinate functions.

In the special case that we are looking at, we know that $(\partial_a)^i = 1$ if $i = a$ and $0$ if $i = b$, so our formula reduces to

$$\nabla_a(\nabla_b) = \sum_{k\in \{a,b\}}\left[\left( [\partial_a][(\nabla_b)^k] - \sum_{j\in\{a,b\}}(\nabla_b)^j \Gamma^k_{aj}\right)\partial_k\right].$$

Now we plug in our function, and since $f$ is a function that is neither covariant nor contravariant, we know that $\nabla_b$ acts on $f$ according to $\nabla_b[f] = \partial_b[f]$. This gives us

$$\begin{align}\nabla_a\nabla_b[f] &= \nabla_a(\nabla_b)[f]\\ &=\sum_{k\in \{a,b\}}\left[\left([\partial_a][(\partial_b)^k] - \sum_{j\in\{a,b\}}(\partial_b)^j \Gamma^k_{aj}\right)\partial_k\right][f]\\ &= \partial_a\partial_b[f] - \sum_{k\in \{a,b\}}\left[ \Gamma^k_{ab}\partial_k[f]\right] \end{align}$$

where we were able to reduce down to the final term because we know that $(\partial_b)^j = 1$ if $j=b$ and $0$ if $j=a$.

Let me know if you need any additional clarification!


References for Covariant Derivative

  1. Riemannian Manifolds (1997) by John Lee p. 50
  2. https://conf.math.illinois.edu/~kapovich/423-14/covariant.pdf
  3. https://en.wikipedia.org/wiki/Covariant_derivative#Functions (Wikipedia is actually great for a lot of general math things)
  4. https://www.youtube.com/watch?v=EFKBp52LtDM&t=0s followed by https://www.youtube.com/watch?v=cEEahoUUGyc
  5. https://graphics.stanford.edu/courses/cs468-13-spring/assets/lecture11-mildenhall.pdf
  6. Differential Geometry and Its Applications by John Oprea p. 82 (this is nice since he goes through everything very explicitly in terms of coordinates)