[Math] Definition of gradient of a function $f$ in Riemannian manifold

Tags: definition, riemannian-geometry, semi-riemannian-geometry

I'm reading Semi-Riemannian Geometry with Applications to Relativity by Barrett O'Neill and I'm trying to understand the definition of the gradient of a function $f$ on a Riemannian manifold. I know that the motivation for defining the gradient of a function $f$ in Riemannian geometry is to preserve the fact that $ \langle grad \ f , X \rangle = df(X)$ in $\mathbb{R}^n$, where $X$ is a vector field. On the one hand, $df(X) = \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} X^i$. On the other hand, $$\langle grad \ f , X \rangle = \langle \sum_{i=1}^{i=n} (grad \ f)^i \frac{\partial }{\partial x^i} , \sum_{j=1}^{j=n} X^j \frac{\partial }{\partial x^j} \rangle = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j \langle \frac{\partial }{\partial x^i} , \frac{\partial }{\partial x^j} \rangle = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j g_{ij}.$$ If $ \langle grad \ f , X \rangle = df(X)$, then $\sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} X^i = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j g_{ij} \ (*)$, but the author affirms that $$grad \ f := \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial }{\partial x^j}.$$ I know that $g^{ij}$ represents the entry of the matrix $G^{-1}$, where $G$ is the matrix of the metric tensor, but I don't understand how they conclude that $grad \ f$ is this.
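For reference (a standard fact, not specific to O'Neill's book): saying that the $g^{ij}$ are the entries of $G^{-1}$ means exactly that $$\sum_{j=1}^{j=n} g_{ij} \, g^{jk} = \delta_i^k,$$ where $\delta_i^k$ equals $1$ if $i = k$ and $0$ otherwise. This is the identity that lets one solve equation $(*)$ for the components of $grad \ f$, as worked out in the edit and in the answer below.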

Thank you in advance for any help!

EDIT:

I tried to develop equation $(*)$ and I think I figured out how $grad \ f$ is defined. I will put my work here.

$\sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} X^i = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j g_{ij} \Longrightarrow$

$\sum_{j=1}^{j=n} \frac{\partial f}{\partial x^j} X^j = \sum_{i=1}^{i=n} (grad \ f)^i \left( \sum_{j=1}^{j=n} (g_{ij} X^j) \right)$

In matrix form, we have

$[\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \cdot [X^1 \cdots X^n]^T = [(grad \ f)^1 \cdots (grad \ f)^n] \cdot G \cdot [X^1 \cdots X^n]^T$, where $[X^1 \cdots X^n]^T$ is the transpose of the row matrix $[X^1 \cdots X^n]$. Then

$[\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \cdot [X^1 \cdots X^n]^T - [(grad \ f)^1 \cdots (grad \ f)^n] \cdot G \cdot [X^1 \cdots X^n]^T = 0 \Longrightarrow$

$\left( [\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] - [(grad \ f)^1 \cdots (grad \ f)^n] \cdot G \right) \cdot [X^1 \cdots X^n]^T = 0 \Longrightarrow$

Since this holds for every vector field $X$, $[(grad \ f)^1 \cdots (grad \ f)^n] \cdot G = [\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \Longrightarrow$

$[(grad \ f)^1 \cdots (grad \ f)^n] = [\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \cdot G^{-1} \Longrightarrow$

$(grad \ f)^j = \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} g^{ij}$. Since $grad \ f = \sum_{j=1}^{j=n} (grad \ f)^j \frac{\partial }{\partial x^j}$, we get $grad \ f = \sum_{j=1}^{j=n} \left( \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} g^{ij} \right) \frac{\partial }{\partial x^j} = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} g^{ij} \frac{\partial }{\partial x^j}$, which is exactly the author's definition.
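As a quick sanity check (this example is mine, not from O'Neill's book): in polar coordinates $(r, \theta)$ on $\mathbb{R}^2 \setminus \{0\}$ the Euclidean metric has $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = g_{21} = 0$, so $g^{11} = 1$ and $g^{22} = 1/r^2$, and the formula gives $$grad \ f = \frac{\partial f}{\partial r} \frac{\partial }{\partial r} + \frac{1}{r^2} \frac{\partial f}{\partial \theta} \frac{\partial }{\partial \theta},$$ which is the familiar polar-coordinate gradient from vector calculus (written in the coordinate frame rather than the orthonormal frame $\{\frac{\partial }{\partial r}, \frac{1}{r}\frac{\partial }{\partial \theta}\}$).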

Best Answer

I like to think of this in terms of matrix algebra. Let our vector fields and one-forms be column vectors, where the one-forms act by transpose-multiplication.

In a coordinate chart, an arbitrary vector field $V$ pulls back to a vector field on our coordinate patch in $\mathbb{R}^n$, the metric tensor pulls back to a matrix field $g$, with inverse matrix field $g^{-1}$, and the differential of $f$ is a one-form $df$. The computation defining the gradient $\nabla f$ is:
$$ \langle \nabla f, V\rangle = df(V), $$
which becomes in coordinates:
$$ (\nabla f)^T g V = df^T V. $$
As this is true for any $V$, we have the matrix identity
$$ (\nabla f)^T g = df^T, $$
which gives
$$ g^T \nabla f = df, $$
and since $g$ is a symmetric matrix, inverting we have
$$ \nabla f = g^{-1} df. $$

If you write it out with sums and coordinate vector fields, you will be performing this computation at the level of matrix entries, but I find this approach much cleaner.
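For what it's worth, here is a minimal numerical sketch of the matrix identity above (my own addition, assuming only NumPy): a random symmetric positive-definite matrix stands in for $g$ at a single point, and the check is the defining property $(\nabla f)^T g V = df^T V$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random symmetric positive-definite matrix standing in for g_{ij} at one point.
A = rng.normal(size=(n, n))
g = A @ A.T + n * np.eye(n)

# Random components of the one-form df and of a vector V at that point.
df = rng.normal(size=n)
V = rng.normal(size=n)

# grad f = g^{-1} df (solve the linear system rather than inverting explicitly).
grad_f = np.linalg.solve(g, df)

# Defining property: <grad f, V> = df(V), i.e. (grad f)^T g V = df^T V.
lhs = grad_f @ g @ V
rhs = df @ V
print(np.isclose(lhs, rhs))  # prints: True
```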