[Math] Derivation of formula for gradient in spherical coordinates

calculus, multivariable-calculus, partial-derivative

If we have a function $f=f(r, \theta, \phi)$, where $(r, \theta, \phi)$ are spherical coordinates on $\mathbb{R}^3$, how do we compute the gradient $\nabla f$ by using the formula
$$\nabla f \cdot d\vec{r} = df ?$$
Here $\vec{r}$ is the position vector and $df=\frac{\partial f}{\partial r}dr +\frac{\partial f}{\partial \theta}d\theta+\frac{\partial f}{\partial \phi}d\phi$.

Best Answer

You can start from the total derivative:
$$df(r,\theta,\phi)=\frac{\partial f}{\partial r}dr+\frac{\partial f}{\partial \theta}d\theta+\frac{\partial f}{\partial \phi}d\phi$$

It shows how the function $f$ changes if you are at the point $(r_0,\theta_0,\phi_0)$ and increase one variable by an infinitesimal increment $dr$, $d\theta$, or $d\phi$.
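
To carry the computation through to $\nabla f$, one possible sketch follows. It assumes the physics convention, which the question does not state: $\theta$ is the polar angle measured from the $z$-axis and $\phi$ is the azimuthal angle, so that $\vec r = r(\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)$. The orthonormal unit vectors $\hat e_r, \hat e_\theta, \hat e_\phi$ and the coefficients $A, B, C$ are notation introduced here for illustration. Under that assumption, a small displacement is

$$d\vec r = dr\,\hat e_r + r\,d\theta\,\hat e_\theta + r\sin\theta\,d\phi\,\hat e_\phi .$$

Writing $\nabla f = A\,\hat e_r + B\,\hat e_\theta + C\,\hat e_\phi$ and using orthonormality gives

$$\nabla f\cdot d\vec r = A\,dr + B\,r\,d\theta + C\,r\sin\theta\,d\phi .$$

Comparing coefficients of $dr$, $d\theta$, $d\phi$ with $df$ yields $A = \frac{\partial f}{\partial r}$, $B = \frac{1}{r}\frac{\partial f}{\partial \theta}$, $C = \frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi}$, so

$$\nabla f = \frac{\partial f}{\partial r}\,\hat e_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat e_\theta + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi}\,\hat e_\phi .$$

If the opposite convention is in use ($\theta$ azimuthal, $\phi$ polar), the roles of the two angles in the last two terms are swapped.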

Related Solutions

Gradient Definition for Non-Cartesian Coordinates – Multivariable Calculus

It turns out that there are two different but related notions of differentiation for a function $f:\mathbb R^n\to\mathbb R$: the total derivative $df$ and the gradient $\nabla f$.

The total derivative is a covector ("dual vector", "linear form") and does not depend on the choice of a metric ("measure of length").
The gradient is an ordinary vector derived from the total derivative, but it depends on a metric. That is why it looks a bit funny in different coordinate systems.

The definition of the total derivative answers the following question: given a vector $\vec v$, what is the slope of the function $f$ in the direction of $\vec v$? The answer is, of course

$$ df_{x}(\vec v) = \lim_{t\to0} \frac{f(x+t\vec v)-f(x)}{t}$$

I.e. you start at the point $x$ and walk a teensy bit in the direction of $\vec v$ and take note of the ratio $\Delta f/\Delta t$.
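
The limit above can be tried out numerically. The following is a minimal sketch, not part of the original answer: the sample function `f`, the point `x0`, and the direction `v` are arbitrary choices made for illustration, and a symmetric difference quotient is used in place of the one-sided limit because it is more accurate for a fixed step size.

```python
import math

def f(p):
    """Sample scalar field f(x, y, z) = x^2 * y + sin(z); chosen only for illustration."""
    x, y, z = p
    return x * x * y + math.sin(z)

def directional_derivative(func, x, v, t=1e-6):
    """Approximate df_x(v) with a symmetric difference quotient of step t."""
    forward = [xi + t * vi for xi, vi in zip(x, v)]
    backward = [xi - t * vi for xi, vi in zip(x, v)]
    return (func(forward) - func(backward)) / (2.0 * t)

x0 = [1.0, 2.0, 0.5]   # hypothetical base point
v = [0.3, -0.4, 1.2]   # hypothetical direction (not normalized)

numeric = directional_derivative(f, x0, v)

# Value predicted by the partial derivatives: 2xy*v_x + x^2*v_y + cos(z)*v_z
exact = 2 * x0[0] * x0[1] * v[0] + x0[0] ** 2 * v[1] + math.cos(x0[2]) * v[2]

print(numeric, exact)
```

The two printed numbers should agree to many digits, and doubling `v` should double the result, which reflects the linearity of $df_x$ in $\vec v$.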

Note that the total derivative is a linear map $\mathbb R^n \to \mathbb R$, not a vector in $\mathbb R^n$. Given a vector, it tells you some number. In coordinates, this is usually written as

$$ df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz $$

where $dx,dy,dz$ are the total derivatives of the coordinate functions, for instance $dx(v_x,v_y,v_z) := v_x$. This formula looks the same in any coordinate system.

In contrast, the gradient answers the following question: what is the direction of the steepest ascent of the function? Which vector $\vec v$ of unit length maximizes the function $df(\vec v)$? As you can see, this definition crucially depends on the fact that you can measure the length of a vector. The gradient is then defined as

$$ \nabla f = df(\vec v_{max})\cdot\vec v_{max} $$

i.e. it gives both the direction and the magnitude of the steepest change.

This can also be expressed as

$$ \langle \nabla f, \vec v \rangle = df(\vec v) \quad\forall \vec v\in\mathbb R^n.$$

In other words, the scalar product $\langle,\rangle$ is used to convert a covector $df$ into a vector $\nabla f$. This also means that the formula for the gradient looks very different in coordinate systems other than Cartesian. If the scalar product is changed (say, to $\langle\vec a,\vec b\rangle := a_xb_x + a_yb_y + 4a_zb_z$), then the direction of steepest ascent also changes. (Exercise: Why?)
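
One way to explore the exercise is to compute the vector that satisfies $\langle \nabla f, \vec v\rangle = df(\vec v)$ for both scalar products. The sketch below is an illustration only: the partial derivative values are hypothetical, the helper names are made up here, and it relies on the scalar product being diagonal, in which case each gradient component is the corresponding partial derivative divided by its weight.

```python
def df(partials, v):
    """Apply the covector df, given by its partial derivatives, to the vector v."""
    return sum(p * vi for p, vi in zip(partials, v))

def inner(a, b, weights):
    """Diagonal scalar product <a, b> = sum_i weights[i] * a[i] * b[i]."""
    return sum(w * ai * bi for w, ai, bi in zip(weights, a, b))

def gradient(partials, weights):
    """Vector g with <g, v> = df(v) for all v; for a diagonal product g_i = partial_i / w_i."""
    return [p / w for p, w in zip(partials, weights)]

partials = [1.0, 2.0, 4.0]    # hypothetical values of (df/dx, df/dy, df/dz) at one point
euclidean = [1.0, 1.0, 1.0]   # weights of the usual dot product
modified = [1.0, 1.0, 4.0]    # weights of a_x b_x + a_y b_y + 4 a_z b_z from the text

grad_euclidean = gradient(partials, euclidean)
grad_modified = gradient(partials, modified)

print(grad_euclidean)  # [1.0, 2.0, 4.0]
print(grad_modified)   # [1.0, 2.0, 1.0]
```

The covector `partials` is the same in both cases, yet the two gradient vectors point in different directions, because each one is defined relative to its own scalar product.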

[Math] Derivation of divergence in spherical coordinates from the divergence theorem

Here's a way of calculating the divergence.

First, some preliminaries. The first thing I'll do is calculate the partial derivative operators $\partial_x,\partial_y,\partial_z$ in terms of $\partial_r, \partial_\theta, \partial_\varphi$. To do this I'll use the chain rule. Take a function $v:\Bbb R^3\to\Bbb R$ and compose it with the function $g:\Bbb R^3\to\Bbb R^3$ that changes to spherical coordinates:

$$g(r,\theta,\varphi) = (r\cos\theta\sin\varphi,\ r\sin\theta\sin\varphi,\ r\cos\varphi)$$

The result is $\tilde v(r,\theta,\varphi)=(v\circ g)(r,\theta,\varphi)$, i.e. "$v$ written in spherical coordinates". An abuse of notation is almost always committed here, and we write $v(r,\theta,\varphi)$ to denote what is actually the new function $\tilde v$. I will use that notation myself from now on. The chain rule states that

$$\begin{pmatrix}\partial_x v & \partial_y v & \partial_z v\end{pmatrix} \begin{pmatrix} \cos\theta\sin\varphi &-r\sin\theta\sin\varphi & r\cos\theta\cos\varphi \\ \sin\theta\sin\varphi & r\cos\theta\sin\varphi &r\sin\theta\cos\varphi \\\cos\varphi & 0 & -r\sin\varphi\end{pmatrix} = \begin{pmatrix}\partial_r v & \partial_\theta v & \partial_\varphi v\end{pmatrix}$$

From this we get, for example (by inverting the matrix), that

$$\partial_x = \cos\theta\sin\varphi\,\partial_r - \frac{\sin\theta}{r\sin\varphi}\,\partial_\theta + \frac{\cos\theta\cos\varphi}{r}\,\partial_\varphi$$

The rest have similar expressions.

Now that we know how to take partial derivatives of a real-valued function whose argument is in spherical coordinates, we need to find out how to rewrite the value of a vector-valued function in spherical coordinates. To be precise, the new basis vectors of $\Bbb R^3$ (which now vary from point to point) are found by differentiating the spherical parametrization with respect to its arguments and normalizing. Thus (one example),

$$\begin{aligned}\mathbf e_r = \frac{\partial_r g}{\|\partial_r g\|} &= \frac{\begin{pmatrix} \cos\theta\sin\varphi & \sin\theta\sin\varphi & \cos\varphi\end{pmatrix}}{\|\begin{pmatrix} \cos\theta\sin\varphi & \sin\theta\sin\varphi & \cos\varphi\end{pmatrix}\|} = \begin{pmatrix} \cos\theta\sin\varphi & \sin\theta\sin\varphi & \cos\varphi\end{pmatrix} \\[4ex] &= \cos\theta\sin\varphi\, \mathbf i + \sin\theta\sin\varphi\, \mathbf j + \cos\varphi\,\mathbf k\end{aligned}$$

I don't know how to justify this without speaking of tangent spaces, but I'm sure you can ask your teacher for an explanation. After calculating the new unit vectors, you'll again have to invert the relation to obtain $\mathbf i,\mathbf j,\mathbf k$ in terms of $\mathbf e_r,\mathbf e_\theta,\mathbf e_\varphi$. But that part is just linear algebra!
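
The expression for $\partial_x$ can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the test function `v`, the evaluation point, and the helper `central` are made up here, and derivatives are approximated by central differences rather than computed exactly. It uses the parametrization $g$ from the text, with $\theta$ the azimuthal and $\varphi$ the polar angle.

```python
import math

def g(r, theta, phi):
    """Spherical parametrization from the text: theta is the azimuth, phi the polar angle."""
    return (r * math.cos(theta) * math.sin(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(phi))

def v(x, y, z):
    """Sample scalar function, chosen only for illustration."""
    return x * x * y + y * z + math.exp(0.5 * x) * z

def v_sph(r, theta, phi):
    """The composition v o g, i.e. v written in spherical coordinates."""
    return v(*g(r, theta, phi))

def central(func, args, i, h=1e-5):
    """Central difference approximation of the partial derivative in argument i."""
    up = list(args)
    up[i] += h
    down = list(args)
    down[i] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

r, theta, phi = 1.3, 0.7, 1.1   # hypothetical point, away from the z-axis
sph = (r, theta, phi)
point = g(r, theta, phi)

# Left-hand side: partial_x v computed directly in Cartesian coordinates.
dx_cartesian = central(v, point, 0)

# Right-hand side: the operator expression for partial_x applied to v o g.
dx_spherical = (math.cos(theta) * math.sin(phi) * central(v_sph, sph, 0)
                - math.sin(theta) / (r * math.sin(phi)) * central(v_sph, sph, 1)
                + math.cos(theta) * math.cos(phi) / r * central(v_sph, sph, 2))

# e_r = partial_r g; since g is linear in r this equals g(1, theta, phi).
e_r = g(1.0, theta, phi)
norm_e_r = math.sqrt(sum(c * c for c in e_r))

print(dx_cartesian, dx_spherical, norm_e_r)
```

The first two printed values should agree up to the finite-difference error, and the norm of `e_r` should be 1, which is why the normalization step leaves $\partial_r g$ unchanged.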

Now that everything is set up, we can calculate the divergence. But what is the divergence? What I mean is, how do we write it as an abstract object that acts on functions? Here is one possibility, in terms as familiar as possible to a calculus student (there are other definitions too): $$\mathrm{div}(\cdot) = \partial_x\left(\langle \mathbf{i},\cdot\rangle\right) + \partial_y\left(\langle \mathbf{j},\cdot\rangle\right) + \partial_z\left(\langle \mathbf{k},\cdot\rangle\right)$$

Here the symbol $\langle\cdot,\cdot\rangle$ denotes the dot product. Try to convince yourself why the above formula holds.

Now just substitute all of the expressions we just derived for the basis vectors and the differential operators. Finally, place an arbitrary vector field $$ \mathbf E = E_r(r,\theta,\varphi)\,\mathbf e_r + E_\theta(r,\theta,\varphi)\,\mathbf e_\theta + E_\varphi(r,\theta,\varphi)\,\mathbf e_\varphi$$ in place of the "$\cdot$" in the (new) expression for $\mathrm{div}$, and expand.
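
For comparison, here is the standard result that this expansion should reproduce. It is stated, not derived, and it assumes the convention of this answer ($\theta$ azimuthal, $\varphi$ polar, with $\mathbf e_\theta$ and $\mathbf e_\varphi$ the normalized $\partial_\theta g$ and $\partial_\varphi g$):

$$\mathrm{div}\,\mathbf E = \frac{1}{r^2}\,\partial_r\!\left(r^2 E_r\right) + \frac{1}{r\sin\varphi}\,\partial_\theta E_\theta + \frac{1}{r\sin\varphi}\,\partial_\varphi\!\left(\sin\varphi\, E_\varphi\right)$$

A quick sanity check: the position field $\mathbf E = r\,\mathbf e_r$ has $E_r = r$ and $E_\theta = E_\varphi = 0$, so the formula gives $\frac{1}{r^2}\partial_r(r^3) = 3$, which matches $\partial_x x + \partial_y y + \partial_z z = 3$ in Cartesian coordinates.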
