Why do we differentiate to find the unit vectors in a new coordinate system?

coordinate systems

If I have a vector $\mathbf{u}=(x,y)$ in Cartesian coordinates, in a basis with unit vectors $\mathbf{\hat{i}}$ and $\mathbf{\hat{j}}$, and I want to find the unit vectors in a different coordinate system, say polar coordinates, I know I can do so by using $x=r\cos(\theta)$ and $y=r\sin(\theta)$ and taking the derivatives of $\mathbf{u}$ with respect to $r$ and $\theta$. I was reading this page: http://www.physics.purdue.edu/~jones105/phys310/coordinates.pdf , where they say that "If a vector $\mathbf{r}$ depends on a parameter $u$, then a vector that points in the 'direction' of increasing $u$ is defined by $\partial\mathbf{r}/\partial u$".

In the thread "Spherical Coordinates and Unit Vectors" I also found a similar explanation: by differentiating, we find the direction in which the new parameter increases. But I still can't quite "visualize" why we have to take the derivatives with respect to the new parameters.

Best Answer

Let me try and answer the question this way. We first discuss what a "unit vector" really is and what it represents. This choice of how to think about unit vectors should directly lead us to the idea that the correct unit vectors are those defined by taking the derivative along those directions.

I first claim that "unit vectors" do not exist in general. Rather, we have "unit vectors at each point of the plane". What these unit vectors do is to tell us in what directions the coordinates increase.

I'd like to focus on the 2D case, where we have access to three coordinate systems, one being the standard cartesian coordinate system, one being polar, and the other being a weird coordinate system that I'm going to define, which is "different enough" to get the point across.

Let's say our space is $M = \mathbb R^2$. By a coordinate system, I mean a function $c: M \rightarrow \mathbb R^2$ such that $c$ is differentiable and is a bijection between $M$ and $\mathbb R^2$. Note that I'm referring to our space explicitly as $M$ and not as $\mathbb R^2$, since I want us to think of 2D space as a "collection of points", and not as numbers.

We have the usual cartesian coordinate system:

$$ C: M \rightarrow \mathbb R^2 \qquad C(x, y) \equiv (x, y) \\ C_1(x, y) = x \qquad C_2(x, y) = y $$

If we now compute the gradient of $C_1$ (the direction along which $C_1$ increases), we get:

$$ \nabla C_1 \equiv \left( \frac{\partial C_1}{\partial x}, \frac{\partial C_1}{\partial y} \right) = \left( \frac{\partial x}{\partial x}, \frac{\partial x}{\partial y} \right) = (1, 0) $$

which is indeed the usual unit vector that we denote by $\hat x$. A similar computation shows that $\nabla C_2 = (0, 1) = \hat y$.

So, I'm going to claim that this is the right way to think of unit vectors: as directions along which the coordinate increases.

An "exotic" coordinate system on $\mathbb R^2$

$$ E: M \rightarrow \mathbb R^2 \qquad E(x, y) = (x-y, x+y) \\ E_1(x, y) = x-y \qquad E_2(x, y) = x + y $$

We can get a sense of what a coordinate system is doing by plotting the lines of constant coordinate. So, I'm going to plot the lines $E_1(x, y) = c$ for $c \in \{1, 2, 3, 4, 5\}$, and similarly $E_2(x, y) = c$ for $c \in \{1, 2, 3, 4, 5\}$:

The red lines are the lines $x + y = 1$, $x + y = 2$, etc., while the blue lines are the lines $x - y = 1$, $x - y = 2$, etc.

If we compute the gradient of $E_1$, this will give us the $E_1$ unit vector that points in the direction of increase of $E_1$: $$ \nabla E_1 \equiv \left( \frac{\partial E_1}{\partial x}, \frac{\partial E_1}{\partial y} \right) = \left( \frac{\partial (x-y)}{\partial x}, \frac{\partial (x-y)}{\partial y} \right) = (1, -1) = \hat E_1 $$ Strictly speaking, $(1, -1)$ has length $\sqrt 2$, so a true unit vector is obtained by dividing by that length; what matters here is the direction.

We can similarly compute $\nabla E_2$ to obtain $\hat E_2 = (1, 1)$. Notice that $\hat E_1$ and $\hat E_2$ are perpendicular, just as $\hat x$ and $\hat y$ are perpendicular.
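As a quick numerical sanity check of the gradients above, here is a minimal Python sketch. The helper names `numerical_gradient` and `dot`, the sample point, and the step size `h` are illustrative choices of mine, not part of the original answer; the gradient is only approximated by central differences.

```python
def numerical_gradient(f, x, y, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy) at the point (x, y)."""
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (df_dx, df_dy)


def E1(x, y):
    # First "exotic" coordinate from the text.
    return x - y


def E2(x, y):
    # Second "exotic" coordinate from the text.
    return x + y


def dot(a, b):
    """Euclidean dot product of two 2D vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1]


# Arbitrary sample point (an assumption; E1 and E2 are linear,
# so their gradients are the same at every point).
point = (0.3, -1.2)

grad_E1 = numerical_gradient(E1, *point)
grad_E2 = numerical_gradient(E2, *point)

print("grad E1:", grad_E1)
print("grad E2:", grad_E2)
print("dot product:", dot(grad_E1, grad_E2))
```

Up to floating-point error, the two gradients come out as $(1, -1)$ and $(1, 1)$, and their dot product is zero, matching the perpendicularity noted above.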

Polar coordinates on $\mathbb R^2$

We can go on to do the same thing, to find the gradients of the polar coordinates:

$$ P: M \rightarrow \mathbb R^2 \qquad P(x, y) \equiv \left(\sqrt{x^2 + y^2}, \tan^{-1}(y/x) \right) \\ P_1(x, y) = \sqrt{x^2+y^2} \qquad P_2(x, y) = \tan^{-1}(y/x) $$

I leave computing $\nabla P_1, \nabla P_2$ as an exercise.
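For readers who want to check their work, here is one way the computation can go. This is a sketch that assumes $x > 0$ (so that $\tan^{-1}(y/x)$ really is the polar angle) and writes $r = \sqrt{x^2+y^2}$, $x = r\cos\theta$, $y = r\sin\theta$:

$$ \nabla P_1 = \left( \frac{x}{\sqrt{x^2+y^2}}, \frac{y}{\sqrt{x^2+y^2}} \right) = (\cos\theta, \sin\theta) \\ \nabla P_2 = \left( \frac{-y}{x^2+y^2}, \frac{x}{x^2+y^2} \right) = \frac{1}{r}(-\sin\theta, \cos\theta) $$

For comparison, differentiating the position vector $\mathbf{u} = (r\cos\theta, r\sin\theta)$ from the question gives

$$ \frac{\partial \mathbf{u}}{\partial r} = (\cos\theta, \sin\theta) \qquad \frac{\partial \mathbf{u}}{\partial \theta} = r(-\sin\theta, \cos\theta) $$

The two recipes point in the same directions here and differ only in length (a factor of $1/r$ versus $r$ in the $\theta$ direction), so after normalizing, both give $\hat r = (\cos\theta, \sin\theta)$ and $\hat\theta = (-\sin\theta, \cos\theta)$. This agreement of directions relies on the polar coordinate lines being perpendicular to each other.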

I hope this intuitively motivates why the gradient is our unit vector.
