[Math] Trying to understand “derivative or Jacobian of smooth map”

Tags: derivatives, differential-geometry

From some lecture notes I am trying to puzzle through ….

"… the derivative or Jacobian of a smooth map $f: \mathbb{R}^m \rightarrow \mathbb{R}^n$ at a point $x$ is a linear map $Df: \mathbb{R}^m \rightarrow \mathbb{R}^n$. In terms of partial derivatives, $Df_x(X) = (\sum_j\partial_{x_j}f_1 \cdot X_j,
\sum_j \partial_{x_j}f_2\cdot X_j, …)$ … "

I'm so confused I'm not even sure where to begin. First, shouldn't the derivative be a map $Df:\mathbb{R}^m\rightarrow \mathbb{R}^m\times\mathbb{R}^n$? Second, I am familiar with 3D integral calculus, and the only Jacobian I heard discussed there doesn't look like this at all, except, of course, that they both involve partial derivatives. Also, I don't even know what $\partial_{x_j}f_1 \cdot X_j$ means.

Thanks.

Best Answer

The best way to think about the derivative is: \begin{equation*} \tag{$\spadesuit$}f(x + \Delta x) \approx f(x) + f'(x) \Delta x. \end{equation*} The approximation is good when $\Delta x$ is small. This equation expresses the fact that $f$ is "locally linear" at $x$.
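To see ($\spadesuit$) in action, here is a quick numerical sketch in one variable. The function $f(x) = x^2$ is my own illustrative choice; its derivative is $f'(x) = 2x$, and the approximation error turns out to be exactly $(\Delta x)^2$:

```python
# Check f(x + dx) ≈ f(x) + f'(x) dx for the example f(x) = x**2,
# whose derivative is f'(x) = 2x.

def f(x):
    return x ** 2

def fprime(x):
    return 2 * x

x, dx = 3.0, 0.001
exact = f(x + dx)                # 9.006001
linear = f(x) + fprime(x) * dx   # 9.006
print(exact - linear)            # error is (dx)**2 = 1e-06
```

The error shrinks quadratically as $\Delta x \to 0$, which is what "the approximation is good when $\Delta x$ is small" means quantitatively.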

How can we make sense of ($\spadesuit$) when $f:\mathbb R^n \to \mathbb R^m$? \begin{equation*} f(\underbrace{x}_{n \times 1} + \underbrace{\Delta x}_{n \times 1}) \approx \underbrace{f(x)}_{m \times 1} + \underbrace{f'(x)}_{?} \underbrace{\Delta x}_{n \times 1}. \end{equation*}

It appears that $f'(x)$ should be something that, when multiplied by an $n \times 1$ column vector, returns an $m \times 1$ column vector. In other words, $f'(x)$ should be an $m \times n$ matrix.

If we prefer to think in terms of linear transformations rather than matrices, we can write \begin{equation*} f(x + \Delta x) \approx f(x) + Df(x) \Delta x. \end{equation*} Here $Df(x)$ is a linear transformation that takes $\Delta x$ as input, and returns $f'(x) \Delta x$ as output. This equation is what it means to be "locally linear" in the multivariable case.

Taking this as our starting point, it's not too hard to show that \begin{equation*} f'(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}. \end{equation*} (The functions $f_i$ are the component functions of $f$.)
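One way to make this concrete is to build the matrix of partials numerically. The sketch below (the map $f:\mathbb{R}^3 \to \mathbb{R}^2$ is my own example) approximates each entry $\partial f_i/\partial x_j$ with a finite difference and assembles the $m \times n$ matrix:

```python
import numpy as np

# Build the Jacobian of an example map f: R^3 -> R^2 by finite
# differences, one column of partial derivatives at a time.

def f(x):
    return np.array([x[0] * x[1], np.sin(x[2])])

def jacobian(f, x, h=1e-6):
    fx = f(x)
    m, n = fx.size, x.size
    J = np.zeros((m, n))               # m x n, as derived above
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - fx) / h  # approximates the j-th column
    return J

x = np.array([1.0, 2.0, 0.0])
J = jacobian(f, x)
# Analytically: [[x_2, x_1, 0], [0, 0, cos(x_3)]] = [[2, 1, 0], [0, 0, 1]]
print(np.round(J, 4))
```

Comparing the printed matrix against the hand-computed partials is a useful sanity check that rows index component functions and columns index input variables.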

If \begin{equation*} X = \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix}, \end{equation*} then \begin{equation*} f'(x) X = \begin{bmatrix} \sum_{j=1}^n \frac{\partial f_1(x)}{\partial x_j} X_j \\ \vdots \\ \sum_{j=1}^n \frac{\partial f_m(x)}{\partial x_j} X_j \end{bmatrix}, \end{equation*} as you can see just by doing the matrix-vector multiplication. This is the equation given in your question.
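To confirm that the component-wise sums and the matrix-vector product are the same thing, here is a small check with an assumed Jacobian $J$ and vector $X$ of my own choosing:

```python
import numpy as np

# The i-th entry of f'(x) X is sum_j (∂f_i/∂x_j) X_j — exactly the
# matrix-vector product J @ X, computed two ways below.

J = np.array([[2.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # an m x n Jacobian (m=2, n=3)
X = np.array([1.0, -1.0, 3.0])

# Component by component, as in the displayed formula:
by_sums = np.array([sum(J[i, j] * X[j] for j in range(3))
                    for i in range(2)])

print(by_sums)   # [1. 3.]
print(J @ X)     # identical
```

So the formula in the question is nothing more mysterious than the rule for multiplying a matrix by a column vector, written out entry by entry.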