[Math] the difference between the derivative (the Jacobian), and the differential

calculus differential-geometry

Let $f:M \subset \mathbb{R}^2 \rightarrow N \subset \mathbb{R}^3$.

  • The function $f$ is a vector function.
  • Its differential $\mathrm{d}f \in \mathbb{R}^3$ represents the infinitesimal change in the function, where by $\mathrm{d}f$, I mean $\mathrm{d}f(x)$.
  • Its Jacobian (matrix) $J \in \mathbb{R}^{3 \times 2}$ maps vectors between tangent spaces $T_x M$ and $T_{f(x)} N$.

The relation between the two is $\mathrm{d}f = J \mathrm{d}x$, where $\mathrm{d}x \in \mathbb{R}^2$.
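To make the relation $\mathrm{d}f = J \mathrm{d}x$ concrete, here is a minimal numerical sketch. The particular map `f` and its hand-written `jacobian` are hypothetical examples chosen for illustration (they are not from the question); the point is only that the $3 \times 2$ matrix $J$ applied to a small $\mathrm{d}x \in \mathbb{R}^2$ gives a vector in $\mathbb{R}^3$ that matches the actual change in $f$ to first order.

```python
import numpy as np

def f(x):
    """Hypothetical example map f: R^2 -> R^3 (an assumption for illustration)."""
    u, v = x
    return np.array([u * v, np.sin(u), u + v ** 2])

def jacobian(x):
    """Analytic 3x2 Jacobian of the example f above."""
    u, v = x
    return np.array([[v,         u],
                     [np.cos(u), 0.0],
                     [1.0,       2.0 * v]])

x = np.array([0.5, -1.2])
dx = np.array([1e-6, -2e-6])      # a small displacement in the domain

J = jacobian(x)                   # the derivative: a 3x2 matrix
df = J @ dx                       # the differential: a vector in R^3
actual_change = f(x + dx) - f(x)  # agrees with df up to second-order terms

print(J.shape, df.shape)
print(np.max(np.abs(df - actual_change)))
```

Here `J` depends only on the point `x`, while `df` depends on both `x` and the displacement `dx`.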

However, if $f$ is considered a "mapping", then is the differential of the mapping $\mathrm{d}f$ equal to the Jacobian $J$?


From some of the answers, it seems that I took some things for granted (as common knowledge or agreed on by all). Moreover, there seems to be some confusion between the differential, the derivative, and their notation.

So first, let's agree that the differential (total derivative) and the derivative (Jacobian) are not the same thing.

Next, as per Wikipedia, let's agree on notation. Each of $f'(x)$, $D f(x)$, $\frac{\mathrm{d} f}{\mathrm{d} x}$, and $J$ refers to the derivative. The notation $\mathrm{d}f$ is reserved to denote the differential.

Now, back to my question.

  • The derivative of $f$ is the Jacobian matrix $f'(x)=Df=J \in \mathbb{R}^{3 \times 2}$.

  • The differential of $f$ is the 3D vector $\mathrm{d}f = J \mathrm{d}x$.

For some reason, there are people who confusingly use the term "differential of a mapping" to refer to the derivative, as if they don't distinguish between the derivative and the differential.

My question is: What's up with that, and what am I missing?

Why this is important: for a long time, I wasn't clear about what exactly the differential is. It became an issue when I used matrix calculus to calculate the Hessian of a matrix function. The book Matrix Differential Calculus with Applications in Statistics and Econometrics cleared it all up for me. It properly and distinctly defines the Jacobian, gradient, Hessian, derivative, and differential. The distinction between the Jacobian and the differential is crucial for the matrix function differentiation process and for the identification of the Jacobian (e.g. the first identification table in the book).

At this point, I am mildly annoyed (with myself) that I previously wrote things (which are too late to fix now) and blindly (relying on previous work) used the term "differential of a mapping". So, currently, I am either looking for some justification for this misnomer or otherwise suggesting that the community reconsider it.


I tried to track down the culprit for this "weird fashion", and I went as far as the differential geometry bible. Looking at do Carmo, definition 1 in chapter 2 appendix, pg. 128 (pg. 127 in the first edition), the definition of $dF_p$ is fine (grammar aside): it's a linear map that is associated with each point in the domain.

But then, in example 10 (pg. 130), he uses the same notation to denote both Jacobian and differential. (This is probably what Ulrich meant by almost the same thing.)
More specifically, he "applies it twice": once to get the Jacobian and once to get the differential. He uses $df(\cdot)$ to denote the Jacobian, a non-linear map into a matrix target, and $df_{(\cdot)}(\cdot)$ to denote the differential, a linear map into a vector target, and he calls both a differential.


Another reason I find it confusing is that for me the Jacobian is a matrix of partial derivatives and the differential is an operator. For example, to differentiate the matrix function $f:\mathbb{R}^{2 \times 2} \rightarrow \mathbb{R}$:

$f(X) = \operatorname{tr}(AX)$

I would use the differential operator:

$\mathrm{d}f(X; \mathrm{d}X) = \operatorname{tr}(A\,\mathrm{d}X)$

And from the Jacobian identification table (Magnus19), I'll get:

$Df(X) = A'$

Note that the differential isn't a trivial linear map anymore.
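The trace example can be checked numerically. The sketch below is an illustration under stated assumptions: the random `A`, `X`, and `dX` are made up, and the $1 \times 4$ row `jacobian_row` assumes a column-major $\operatorname{vec}$ convention for writing the same partial derivatives as a Jacobian. It shows that the differential $\operatorname{tr}(A\,\mathrm{d}X)$ is a scalar, while $A'$ is a $2 \times 2$ array of partial derivatives, and that pairing $A'$ with $\mathrm{d}X$ entrywise recovers the differential.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
X = rng.standard_normal((2, 2))
dX = 1e-6 * rng.standard_normal((2, 2))   # small perturbation of X

def f(X):
    """f(X) = tr(AX), a scalar function of a 2x2 matrix."""
    return np.trace(A @ X)

# Differential: a scalar that is linear in dX.
differential = np.trace(A @ dX)

# Matrix of partial derivatives: entry (i, j) is d f / d X_ij, which equals A'.
gradient = A.T

# Assumed vec convention: the same partials flattened column-major into a 1x4 row.
jacobian_row = gradient.flatten(order="F").reshape(1, -1)

# f is linear in X, so the differential equals the actual change up to rounding.
actual_change = f(X + dX) - f(X)

print(differential, actual_change)
print(np.sum(gradient * dX))                         # entrywise pairing of A' with dX
print((jacobian_row @ dX.flatten(order="F"))[0])     # same number via the vec form
```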

This also leads to another point. The differential has the meaning of a linear approximation. Basically, it denotes the change in the function. If it's a scalar-valued function, the change is a scalar, and thus the differential maps to a scalar. If the domain is matrices, then the Jacobian is a matrix (a non-linear map from matrices to matrices). I would definitely find it confusing if someone treated them as the same.


Let's do another example, $f:\mathbb{R}^{2 \times 2} \rightarrow \mathbb{R}^{2 \times 2}$:

$f(X) = AX$

Using the differential operator:

$\mathrm{d}f(X; \mathrm{d}X) = A\,\mathrm{d}X$

$\operatorname{vec} \mathrm{d}f(X; \mathrm{d}X) = (I_2 \otimes A) \operatorname{vec} \mathrm{d}X$

From the Jacobian identification table:

$Df(X) = I_2 \otimes A$

In this case, I'm not sure I'd consider the differential $df$ and Jacobian $Df$ almost the same thing (I'm not so good with tensors). This is the root of my issue. It's not always a simple matrix multiplication, and one needs to be mindful about the difference between the differential and Jacobian.
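The identity behind this identification, $\operatorname{vec}(A\,\mathrm{d}X) = (I_2 \otimes A)\operatorname{vec}\mathrm{d}X$, can be verified with a short sketch. It assumes that $\operatorname{vec}$ stacks columns (column-major order), which is what the helper `vec` below implements; the random matrices are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))
dX = rng.standard_normal((2, 2))

def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.flatten(order="F")

# Differential of f(X) = AX: a 2x2 matrix, linear in dX.
differential = A @ dX

# Jacobian from the identification: a 4x4 matrix that does not involve dX.
jacobian = np.kron(np.eye(2), A)

print(differential.shape, jacobian.shape)
print(np.allclose(vec(differential), jacobian @ vec(dX)))
```

The differential and the Jacobian here are arrays of different shapes ($2 \times 2$ versus $4 \times 4$); they only line up after both sides are vectorized.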

Not to mention the second order differential and the Hessian identification.


I corresponded with a couple of Caltech guys who settled it for me, and I can live with that. To paraphrase:

Math is a living language like any other; it evolves and changes. As long as we clearly define the terms in the context, there shouldn't be a problem; call it whatever you want.

Best Answer

If $M\subset \mathbb{R}^n$ is an open set and $f:M\to \mathbb{R}^k$ is differentiable, then for $p\in M$ we have the derivative $d_pf:\mathbb{R}^n\to\mathbb{R}^k$, a linear map. In your situation it is not necessary (but certainly possible) to think about tangent spaces. The matrix that describes the linear map $d_pf$ with respect to the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^k$ I would denote by $f'(p)$ (you call it $J$). So it is just a matter of applying a linear map to a vector versus multiplying this vector by a matrix: $$d_pf(Y)=f'(p)\cdot Y$$ Almost the same thing...
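The "linear map versus its matrix" point can be sketched in code. Everything named here is a hypothetical illustration rather than part of the answer: the example map `f`, the helper `differential_at`, and the use of central differences, which only approximate the linear map $d_pf$. The matrix $f'(p)$ is then built by applying that map to the standard basis vectors, so that applying the map and multiplying by the matrix agree.

```python
import numpy as np

def f(x):
    """Hypothetical example map f: R^2 -> R^3 (an assumption for illustration)."""
    u, v = x
    return np.array([u * v, np.sin(u), u + v ** 2])

def differential_at(p, h=1e-6):
    """Return a function Y -> d_p f(Y), approximated by central differences."""
    p = np.asarray(p, dtype=float)

    def d_p_f(Y):
        Y = np.asarray(Y, dtype=float)
        return (f(p + h * Y) - f(p - h * Y)) / (2.0 * h)

    return d_p_f

p = np.array([0.5, -1.2])
d_p_f = differential_at(p)        # the map at p (a Python function)

# The matrix of d_p f in the standard bases: its columns are the images
# of the standard basis vectors e_1, e_2.
f_prime = np.column_stack([d_p_f(e) for e in np.eye(2)])

Y = np.array([2.0, -3.0])
print(d_p_f(Y))                   # applying the linear map to Y
print(f_prime @ Y)                # multiplying Y by the matrix f'(p)
```

In this reading, `differential_at` plays the role of $p \mapsto d_pf$ (non-linear in `p`), while the returned `d_p_f` is the map that is linear in `Y`.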