[Math] Is there some consensus on the dimensions of a Jacobian matrix and of a gradient

real-analysis

According to Wikipedia, given a differentiable mapping $F: \mathbb{R}^n \to \mathbb{R}^m$ with $y = F(x)$, its Jacobian matrix is an $m \times n$ matrix defined as:
$$
J_F=\begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}.
$$
In particular, when $m=1$, the Jacobian matrix is also called the gradient $\nabla F$.
So the differential is computed as $J_F \Delta x$ or, when $m=1$, as $\nabla F \Delta x$.
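
As a small worked example of this convention (the particular map is chosen only for illustration and does not come from Wikipedia or Rudin), take $F: \mathbb{R}^3 \to \mathbb{R}^2$ with $F(x_1, x_2, x_3) = (x_1 x_2,\; x_2 + x_3^2)$. Then $J_F$ is $2 \times 3$, and multiplying it by the $3 \times 1$ column $\Delta x$ gives a $2 \times 1$ column:
$$
J_F \,\Delta x = \begin{bmatrix} x_2 & x_1 & 0 \\ 0 & 1 & 2x_3 \end{bmatrix} \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \\ \Delta x_3 \end{bmatrix} = \begin{bmatrix} x_2\,\Delta x_1 + x_1\,\Delta x_2 \\ \Delta x_2 + 2x_3\,\Delta x_3 \end{bmatrix}.
$$
With the transposed convention, the same product would have to be written as $\Delta x^{T} J_F^{T}$, a $1 \times 2$ row.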

In real analysis, optimization, …, some texts agree with Wikipedia's definitions. In others, however, the Jacobian matrix or the gradient of a differentiable mapping is defined as the transpose of Wikipedia's version.

Moreover, in baby Rudin, $J_F$ is $m \times n$, while for $m=1$, $\nabla F$ is $n \times 1$.

When writing my own formulas, which convention is most widely adopted?

Thanks and regards!

Best Answer

$\newcommand{\R}{\mathbf{R}}$Rudin's conventions are certainly the ones I use. He isn't contradicting himself, but it takes some work to see that. You've asked about bilinear forms and duality before, I think, so I hope the following makes sense.

If $F\colon \R^n \to \R$ is a smooth function and $x \in \R^n$, then $J_F$ at $x$ is a linear map that takes a vector in the tangent space $T_x\R^n$, which is canonically identified with $\R^n$, to an element of $T_{F(x)}\R = \R$. With Rudin's conventions, this corresponds to a $1 \times n$ matrix. We can therefore view $J_F$ as an element of the cotangent space $T_x^\vee\R^n$ at $x$.

$\R^n$ comes with an inner product $\langle\phantom{x}, \phantom{x}\rangle$, the "dot product". This allows us to identify $T_x\R^n$ with $T_x^\vee\R^n$. See the page on nondegenerate forms for more details. The upshot is that there is a unique tangent vector $\nabla F \in T_x\R^n$ such that $\langle \nabla F, v\rangle = J_F \cdot v$ for all $v \in T_x\R^n$. Since $\nabla F$ is a tangent vector, we express it as an $n \times 1$ matrix. You can check that the entries work out to be what Rudin says they are.
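
If it helps, the identity $\langle \nabla F, v\rangle = J_F \cdot v$ can be sanity-checked numerically. The following is only a sketch: the function `F`, the point `x`, the vector `v`, and the helper `jacobian_row` are all hypothetical choices made for illustration, and the $1 \times n$ Jacobian is approximated by central differences with NumPy rather than computed exactly.

```python
import numpy as np

def F(x):
    # Hypothetical example function F: R^3 -> R (not from the question)
    return x[0] * x[1] + np.sin(x[2])

def jacobian_row(f, x, h=1e-6):
    # Central-difference approximation of the Jacobian as a 1 x n row
    n = x.size
    J = np.zeros((1, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        J[0, i] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x = np.array([1.0, 2.0, 0.5])
v = np.array([[0.3], [-1.0], [2.0]])  # tangent vector as an n x 1 column

# Jacobian: a 1 x n row, i.e. a covector acting on columns
J = jacobian_row(F, x)

# Gradient: the analytic partial derivatives of F, stored as an n x 1 column
grad = np.array([[x[1]], [x[0]], [np.cos(x[2])]])

lhs = (grad * v).sum()   # inner product <grad F, v>
rhs = (J @ v).item()     # matrix product J_F . v

print(J.shape, grad.shape)    # (1, 3) (3, 1)
print(np.isclose(lhs, rhs))   # True
```

In standard coordinates with the dot product, the identification is just transposition, which is why the row and the column carry the same entries.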

This causes more confusion later on in life, because a manifold may not come with a metric and then these identifications that one uses so freely in calculus cannot be made. Manifolds that do have an analogue of the dot product are called Riemannian.
