[Math] Is there some consensus on the dimensions of a Jacobian matrix and of a gradient?

real-analysis

According to Wikipedia, given a differentiable mapping $F: \mathbb{R}^n \to \mathbb{R}^m$, its Jacobian matrix is an $m \times n$ matrix defined as:
$$
J_F=\begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}.
$$
In particular, when $m=1$, the Jacobian matrix is also called the gradient $\nabla F$.
So the differential is computed as $J_F \Delta x$, or as $\nabla F \Delta x$ when $m=1$.

In real analysis, optimization, …, some texts agree with Wikipedia's definitions. In others, however, the Jacobian matrix or the gradient of a differentiable mapping is defined as the transpose of Wikipedia's definition.

Moreover, in baby Rudin, $J_F$ is an $m \times n$ matrix, while for $m=1$, $\nabla F$ is an $n \times 1$ matrix.

When it comes to writing my own formulas, which convention is more widely adopted?

Thanks and regards!

Best Answer

$\newcommand{\R}{\mathbf{R}}$Rudin's conventions are certainly the ones I use. He isn't contradicting himself, but it takes some work to see that. You've asked about bilinear forms and duality before, I think, so I hope the following makes sense.

If $F\colon \R^n \to \R$ is a smooth function and $x \in \R^n$, then $J_F$ at $x$ is a linear map that takes a vector in the tangent space $T_x\R^n$, which is canonically identified with $\R^n$, to a vector in $T_{F(x)}\R = \R$. With Rudin's conventions, this corresponds to a $1 \times n$ matrix. We can therefore view $J_F$ as an element of the cotangent space $T_x^\vee\R^n$ at $x$.
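
To make the shapes concrete, here is a small worked case. The function is my own choice for illustration and does not come from Rudin: take $F(x_1, x_2) = x_1^2 x_2$ on $\R^2$ and a tangent vector $v = (v_1, v_2)^T$ written as a column. Then
$$
J_F(x) = \begin{bmatrix} 2x_1x_2 & x_1^2 \end{bmatrix}, \qquad J_F(x)\, v = \begin{bmatrix} 2x_1x_2 & x_1^2 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 2x_1x_2\, v_1 + x_1^2\, v_2 .
$$
So a $1 \times n$ row times an $n \times 1$ column gives a number, which is exactly what a map $T_x\R^2 \to \R$ should produce.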

$\R^n$ comes with an inner product $\langle\phantom{x}, \phantom{x}\rangle$, the "dot product". This allows us to identify $T_x\R^n$ with $T_x^\vee\R^n$. See the page on nondegenerate forms for more details. The upshot is that there is a unique tangent vector $\nabla F \in T_x\R^n$ such that $\langle \nabla F, v\rangle = J_F \cdot v$ for all $v \in T_x\R^n$. Since $\nabla F$ is a tangent vector, we express it as an $n \times 1$ matrix. You can check that the entries work out to be what Rudin says they are.
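
If it helps, the identity $\langle \nabla F, v\rangle = J_F \cdot v$ can be checked numerically. The following is only a sketch under assumptions of my own: the function $F(x_1, x_2) = x_1^2 x_2$, the point, and the vector are all arbitrary illustrative choices, and the names `F`, `jacobian`, and `gradient` are hypothetical helpers, not anything from Rudin or from a library.

```python
import numpy as np

def F(x):
    # Hypothetical example function F: R^2 -> R, F(x1, x2) = x1^2 * x2.
    return x[0] ** 2 * x[1]

def jacobian(x):
    # Jacobian as a 1 x n row matrix (a covector).
    return np.array([[2 * x[0] * x[1], x[0] ** 2]])

def gradient(x):
    # Gradient as an n x 1 column (a tangent vector); with the dot
    # product on R^n this is just the transpose of the Jacobian.
    return jacobian(x).T

x = np.array([1.0, 2.0])       # base point (arbitrary choice)
v = np.array([[0.5], [-1.0]])  # tangent vector as an n x 1 column

J = jacobian(x)                # shape (1, 2)
g = gradient(x)                # shape (2, 1)

jv = (J @ v).item()            # matrix product: J_F . v
inner = float(np.sum(g * v))   # dot product: <grad F, v>

# Finite-difference approximation of the directional derivative.
t = 1e-6
fd = (F(x + t * v.ravel()) - F(x)) / t

print(J.shape, g.shape, jv, inner, fd)
```

The row and the column hold the same numbers; the two shapes only record whether the object is meant to act on vectors or to be a vector itself.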

This causes more confusion later on in life, because a manifold may not come with a metric and then these identifications that one uses so freely in calculus cannot be made. Manifolds that do have an analogue of the dot product are called Riemannian.