In vector calculus, the nabla symbol $\nabla$ is used to denote several different operations, among them:
- the gradient of a scalar function $f$ is a vector field: $\mathrm{grad}(f)=\nabla f$
- the divergence of a vector field $\mathbf{F}$ is a scalar function: $\mathrm{div}(\mathbf{F})=\nabla\cdot\mathbf{F}$
Furthermore,
- the Laplacian of a scalar function $f$ is a scalar function: $\mathrm{lap}(f)=\mathrm{div}(\mathrm{grad}(f))=\nabla\cdot(\nabla f)=\nabla^2f=\Delta f$
- the gradient of a vector field $\mathbf{F}$ is a second-order tensor field: $\mathrm{grad}(\mathbf{F})=(\nabla\mathbf{F})^\top=\mathbf{J}(\mathbf{F})$, where $\mathbf{J}(\mathbf{F})$ is the Jacobian of $\mathbf{F}$.
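For reference, these conventions can be spelled out in Cartesian components. This is only a sketch: the shorthand $\partial_i=\partial/\partial x_i$ and the component symbols $F_i$ are assumptions introduced here for illustration, not notation taken from the cited articles.

$$
\begin{aligned}
(\nabla f)_i &= \partial_i f, \\
\nabla\cdot\mathbf{F} &= \sum_i \partial_i F_i, \\
\Delta f &= \sum_i \partial_i\partial_i f, \\
\mathbf{J}(\mathbf{F})_{ij} &= \partial_j F_i, \qquad (\nabla\mathbf{F})_{ij} = \partial_i F_j .
\end{aligned}
$$

Read this way, the transpose in $\mathrm{grad}(\mathbf{F})=(\nabla\mathbf{F})^\top$ only swaps which index belongs to the derivative and which to the field component.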
These notations are taken from the Wikipedia article on vector calculus identities. Now, the Wikipedia article on the Hessian matrix states that
- the Hessian $\mathbf{H}(f)$ of a scalar function is the Jacobian of its gradient, i.e. $\mathbf{H}(f)=\mathbf{J}(\nabla f)$
Therefore, the Hessian in vector calculus notation would be:
- $\mathbf{H}(f)=(\nabla(\nabla f))^\top=\nabla(\nabla f)$, since the Hessian is symmetric whenever the second partial derivatives of $f$ are continuous.
However, this is a bit inconsistent. All of the above identities can be derived by treating $\nabla$ as a column vector of partial derivative operators, so that $\nabla f$ is a column vector. For the sake of consistency, $\nabla$ can then only be applied to a row vector field $\mathbf{F}$ to give a matrix (a column vector times a row vector is an outer product and thus produces a matrix). Therefore, one should write the Hessian as
- $\mathbf{H}(f)=\nabla(\nabla f)^\top$
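To make the column-times-row reading concrete, here is the two-dimensional case written out. It is a sketch under the assumption that $\nabla$ is the column $(\partial_x,\partial_y)^\top$ and that each operator entry acts on the row entry to its right:

$$
\nabla(\nabla f)^\top
=\begin{pmatrix}\partial_x\\ \partial_y\end{pmatrix}
\begin{pmatrix}\partial_x f & \partial_y f\end{pmatrix}
=\begin{pmatrix}
\partial_x\partial_x f & \partial_x\partial_y f\\
\partial_y\partial_x f & \partial_y\partial_y f
\end{pmatrix}
$$

The entry in row $i$, column $j$ is $\partial_i\partial_j f$, which is exactly the corresponding entry of the Hessian.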
Which of these alternatives is better/correct? There is probably some freedom, since we could also define the gradient operator to act on row vectors while producing row vectors for scalar functions. Yet this is somewhat inelegant.
Best Answer
The most elegant way is probably to write the Hessian explicitly as an outer product:
$\mathbf{H}(f)=\nabla\otimes\nabla f$
This avoids any ambiguity with the Laplacian, avoids clumsy transpose notation, and does not depend on $\nabla f$ being defined as a column vector.
In this case, the gradient of a vector field should consistently be written as $\nabla\otimes\mathbf{F}$ as well.
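A small worked example may help separate the outer product from the Laplacian. The function $f(x,y)=x^2y$ is a hypothetical choice made here purely for illustration:

$$
\begin{aligned}
\nabla f &= \begin{pmatrix}2xy\\ x^2\end{pmatrix},\\
\nabla\otimes\nabla f &= \begin{pmatrix}2y & 2x\\ 2x & 0\end{pmatrix}=\mathbf{H}(f),\\
\nabla\cdot\nabla f &= 2y+0 = \operatorname{tr}\mathbf{H}(f)=\Delta f .
\end{aligned}
$$

One caveat, stated in components: $(\nabla\otimes\mathbf{F})_{ij}=\partial_i F_j$, so under the common convention $\mathbf{J}(\mathbf{F})_{ij}=\partial_j F_i$ the outer product is the transpose of the Jacobian. For $\mathbf{F}=\nabla f$ the distinction disappears, because the Hessian is symmetric when the second partial derivatives are continuous.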