Which is the correct vector calculus notation for the Hessian?

hessian-matrix vector-analysis

In vector calculus, the nabla symbol $\nabla$ is used to denote three different operations:

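  • the gradient of a scalar function $f$ is a vector field: $\mathrm{grad}(f)=\nabla f$
  • the divergence of a vector field $\mathbf{F}$ is a scalar function: $\mathrm{div}(\mathbf{F})=\nabla\cdot\mathbf{F}$
  • the curl of a vector field $\mathbf{F}$ is a vector field: $\mathrm{curl}(\mathbf{F})=\nabla\times\mathbf{F}$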

Furthermore,

  • the Laplacian of a scalar function $f$ is a scalar function: $\mathrm{lap}(f)=\mathrm{div}(\mathrm{grad}(f))=\nabla\cdot(\nabla f)=\nabla^2f=\Delta f$
  • the gradient of a vector field $\mathbf{F}$ is a second-order tensor field: $\mathrm{grad}(\mathbf{F})=(\nabla\mathbf{F})^T=\mathbf{J}(\mathbf{F})$, where $\mathbf{J}(\mathbf{F})$ is the Jacobian of $\mathbf{F}$.

These notations are taken from the Wikipedia article on vector calculus identities. Now, the Wikipedia article on the Hessian matrix states that

  • the Hessian $\mathbf{H}(f)$ of a scalar function is the Jacobian of its gradient, i.e. $\mathbf{H}(f)=\mathbf{J}(\nabla f)$

Therefore, the Hessian in vector calculus notation would be:

  • $\mathbf{H}(f)=(\nabla(\nabla f))^\top=\nabla(\nabla f)$, since the Hessian is symmetric whenever the second partial derivatives of $f$ are continuous.
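
As a numerical sanity check of $\mathbf{H}(f)=\mathbf{J}(\nabla f)$ and of its symmetry, here is a minimal sketch using central differences. The test function `f`, the helper names `gradient`, `jacobian` and `hessian`, and the step sizes are all illustrative assumptions, not part of any library.

```python
# Sketch: the Hessian computed as the Jacobian of the gradient.
# All names and step sizes are hypothetical choices for illustration.

def f(x):
    # Example scalar field f(x, y) = x^2 * y + y^3
    return x[0] ** 2 * x[1] + x[1] ** 3

def gradient(func, x, h=1e-5):
    # Central-difference approximation of grad(func) at x.
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

def jacobian(vec_func, x, h=1e-4):
    # J[i][j] = d(vec_func_i)/d(x_j), by central differences.
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = vec_func(xp), vec_func(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # cols[j][i] holds dF_i/dx_j, so transpose to obtain J[i][j].
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def hessian(func, x):
    # H(f) = J(grad f)
    return jacobian(lambda p: gradient(func, p), x)

point = [1.0, 2.0]
H = hessian(f, point)
# Analytic Hessian is [[2y, 2x], [2x, 6y]], i.e. [[4, 2], [2, 12]] at (1, 2).
print(H)
```

The printed matrix should agree with the analytic Hessian up to finite-difference error, and its off-diagonal entries should agree with each other, which is the symmetry used above.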

However, this notation is a bit inconsistent. All of the above identities can be derived by treating $\nabla$ as a column vector of partial derivative operators, so that $\nabla f$ is a column vector. For the sake of consistency, $\nabla$ can then only be applied to a row vector field $\mathbf{F}$ to give a matrix (a column vector times a row vector is an outer product and thus produces a matrix). Therefore, one should write the Hessian as

  • $\mathbf{H}(f)=\nabla(\nabla f)^\top$
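
Written out for $n$ variables $x_1,\dots,x_n$ (the coordinate names are an assumption made here for illustration), this column-times-row product would read:

$$\nabla(\nabla f)^\top=\begin{pmatrix}\dfrac{\partial}{\partial x_1}\\ \vdots\\ \dfrac{\partial}{\partial x_n}\end{pmatrix}\begin{pmatrix}\dfrac{\partial f}{\partial x_1} & \cdots & \dfrac{\partial f}{\partial x_n}\end{pmatrix}=\begin{pmatrix}\dfrac{\partial^2 f}{\partial x_1\,\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\\ \vdots & \ddots & \vdots\\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n\,\partial x_n}\end{pmatrix}$$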

Which of these alternatives is better or correct? There is probably some freedom, since we could also define the gradient operator to act on row vectors and to produce a row vector for a scalar function. Yet this is somewhat inelegant.

Best Answer

The most elegant way is probably to write the Hessian explicitly as an outer product:

$\mathbf{H}(f)=\nabla\otimes\nabla f$

This avoids both the ambiguity with the Laplacian and clumsy transpose notation, and it does not depend on $\nabla f$ being defined as a column vector.

In this case, the gradient of a vector field should then consistently be written as $\nabla\otimes\mathbf{F}$.
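
In index notation this can be sketched as follows, assuming Cartesian coordinates and the common (but not universal) convention $(\mathbf{a}\otimes\mathbf{b})_{ij}=a_ib_j$:

$$(\nabla\otimes\nabla f)_{ij}=\partial_i\partial_j f=H_{ij},\qquad \operatorname{tr}(\nabla\otimes\nabla f)=\sum_i\partial_i\partial_i f=\nabla\cdot\nabla f=\Delta f,\qquad (\nabla\otimes\mathbf{F})_{ij}=\partial_i F_j=J_{ji}$$

Under this convention $\nabla\otimes\mathbf{F}$ is the transpose of the Jacobian, which matches the relation $(\nabla\mathbf{F})^T=\mathbf{J}(\mathbf{F})$ quoted in the question. For the Hessian the distinction disappears whenever the second partial derivatives are continuous, because the matrix is then symmetric.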