Can the gradient operator $\nabla_\mathbf{x}$ be treated as a standalone vector?

multivariable-calculus

Let $f:\mathbb{R}^n\rightarrow\mathbb{R}$ be a scalar function. Then, the gradient of $f(\mathbf{x})$ is defined as:

$$
\nabla_\mathbf{x} f(\mathbf{x}) =
\begin{bmatrix} \frac{\partial f(\mathbf{x})}{\partial x_1} \\
\frac{\partial f(\mathbf{x})}{\partial x_2} \\
\vdots \\
\frac{\partial f(\mathbf{x})}{\partial x_n}
\end{bmatrix}
$$
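
For example, with $n = 2$ and the (arbitrarily chosen) function $f(\mathbf{x}) = x_1^2 x_2$, this gives

$$
\nabla_\mathbf{x} f(\mathbf{x}) =
\begin{bmatrix} 2 x_1 x_2 \\ x_1^2 \end{bmatrix}.
$$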

However, if $f(\mathbf{x})$ is a scalar, then wouldn't the following also be valid?

$$
(\nabla_\mathbf{x}) (f(\mathbf{x})) =
\begin{bmatrix} \frac{\partial}{\partial x_1} \\
\frac{\partial}{\partial x_2} \\
\vdots \\
\frac{\partial}{\partial x_n}
\end{bmatrix} f(\mathbf{x}) =
\begin{bmatrix} \frac{\partial f(\mathbf{x})}{\partial x_1} \\
\frac{\partial f(\mathbf{x})}{\partial x_2} \\
\vdots \\
\frac{\partial f(\mathbf{x})}{\partial x_n}
\end{bmatrix}
$$

In other words, the gradient operator $\nabla_\mathbf{x}$ is treated as a standalone vector, which is then multiplied by the scalar $f(\mathbf{x})$. Interestingly, if this is indeed valid, then I can build the Hessian matrix of $f(\mathbf{x})$ from the following outer product:

$$
(\nabla_\mathbf{x} \nabla_\mathbf{x}^T) (f(\mathbf{x}))
$$

In other words, I form the outer product of the column vector $\nabla_\mathbf{x}$ with the row vector $\nabla_\mathbf{x}^T$ to get an $n \times n$ matrix of second-order partial derivative operators, and then multiply this matrix by the scalar $f(\mathbf{x})$ to get the Hessian matrix. However, I am wondering whether this is just a coincidence.
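
As a concrete check, take $n = 2$ and the same illustrative $f(\mathbf{x}) = x_1^2 x_2$ as above; the outer-product recipe gives

$$
(\nabla_\mathbf{x} \nabla_\mathbf{x}^T) (f(\mathbf{x})) =
\begin{bmatrix}
\frac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \frac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_2} \\
\frac{\partial^2 f(\mathbf{x})}{\partial x_2 \partial x_1} & \frac{\partial^2 f(\mathbf{x})}{\partial x_2^2}
\end{bmatrix} =
\begin{bmatrix}
2 x_2 & 2 x_1 \\
2 x_1 & 0
\end{bmatrix},
$$

which is exactly the Hessian of $f$.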

Best Answer

This notation is often used, if only formally, most notably in the definition of the Laplace operator. We can make it precise, though.

For convenience, let's work only with functions that are $C^\infty$, so that all their partial derivatives exist and can themselves be partially differentiated, and so on (think about the issues that arise when we drop this assumption).

A partial derivative is just a map (an operator) that sends functions to functions: $$\frac{\partial}{\partial x_i}: C^\infty(\mathbb{R}^n, \mathbb{R}) \to C^\infty(\mathbb{R}^n, \mathbb{R}).$$ Write $\mathcal{D}$ for the set of such operators. We can regard the gradient as an $n \times 1$ matrix (a column vector, matching the definition above) with such operators as entries, so the gradient is an element of $\mathcal{D}^{n \times 1}$. This gradient, like any other matrix of operators, is then itself an operator that returns a matrix of functions (and hence, at each point, a real matrix), by applying its entries to a function.
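
For instance, with $n = 2$ we have

$$
\nabla = \begin{bmatrix} \frac{\partial}{\partial x_1} \\ \frac{\partial}{\partial x_2} \end{bmatrix} \in \mathcal{D}^{2 \times 1},
\qquad
\nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \end{bmatrix} \in C^\infty(\mathbb{R}^2, \mathbb{R})^{2 \times 1},
$$

so applying the operator-valued matrix entrywise to $f$ produces an ordinary column vector of smooth functions.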

To be able to multiply matrices with coefficients in $C^\infty(\mathbb{R}^n, \mathbb{R})$, we need to be able to multiply the coefficients with each other. The multiplication of operators can be defined as composition. In this setting we indeed have $$ \nabla^T \nabla = L$$ $$ \nabla \nabla^T = H$$ where $L$ is the Laplace operator and $H$ is the Hessian operator.