[Math] What kind of object is the second derivative of a function $f:\mathbb R^n\to\mathbb R^m$

calculus, derivatives, multivariable-calculus, real-analysis, vector analysis

I wonder what the second derivative means, or what kind of object it is, when we have a function $f:\mathbb{R}^n \rightarrow \mathbb{R}^m$.

The first derivative is the Jacobian matrix, but then what is the second derivative? How should I interpret it when I write $f''$ or $D^2 f$?

Thanks a lot for your help!

Best Answer

The Jacobian matrix is the best linear approximation to $f$ at a particular point. However, if you change the point, you get a different Jacobian. The second derivative quantifies how the Jacobian changes as the point of approximation changes - the "change of the change".

To that end, we can think of the derivative $Df$ as a mapping from the domain of the original function to the space $\mathcal{L}(X,Y)$ of linear maps with the same domain $X$ and codomain $Y$ as the original function: \begin{align} f:& X \rightarrow Y \\ Df:& X \rightarrow \mathcal{L}(X,Y) \end{align} The derivative of $f$, $Df$, is a function that takes a point and returns a linear map: $$Df(x_0) = \text{best linear map approximating the change in $f$ near }x_0.$$

In matrix form, $Df(x_0)$ is the Jacobian matrix $J$ at $x_0$: $Df(x_0)(y) = J|_{x_0} y$.
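As a rough numerical illustration of the "best linear approximation" idea, here is a small sketch in plain Python. The map `f`, the helpers `jacobian` and `apply_linear`, and the step size `h` are all made up for this example, and central differences only approximate the true Jacobian.

```python
import math

def f(x):
    """Hypothetical example map f: R^2 -> R^2 (chosen only for illustration)."""
    return [x[0] ** 2 * x[1], math.sin(x[0]) + x[1]]

def jacobian(func, x0, h=1e-6):
    """Approximate the m-by-n Jacobian of func at x0 by central differences."""
    n = len(x0)
    m = len(func(x0))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x0)
        xm = list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def apply_linear(J, y):
    """The linear map Df(x0): y -> J y."""
    return [sum(J[i][j] * y[j] for j in range(len(y))) for i in range(len(J))]

x0 = [1.0, 2.0]
J = jacobian(f, x0)

# Compare f(x0 + y) with the affine approximation f(x0) + J y for a small y.
y = [1e-3, -2e-3]
exact = f([x0[0] + y[0], x0[1] + y[1]])
approx = [a + b for a, b in zip(f(x0), apply_linear(J, y))]
err = max(abs(a - b) for a, b in zip(exact, approx))
```

For this `f` the exact Jacobian at $(1,2)$ is $\begin{pmatrix} 4 & 1 \\ \cos 1 & 1\end{pmatrix}$, and `err` should be roughly of size $|y|^2$. That leftover error is exactly what the second derivative describes.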

Since $Df$ is itself a function, we can take its derivative, and so on, getting a tower of higher and higher derivatives as follows:

\begin{align} f:&\mathbb{R}^n \rightarrow \mathbb{R}^m \\ Df:&\mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m) \\ D(Df) = D^2f:&\mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)) \\ D(D^2f) = D^3f:&\mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n,\mathcal{L}(\mathbb{R}^n, \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m))) \\ \dots \end{align}

Now this gets confusing fast (spaces of linear maps mapping to spaces of linear maps mapping to... ack!!). Luckily there is an isometric isomorphism theorem saying that everything just boils down to multilinear maps: $$\mathcal{L}^p(X,\mathcal{L}^q(X,Y)) \cong \mathcal{L}^{p+q}(X,Y),$$ where $\mathcal{L}^k(X,Y)$ is the space of $k$-linear maps from $X \times \dots \times X$ ($k$ copies) to $Y$, and $\cong$ denotes an isometric isomorphism of function spaces. (The exponents $p$ and $q$ here count arguments and are unrelated to the dimensions $n$ and $m$.) In more detail, what it means for $g$ to be in $\mathcal{L}^k(X,Y)$ is that $g : X \times \dots \times X \rightarrow Y$, and $g$ is linear in each of its entries separately, with the others held fixed: $$g(x_a + x_b,z,w) = g(x_a,z,w) + g(x_b,z,w),$$ $$g(x,y_a+y_b,w) = g(x,y_a,w) + g(x,y_b,w),$$ and so on, and likewise for scalar multiples, e.g. $g(cx,z,w) = c\,g(x,z,w)$.
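The identification of "a linear map into linear maps" with "a bilinear map" is essentially currying. Here is a toy sketch in plain Python; the bilinear map `B` and the helpers `curry` and `uncurry` are hypothetical names invented for this illustration, and the norm-preserving part of the theorem is not shown.

```python
def B(u, v):
    """Hypothetical bilinear map R^2 x R^2 -> R^2 (made up for illustration)."""
    return [u[0] * v[1] + 2 * u[1] * v[0], 3 * u[0] * v[0]]

def curry(bilinear):
    """View a bilinear map as u -> (the linear map v -> bilinear(u, v))."""
    return lambda u: (lambda v: bilinear(u, v))

def uncurry(nested):
    """Inverse direction: turn u -> (v -> ...) back into a two-argument map."""
    return lambda u, v: nested(u)(v)

u, v = [1.0, 2.0], [3.0, -1.0]
nested = curry(B)            # plays the role of an element of L(X, L(X, Y))
round_trip = uncurry(nested) # back to an element of L^2(X, Y)

same = round_trip(u, v) == B(u, v) == nested(u)(v)

# Additivity in the first slot, with the second slot held fixed.
w = [0.0, 1.0]
u_plus_w = [a + b for a, b in zip(u, w)]
additive = B(u_plus_w, v) == [a + b for a, b in zip(B(u, v), B(w, v))]
```

Both views carry the same information: `nested(u)` is a linear map waiting for its second argument, while `B(u, v)` takes both at once.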

So, now we can simplify our tower of derivatives using spaces of multilinear functions: \begin{align} f:&\mathbb{R}^n \rightarrow \mathbb{R}^m \\ Df:&\mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m) \\ D^2f:&\mathbb{R}^n \rightarrow \mathcal{L}^2(\mathbb{R}^n, \mathbb{R}^m) \\ D^3f:&\mathbb{R}^n \rightarrow \mathcal{L}^3(\mathbb{R}^n, \mathbb{R}^m) \\ \dots \end{align}

So, from this picture it is pretty clear what the second derivative of your function $f:X \rightarrow Y$ is at a point. It is a bilinear map from $X \times X$ to $Y$. You put in two vectors from $X$, and it gives out a vector in $Y$, and does so in a way that is linear in each input independently.
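To make this concrete, here is a small worked example (the particular function is chosen only for illustration). Take $f:\mathbb{R}^2 \rightarrow \mathbb{R}^2$ with $$f(x_1,x_2) = \begin{pmatrix} x_1^2 x_2 \\ \sin x_1 + x_2 \end{pmatrix}, \qquad Df(x) = \begin{pmatrix} 2x_1x_2 & x_1^2 \\ \cos x_1 & 1 \end{pmatrix}.$$ Differentiating each component once more gives one Hessian matrix per output component: $$H_1(x) = \begin{pmatrix} 2x_2 & 2x_1 \\ 2x_1 & 0 \end{pmatrix}, \qquad H_2(x) = \begin{pmatrix} -\sin x_1 & 0 \\ 0 & 0 \end{pmatrix},$$ and the second derivative at $x$ is the bilinear map $$D^2f(x)(u,v) = \begin{pmatrix} u^T H_1(x)\, v \\ u^T H_2(x)\, v \end{pmatrix} = \begin{pmatrix} 2x_2 u_1 v_1 + 2x_1 (u_1 v_2 + u_2 v_1) \\ -\sin(x_1)\, u_1 v_1 \end{pmatrix}.$$ It takes two vectors $u, v \in \mathbb{R}^2$ and returns a vector in $\mathbb{R}^2$, linearly in each of $u$ and $v$.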


If you have a basis $\{ b_i\}$ of $n$ vectors for $X$ and an orthonormal basis $\{e_k\}$ of $m$ vectors for $Y$ (for example, the standard bases), you could completely characterize the second derivative by a three-dimensional $n$-by-$n$-by-$m$ array of numbers $T_{ijk}$, where the $(i,j,k)$'th entry is found by applying the bilinear function with $b_i$ in the first argument and $b_j$ in the second argument, and then taking the component of the resulting vector in the $e_k$ direction: $$T_{ijk} = e_k^T D^2f(x_0)(b_i,b_j).$$
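The array $T_{ijk}$ can be approximated numerically. The following is a minimal sketch in plain Python, assuming the standard bases on both sides, so that $T_{ijk}$ is the mixed partial $\partial^2 f_k / \partial x_i \partial x_j$. The example map `f`, the helper names, and the step size `h` are all assumptions made for illustration, and finite differences are only approximate.

```python
import math

def f(x):
    """Hypothetical example map f: R^2 -> R^2 (chosen only for illustration)."""
    return [x[0] ** 2 * x[1], math.sin(x[0]) + x[1]]

def shifted(x, i, si, j, sj, h):
    """Return x + si*h*e_i + sj*h*e_j for standard basis vectors e_i, e_j."""
    y = list(x)
    y[i] += si * h
    y[j] += sj * h
    return y

def second_derivative(func, x0, h=1e-4):
    """Approximate the n-by-n-by-m array T[i][j][k] by central differences."""
    n, m = len(x0), len(func(x0))
    T = [[[0.0] * m for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            fpp = func(shifted(x0, i, +1, j, +1, h))
            fpm = func(shifted(x0, i, +1, j, -1, h))
            fmp = func(shifted(x0, i, -1, j, +1, h))
            fmm = func(shifted(x0, i, -1, j, -1, h))
            for k in range(m):
                T[i][j][k] = (fpp[k] - fpm[k] - fmp[k] + fmm[k]) / (4 * h * h)
    return T

def apply_bilinear(T, u, v):
    """Evaluate the bilinear map: component k is sum_ij T[i][j][k] u_i v_j."""
    n, m = len(T), len(T[0][0])
    return [
        sum(T[i][j][k] * u[i] * v[j] for i in range(n) for j in range(n))
        for k in range(m)
    ]

x0 = [1.0, 2.0]
T = second_derivative(f, x0)                      # shape 2 x 2 x 2
value = apply_bilinear(T, [1.0, 0.0], [0.0, 1.0])  # approximates D^2 f(x0)(e_1, e_2)
```

For this `f` at $x_0 = (1,2)$, the exact nonzero entries are $T_{111} = 4$, $T_{121} = T_{211} = 2$ and $T_{112} = -\sin 1$ (1-based indices), which the computed array should match up to discretization error.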