Calculus – Higher Derivatives of a Multivariate Function


I have recently realized that I am not sure what it means to consider $F''(x)$, and higher derivatives in general, for $F: \mathbb R ^n \rightarrow \mathbb R^m$. I am clear that the first derivative is the matrix of partial derivatives $ \left( \begin{array}{ccc}
\frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x) \end{array} \right) $. But I am not sure how to differentiate this matrix. Do I want to treat it as a vector and then take the first derivative of the function $G: \mathbb R^n \rightarrow \mathbb R^{m \times n}$ which sends $x$ to the vector of partial derivatives?

Best Answer

To understand it, you have to treat derivatives as linear operators. If $f:\mathbb{R}^n\to\mathbb{R}^m$ then $$f':\mathbb{R}^n\to L(\mathbb{R}^n,\mathbb{R}^m)$$

where $L(\mathbb{R}^n,\mathbb{R}^m)$ is the set of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. It can be identified with the space $M_{m\times n}$ of $m\times n$ matrices, or with $\mathbb{R}^{mn}$. If you identify it with $\mathbb{R}^{mn}$, you see that differentiating the matrix is the same as differentiating the function $f':\mathbb{R}^n\to\mathbb{R}^{mn}$, and that you already know how to do. Moreover, because $f'(x)$ is a linear transformation for every $x$, you have to understand how this transformation acts, and it acts according to the formula $$f'(x)u=A_x u$$

where $A_x$ is the matrix of derivatives which is in your question.

To proceed we have that $$f'':\mathbb{R}^n\to L(\mathbb{R}^n, L(\mathbb{R}^n,\mathbb{R}^m))$$

Now $f''$ is a function which sends $x$ to a linear transformation $f''(x)$ from $\mathbb{R}^n$ to $L(\mathbb{R}^n,\mathbb{R}^m)$. Such a linear operator can be identified with a bilinear map $g(x):\mathbb{R}^n\times \mathbb{R}^n\to\mathbb{R}^m$ defined by $$g(x)(u,v)=f''(x)uv$$ where $f''(x)uv$ means $(f''(x)u)v$.

Moreover, note that $$f''(x)uv=[f'(x)u]'v$$

where the prime on the bracket denotes the derivative of $x\mapsto f'(x)u$ with $u$ held fixed. Hence $f''(x)$ is the bilinear map defined by $$f''(x)uv=[A_xu]'v$$

where $$[A_xu]'=\left( \begin{array}{ccc} \frac{\partial \sum_{i=1}^n\frac{\partial f_1}{\partial x_i}u_i}{\partial x_1} & ... & \frac{\partial \sum_{i=1}^n\frac{\partial f_1}{\partial x_i}u_i}{\partial x_n} \\ ... & ... & ... \\ \frac{\partial \sum_{i=1}^n\frac{\partial f_m}{\partial x_i}u_i}{\partial x_1} & ... &\frac{\partial \sum_{i=1}^n\frac{\partial f_m}{\partial x_i}u_i}{\partial x_n} \end{array} \right)$$
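In coordinates, assuming $f$ is twice differentiable, expanding the entries of this matrix and applying it to $v$ should give the following (the component index $r$ and the column index $j$ are notation introduced here, not part of the question): $$\big(f''(x)uv\big)_r=\sum_{j=1}^n\sum_{i=1}^n\frac{\partial^2 f_r}{\partial x_j\,\partial x_i}(x)\,u_i\,v_j,\qquad r=1,\dots,m$$

In other words, the $r$-th component of $f''(x)$ is the bilinear form given by the Hessian matrix of $f_r$ at $x$.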

For example, consider the function $f(x,y)=(x^2-y,x+y^2)$. Reading the matrix column by column, we can identify $$f'(x,y)= \left( \begin{array}{cc} 2x & -1 \\ 1 & 2y \end{array} \right) = (2x,1,-1,2y)\in\mathbb{R}^4$$

and we know that $$f'(x,y)(u,v)=\left( \begin{array}{cc} 2x & -1 \\ 1 & 2y \end{array} \right)\left( \begin{array}{c} u \\ v \end{array} \right)$$

Now, $f''(x,y)(u,v)(z,w)=[f'(x,y)(u,v)]'(z,w)$, but $$f'(x,y)(u,v)=(2xu-v,u+2yv)$$

Now we treat $(u,v)$ in the last expression as constants and differentiate with respect to $(x,y)$, which gives $$f''(x,y)(u,v)=\left( \begin{array}{cc} 2u & 0 \\ 0 & 2v \end{array} \right)$$

which implies that $$f''(x,y)(u,v)(z,w)=(2uz,2vw)$$
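As a sanity check, the recipe "differentiate $f'(x)u$ with $u$ held constant, then apply the result to the second vector" can be tried numerically. The following is only a rough sketch in plain Python using central differences. The helper names `directional_derivative` and `second_derivative`, the step size `h`, and the test point and vectors are all assumptions made for illustration.

```python
def f(p):
    """The example map f(x, y) = (x^2 - y, x + y^2)."""
    x, y = p
    return (x * x - y, x + y * y)


def directional_derivative(g, p, u, h=1e-3):
    """Central-difference approximation of g'(p)u (hypothetical helper)."""
    plus = g(tuple(pi + h * ui for pi, ui in zip(p, u)))
    minus = g(tuple(pi - h * ui for pi, ui in zip(p, u)))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


def second_derivative(g, p, u, v, h=1e-3):
    """Approximate g''(p)uv by differentiating q -> g'(q)u along v, with u fixed."""
    def first_along_u(q):
        return directional_derivative(g, q, u, h)

    return directional_derivative(first_along_u, p, v, h)


# Arbitrary test data (assumed): a point, and the two vectors
# written (u, v) and (z, w) in the worked example.
point = (1.5, -0.5)
uv = (2.0, 3.0)
zw = (5.0, 7.0)

first = directional_derivative(f, point, uv)
second = second_derivative(f, point, uv, zw)

print("f'(x,y)(u,v)        ~", first)
print("f''(x,y)(u,v)(z,w)  ~", second)
print("closed form (2uz, 2vw) =", (2 * uv[0] * zw[0], 2 * uv[1] * zw[1]))
```

Because $f$ is quadratic, the central differences should agree with the closed form $(2uz, 2vw)$ up to floating-point rounding. For a general $f$ the agreement would only be approximate and would depend on the choice of `h`.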

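Higher derivatives are handled in the same way: writing $k$ for the order (to avoid a clash with the dimension $n$), $f^{(k)}(x)$ is a $k$-linear map from $(\mathbb{R}^n)^k$ to $\mathbb{R}^m$, obtained by differentiating $x\mapsto f^{(k-1)}(x)u_1\cdots u_{k-1}$ with the vectors $u_1,\dots,u_{k-1}$ held fixed.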
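As a sketch of what this looks like in coordinates, assume $f$ is differentiable $k$ times. The labels $u^{(1)},\dots,u^{(k)}$ for the input vectors and $r$ for the component are notation introduced here. A derivative of order $k$ would then read $$\big(f^{(k)}(x)u^{(1)}\cdots u^{(k)}\big)_r=\sum_{i_1,\dots,i_k=1}^n\frac{\partial^k f_r}{\partial x_{i_k}\cdots\partial x_{i_1}}(x)\,u^{(1)}_{i_1}\cdots u^{(k)}_{i_k},\qquad r=1,\dots,m$$

In the example above, $f''(x,y)$ does not depend on $(x,y)$, so $f'''(x,y)=0$ and all higher derivatives vanish as well.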

