How are the mixed-partial entries of this Hessian matrix calculated?

I'm sorry, I'm not sure how to word this question. I've returned to school after a long break during which I was working full time. I did calculus in school, but most of it seems to have left me.

I have started schooling again in data science and I am trying to compute a Hessian Matrix for a simple function. The function is:

$$f(x_1, x_2) = (x_1-1)^2 + 100(x_1^2-x_2)^2$$

I have calculated the gradient vector by taking the first-order partial derivatives:

$$\nabla f(x_1, x_2) = \begin{bmatrix}
2(x_1-1) + 400x_1(x_1^2-x_2) \\
-200(x_1^2-x_2)
\end{bmatrix}$$

In attempting to calculate the Hessian matrix, I am confused by the notation of entries (1,2) and (2,1):

$$ \nabla^2f(x_1, x_2) = \begin{bmatrix}
\frac{\partial^2f}{\partial x_1^2} & \frac{\partial^2f}{\partial x_2 \partial x_1}\\
\frac{\partial^2f}{\partial x_1 \partial x_2} & \frac{\partial^2f}{\partial x_2^2} \\
\end{bmatrix}$$

For entries (1,1) and (2,2), I just differentiate the corresponding component of the gradient vector again with respect to the same variable:

$$ \frac{\partial^2f}{\partial x_1^2} = \frac{\partial}{\partial x_1} [2(x_1-1) + 400x_1(x_1^2-x_2)] = 1200x_1^2-400x_2+2 $$

and

$$ \frac{\partial^2f}{\partial x_2^2} = \frac{\partial}{\partial x_2} [ -200x_1^2+200x_2 ] = 200 $$

Therefore the matrix as it stands is:

$$ \nabla^2f(x_1, x_2) = \begin{bmatrix}
1200x_1^2-400x_2+2 & \frac{\partial^2f}{\partial x_2 \partial x_1}\\
\frac{\partial^2f}{\partial x_1 \partial x_2} & 200 \\
\end{bmatrix}$$

How would I go about calculating $\frac{\partial^2f}{\partial x_2 \partial x_1}$ and $\frac{\partial^2f}{\partial x_1 \partial x_2}$?

Thank you for your time

Edit: Updated \Delta to \nabla and \delta to \partial as suggested by the top answer.

Best Answer

The mixed derivatives can also be obtained from the entries of the gradient: differentiate each entry with respect to the other variable.

$$\frac{\partial^2 f}{\partial x_2 \ \partial x_1} = \frac{\partial}{\partial x_2} \frac{\partial f}{\partial x_1} = \frac{\partial}{\partial x_2} [2(x_1-1) + 400 x_1(x_1^2-x_2)] = -400x_1$$

$$\frac{\partial^2 f}{\partial x_1 \ \partial x_2} = \frac{\partial}{\partial x_1} \frac{\partial f}{\partial x_2} = \frac{\partial}{\partial x_1} [-200(x_1^2-x_2)] = -400x_1$$
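Combining these with the diagonal entries computed in the question, the full Hessian should come out as follows (a worked summary that uses only the derivatives already shown):

$$ \nabla^2f(x_1, x_2) = \begin{bmatrix}
1200x_1^2-400x_2+2 & -400x_1\\
-400x_1 & 200 \\
\end{bmatrix}$$

As a quick sanity check, at $(x_1, x_2) = (1, 1)$ this evaluates to

$$ \nabla^2f(1, 1) = \begin{bmatrix}
802 & -400\\
-400 & 200 \\
\end{bmatrix}$$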

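Note that the order in which you take the derivatives does not matter if the function has continuous second partial derivatives (this is Clairaut's theorem, also known as Schwarz's theorem). That is why both mixed entries agree here and the Hessian is symmetric.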
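If you want to double-check a hand-computed Hessian, one option is to compare it against a finite-difference approximation. The sketch below is only an illustration in plain Python: the function names `f`, `hessian_analytic`, and `hessian_numeric`, the step size `h`, and the test point are my own choices, not part of the question.

```python
def f(x1, x2):
    # The function from the question.
    return (x1 - 1) ** 2 + 100 * (x1 ** 2 - x2) ** 2


def hessian_analytic(x1, x2):
    # Hand-derived second partial derivatives.
    return [
        [1200 * x1 ** 2 - 400 * x2 + 2, -400 * x1],
        [-400 * x1, 200],
    ]


def hessian_numeric(func, x1, x2, h=1e-3):
    # Central finite differences for each second partial derivative.
    f0 = func(x1, x2)
    d11 = (func(x1 + h, x2) - 2 * f0 + func(x1 - h, x2)) / h ** 2
    d22 = (func(x1, x2 + h) - 2 * f0 + func(x1, x2 - h)) / h ** 2
    d12 = (
        func(x1 + h, x2 + h)
        - func(x1 + h, x2 - h)
        - func(x1 - h, x2 + h)
        + func(x1 - h, x2 - h)
    ) / (4 * h ** 2)
    # The mixed entry is used for both off-diagonal positions,
    # which assumes continuous second partial derivatives.
    return [[d11, d12], [d12, d22]]


if __name__ == "__main__":
    point = (1.5, -0.5)  # arbitrary test point
    print("analytic:", hessian_analytic(*point))
    print("numeric: ", hessian_numeric(f, *point))
```

The two matrices should agree to several decimal places; a large mismatch in one entry usually points to a slip in the corresponding hand derivative.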

LaTeX comments: the gradient and Hessian are generally denoted with $\nabla$ and $\nabla^2$ respectively (\nabla), not $\Delta$ (\Delta). Also, the symbol for partial derivatives is usually $\partial$ (\partial) rather than $\delta$ (\delta).
