Let $f:D \subseteq \Bbb R^2 \to \Bbb R$ be a twice differentiable function. Then we'd call the graph of $f$ a surface.
The directional derivative $D_vf(x,y)$ is the derivative of $f$ at $(x,y)$ in the direction of the unit vector $v$. This gives the scalar slope at $(x,y)$ in that direction.
![enter image description here](https://i.stack.imgur.com/fBc2c.png)
The second directional derivative ${D_v}^2f(x,y)$ is the second derivative of $f$ at $(x,y)$ in the direction of the unit vector $v$. This gives the concavity at $(x,y)$ in that direction.
![enter image description here](https://i.stack.imgur.com/fql3U.gif)
The Hessian $Hf(x,y)$ at a point $(x,y)$ is really a bilinear form. That is, it's a function that takes two direction vectors and produces a number. If both of those direction vectors are the same, then it just gives the value of the second directional derivative (i.e. the concavity).
$$[Hf(x,y)](v,v) = {D_v}^2f(x,y)$$
The matrix that you're used to calling the Hessian is really just the matrix representation of this bilinear form with respect to the standard basis.
$$\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy}\end{bmatrix}$$
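As a quick numerical sanity check of the identity $[Hf(x,y)](v,v) = {D_v}^2f(x,y)$ (the function $f(x,y)=x^3+xy+y^2$, the point, and the direction below are my own illustrative choices, not from the text):

```python
import numpy as np

# Illustrative example: f(x, y) = x^3 + x*y + y^2
f = lambda p: p[0]**3 + p[0]*p[1] + p[1]**2

# Its analytic Hessian: f_xx = 6x, f_xy = f_yx = 1, f_yy = 2
def hessian(p):
    return np.array([[6.0 * p[0], 1.0],
                     [1.0,        2.0]])

p = np.array([1.0, 2.0])
v = np.array([0.6, 0.8])          # unit direction vector

# Second directional derivative via a central difference along v
h = 1e-4
second_dir = (f(p + h*v) - 2*f(p) + f(p - h*v)) / h**2

H = hessian(p)
print(v @ H @ v, second_dir)      # the two values agree (both ≈ 4.4 here)
```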
A bilinear form $B$ is positive-definite (resp. negative-definite) if $B(w,w)\gt 0$ (resp. $\lt 0$) for all $w\ne 0$. So then we can see that the concavity of $f$ at $(x,y)$ is positive (resp. negative) in all directions iff the Hessian $Hf(x,y)$ is positive-definite (resp negative-definite). This of course means that the surface has a minimum (resp. maximum) at that point.
OK. So where does the determinant come in? Well, checking every vector $w$ is kinda hard, so instead we can use properties of positive- and negative-definite bilinear forms. For instance, we can determine if a bilinear form is positive- and negative-definite by checking the eigenvalues of its matrix representation in any basis. If all of the eigenvalues are positive (resp. negative), then the bilinear form is positive-definite (resp. negative-definite).
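The eigenvalue test above is easy to carry out in code. A minimal sketch (the three Hessian matrices are made-up examples, not from the text):

```python
import numpy as np

# Hypothetical Hessians at a critical point
H_min    = np.array([[2.0, 1.0], [1.0, 3.0]])   # positive-definite
H_max    = -H_min                               # negative-definite
H_saddle = np.array([[2.0, 3.0], [3.0, 1.0]])   # indefinite

def classify(H):
    ev = np.linalg.eigvalsh(H)        # eigenvalues of a symmetric matrix
    if np.all(ev > 0):
        return "local minimum"        # all concavities positive
    if np.all(ev < 0):
        return "local maximum"        # all concavities negative
    if ev.min() < 0 < ev.max():
        return "saddle point"         # mixed concavity
    return "inconclusive"             # a zero eigenvalue: the test fails

print(classify(H_min), classify(H_max), classify(H_saddle))
```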
Alternatively, we can use Sylvester's criterion, which says that a Hermitian matrix is positive-definite if and only if all of its leading principal minors are positive.
The leading principal minors of a matrix are the determinants of the following top-left submatrices:
![enter image description here](https://i.stack.imgur.com/sdByG.png)
I.e. $f_{xx}>0$ and $f_{xx}f_{yy}-f_{xy}f_{yx}>0$ in the $2\times 2$ case.
Somewhat similarly, the condition for negative-definiteness is that the leading principal minors alternate in sign, the first being negative. I.e. $f_{xx}<0$ and $f_{xx}f_{yy}-f_{xy}f_{yx}>0$ in the $2\times 2$ case.
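Sylvester's criterion can be sketched directly (the matrix below is a made-up positive-definite example):

```python
import numpy as np

def leading_minors(H):
    # Determinants of the top-left 1x1, 2x2, ..., nxn submatrices
    return [np.linalg.det(H[:k, :k]) for k in range(1, H.shape[0] + 1)]

H = np.array([[2.0, 1.0], [1.0, 3.0]])
m1, m2 = leading_minors(H)

# Positive-definite: all leading minors > 0
# Negative-definite: signs alternate -, +, -, ...
print(m1 > 0 and m2 > 0)   # True for this H, so it is positive-definite
```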
A saddle point occurs (a sufficient, but not necessary, condition) when one eigenvalue of the matrix representing the Hessian is negative and the other is positive. Geometrically, this means that the concavity is negative in one direction and positive in some other direction (those two directions being the directions of the corresponding eigenvectors). In that case the determinant, being the product of the eigenvalues, is negative. So the condition is $f_{xx}f_{yy}-f_{xy}f_{yx}<0$ in the $2\times 2$ case.
So that's where the multivariable second derivative test comes from: basic linear algebra.
Best Answer
"Diagonalisable" means that there is a linear change of variables so that the matrix is diagonal (obviously). It can also be shown that this change of variables can be effected by an orthogonal matrix. Hence, we have locally at the stationary point the expansion $$ f(x) = f(a) + \frac{1}{2!} (x-a)^T H (x-a) + o(\lvert x-a \rvert^2) = f(a) + y^T \Lambda y + o(\lvert y \rvert^2), $$ where $y=U(x-a)$ for an orthogonal matrix $U$ such that $ U^T H U = \Lambda$ is diagonal of the form $\operatorname{diag}(\lambda_1,\lambda_2,\dotsc,\lambda_n)$. It is clear now that locally the surface $z=2(f(x)-f(a))$ is close to the (diagonal) quadratic form $y^T \Lambda y$.
What does this actually mean? Let's look at the $n=2$ case for simplicity: higher dimensions carry the same idea, just with more complicated shapes. In two variables the quadratic form is $\lambda_1 y_1^2 + \lambda_2 y_2^2$: an upward-opening paraboloid when both eigenvalues are positive, a downward-opening one when both are negative, and a saddle when they have opposite signs.
So the eigenvectors of the Hessian form the principal axes of the paraboloid that acts as a local approximation of the behaviour, with the eigenvalues being related to the relative axis lengths.