When the Hessian matrix is indefinite, why does the point have to be a saddle point

hessian-matrix, multivariable-calculus

Simply a question that occurred to me, for which I can't seem to find a proof. I realise that if the Hessian matrix is indefinite, its determinant is less than zero (in two dimensions), but how does that imply the point is a saddle point?

Best Answer

I presume we're talking about a twice differentiable function $f$ defined in a neighbourhood of a point $p = (p_1, p_2, \ldots, p_n)$ such that the gradient $\nabla f(p) = 0$ and the Hessian matrix $H = H(f)(p)$ is indefinite. Since $H$ is symmetric, its eigenvalues are real, and indefiniteness means there exist eigenvectors $u$ and $v$ of $H$ corresponding to eigenvalues $\lambda > 0$ and $\mu < 0$ respectively. From the Taylor expansion
$$ f(p + \epsilon x) = f(p) + \epsilon x^T \nabla f(p) + \frac{\epsilon^2}{2} x^T H x + o(\epsilon^2)$$
we find that
$$f(p + \epsilon u) = f(p) + \frac{\epsilon^2 \lambda u^T u}{2} + o(\epsilon^2)$$
and similarly
$$f(p + \epsilon v) = f(p) + \frac{\epsilon^2 \mu v^T v}{2} + o(\epsilon^2),$$
so $f(p + \epsilon u) > f(p) > f(p + \epsilon v)$ for sufficiently small $\epsilon > 0$: $f$ increases along $u$ and decreases along $v$, and thus $p$ is a saddle point.
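The argument can be checked numerically. Here is a minimal sketch using the standard example $f(x, y) = x^2 - y^2$ (my choice of example, not from the answer above), whose Hessian at the origin is $\operatorname{diag}(2, -2)$ and hence indefinite:

```python
import numpy as np

# Hypothetical illustration: f(x, y) = x^2 - y^2 has a critical point at
# the origin, where the Hessian diag(2, -2) is indefinite.
def f(x):
    return x[0]**2 - x[1]**2

p = np.zeros(2)              # critical point: the gradient of f vanishes here
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])  # Hessian of f at p

# eigh returns eigenvalues in ascending order, with matching eigenvector columns
eigvals, eigvecs = np.linalg.eigh(H)
v = eigvecs[:, 0]            # eigenvector for the negative eigenvalue (-2)
u = eigvecs[:, 1]            # eigenvector for the positive eigenvalue (+2)

eps = 1e-3
print(f(p + eps * u) > f(p))  # f increases along u
print(f(p + eps * v) < f(p))  # f decreases along v
```

Both comparisons print `True`: moving away from $p$ along $u$ raises the value of $f$ while moving along $v$ lowers it, which is exactly the saddle behaviour derived from the Taylor expansion.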