[Math] Eigenvectors and eigenvalues of Hessian matrix

eigenvalues-eigenvectors hessian-matrix

"Because the Hessian matrix is real and symmetric, we can decompose it
into a set of real eigenvalues and an orthogonal basis of
eigenvectors. The second derivative in a specific direction
represented by a unit vector $d$ is given by $d^T Hd$. When $d$ is an
eigenvector of $H$, the second derivative in that direction is given by
the corresponding eigenvalue."

I didn't understand why
"The second derivative in a specific direction represented by a unit vector d is given by $d^T Hd$".

Best Answer

Since I think you are asking for intuition about the statement "The second derivative in a specific direction represented by a unit vector d is given by $d^T Hd$", let me relate it in two ways to the usual way we think about derivatives. I'll use two dimensions to illustrate in both cases. Let the unit vector $\bar{d}$ be $(n_1,n_2)$ in the standard basis and let $\bar{x}$ denote the point $(x,y)$.

For the shorter explanation, expand the function value $f(\bar{x}+ds \bar{d})$ at a small distance $ds$ from $\bar{x}$ along $\bar{d}$ as a Taylor series. Let $h=n_1ds$ and $k=n_2ds$ denote the corresponding increments along the $x$ and $y$ directions.

$$f(\bar{x}+ds \bar{d})=f(x,y) + hf_x+kf_y + \frac{1}{2}(h^2f_{xx}+ 2hkf_{xy}+ k^2f_{yy}) + \mbox{h.o.t.}$$

$$=f(x,y) + ds(n_1f_x+n_2f_y) + \frac{1}{2}ds^2(n_1^2f_{xx}+ 2n_1n_2f_{xy}+ n_2^2f_{yy}) + \mbox{h.o.t.}$$

$$=f(x,y) + ds (\nabla f \cdot \bar{d} )+ \frac{1}{2}ds^2 (\bar{d}^T H \bar{d} )+ \mbox{h.o.t.}$$

That is, $\nabla f \cdot \bar{d}$ plays the role of the first derivative and $\bar{d}^T H \bar{d}$ plays the role of the second derivative along the direction $\bar{d}$.
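As a rough numerical check of the expansion above, here is a minimal Python sketch. The function `f`, the point `(x0, y0)` and the direction `d` are hypothetical choices made only for illustration; the gradient and Hessian are written out by hand for that particular `f`. If the second-order term really is $\frac{1}{2}ds^2\,\bar{d}^T H \bar{d}$, the gap between the true value and the truncated expansion should shrink roughly like $ds^3$.

```python
import math

# Hypothetical test function, chosen only for illustration.
def f(x, y):
    return x**3 + 2.0 * x * y**2 + math.sin(y)

def grad(x, y):
    # Analytic first partials (f_x, f_y) of the f above.
    return 3.0 * x**2 + 2.0 * y**2, 4.0 * x * y + math.cos(y)

def hessian(x, y):
    # Analytic second partials (f_xx, f_xy, f_yy) of the f above.
    return 6.0 * x, 4.0 * y, 4.0 * x - math.sin(y)

def taylor2(x, y, d, ds):
    # f + ds * (grad f . d) + 0.5 * ds^2 * (d^T H d), with d = (n1, n2).
    n1, n2 = d
    fx, fy = grad(x, y)
    fxx, fxy, fyy = hessian(x, y)
    first = n1 * fx + n2 * fy
    second = n1 * n1 * fxx + 2.0 * n1 * n2 * fxy + n2 * n2 * fyy
    return f(x, y) + ds * first + 0.5 * ds**2 * second

x0, y0 = 0.7, -0.4   # arbitrary base point
d = (0.6, 0.8)       # unit vector: 0.6^2 + 0.8^2 = 1

for ds in (1e-1, 1e-2, 1e-3):
    exact = f(x0 + ds * d[0], y0 + ds * d[1])
    # Remainder after the second-order term; expected to scale like ds^3.
    print(ds, abs(exact - taylor2(x0, y0, d, ds)))
```

Each tenfold reduction in `ds` should cut the printed remainder by about a factor of a thousand, which is what you expect when the terms up to second order are correct.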

The second explanation uses the same idea but, depending on your bent of mind, might be more intuitive. Proceed as in finite differences, where $f_x$ is approximated by $\frac{f(x+\Delta x)-f(x)}{\Delta x}$, with the approximation becoming exact as $\Delta x \rightarrow 0$. The second derivative $f_{xx}$ is likewise approximated by $$\frac{ f_x(x+\frac{\Delta x}{2}) - f_x(x -\frac{\Delta x}{2}) }{\Delta x}$$

$$\approx \frac{ f( x + \Delta x) -2f(x) + f( x - \Delta x) }{\Delta x^2}$$ Now apply that one-dimensional second-derivative idea along the direction $\bar{d}$ to see that, ignoring higher-order terms for now, the second derivative is

$$ \frac{ f( x + h, y+ k) -2f(x,y) + f( x - h, y-k) }{ h^2 + k^2}$$

Using two-dimensional Taylor expansions for $f( x + h, y+ k)$ and $f( x - h, y-k )$ (write them out: the first-order terms cancel and the second-order terms add), and using $h=n_1ds$ and $k=n_2ds$, we see that the second derivative approximation is given by

$$ds^2 \frac{ n_1^2f_{xx}+ 2n_1n_2f_{xy}+ n_2^2f_{yy} }{ ds^2} = ds^2 \frac{ \bar{d}^T H \bar{d} }{ ds^2} = \bar{d}^T H \bar{d} $$ with the higher-order terms vanishing as you take $ds$ to zero. The denominator is $h^2+k^2 = (n_1^2+n_2^2)\,ds^2 = ds^2$ because $\bar{d}$ is a unit vector.
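The finite-difference argument can also be tried numerically. The following is only a sketch under stated assumptions: `f`, the base point and the direction are hypothetical, the Hessian entries are derived by hand for that `f`, and `unit_eigenpair` is an illustrative helper that uses the closed-form eigenvalue of a symmetric 2x2 matrix (it assumes $f_{xy} \neq 0$ at the point). It also touches the eigenvector remark in the quoted passage: if $H\bar{d} = \lambda \bar{d}$ with $\bar{d}$ a unit vector, then $\bar{d}^T H \bar{d} = \lambda\, \bar{d}^T \bar{d} = \lambda$.

```python
import math

# Hypothetical test function, chosen only for illustration.
def f(x, y):
    return x**3 + 2.0 * x * y**2 + math.sin(y)

def hessian(x, y):
    # Analytic second partials (f_xx, f_xy, f_yy) of the f above.
    return 6.0 * x, 4.0 * y, 4.0 * x - math.sin(y)

def quad_form(x, y, d):
    # d^T H d written out as n1^2 f_xx + 2 n1 n2 f_xy + n2^2 f_yy.
    fxx, fxy, fyy = hessian(x, y)
    n1, n2 = d
    return n1 * n1 * fxx + 2.0 * n1 * n2 * fxy + n2 * n2 * fyy

def second_diff(x, y, d, ds):
    # Central second difference along d, with h = n1*ds and k = n2*ds.
    h, k = d[0] * ds, d[1] * ds
    return (f(x + h, y + k) - 2.0 * f(x, y) + f(x - h, y - k)) / (h * h + k * k)

def unit_eigenpair(x, y):
    # Larger eigenvalue of the symmetric 2x2 Hessian and a unit eigenvector
    # (closed form; assumes f_xy != 0 at the point).
    fxx, fxy, fyy = hessian(x, y)
    lam = 0.5 * (fxx + fyy) + math.hypot(0.5 * (fxx - fyy), fxy)
    vx, vy = fxy, lam - fxx
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

x0, y0 = 0.7, -0.4   # arbitrary base point
d = (0.6, 0.8)       # unit vector: 0.6^2 + 0.8^2 = 1

# Generic direction: the finite difference should be close to d^T H d.
print(second_diff(x0, y0, d, 1e-3), quad_form(x0, y0, d))

# Eigenvector direction: both should be close to the eigenvalue.
lam, v = unit_eigenpair(x0, y0)
print(second_diff(x0, y0, v, 1e-3), quad_form(x0, y0, v), lam)
```

The step `ds = 1e-3` is a compromise: a much smaller step makes the subtraction in the numerator lose precision to floating-point round-off, while a much larger one lets the higher-order terms show up.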

I would have liked to expand some of the steps more, but MathJax on a phone is rather painful. I hope one of these explanations felt intuitive to you. Please leave a comment if more clarification is needed.