Diagonal Elements of the Hessian matrix and negative definiteness

eigenvalues-eigenvectorshessian-matrixmachine learningpositive definitepositive-semidefinite

I am currently going through this paper, but am slightly stuck on understanding how the authors arrived at theorem 3. For better understanding, I am including the relevant theorems that have been stated in the paper.


Theorem 2

The Hessian $\mathbb{H}_x$ is positive semi-definite


Theorem 3

If $L$ is the largest eigenvalue of $\mathbb{H}_x$, for any
value of $\lambda_2 > L/2$, the second-order interpretation objective
function (5) is strongly concave.


I am fine with the given proof for theorem 2, and I am also fine with the second half of the proof of theorem 3, but I am confused by the following. The authors prove strong concavity of the loss function by subtracting the largest eigenvalue of the Hessian from all of the diagonal elements. Apparently this turns the positive semi-definite Hessian into something that is negative definite.

In the proof (page 11), it is written

Therefore if $\lambda_2 > L/2$, $\mathbb{H}_x – 2\lambda_2\mathbb{I}$ is negative definite.

How does subtracting all the diagonal elements of a positive semi-definite matrix by a value larger than the largest eigenvalue turn it into a negative definite matrix

Best Answer

Since the Hessian is positive semidefinite, we can perform a spectral decomposition and write $H = P\Lambda P^{T}$, such that $P^T = P^{-1}$ and $\Lambda$ is a diagonal matrix with eigenvalues of $H$ on its diagonal. Subtracting the diagonal elements of $H$ by $L$, we have

$$ H - LI = P\Lambda P^{T} - LPP^{-1} = P\Lambda P^{T} - L P I P^{T} = P(\Lambda - LI) P^{T}. $$

Since $L$ is the largest eigenvalue of $H$, the eigenvalues of $H - LI$ are the same as those of $\Lambda - LI$, and the largest of which is $L - L = 0$, so $H - LI$ is negative semidefinite. Thus, if we take any $\epsilon > 0$, the eigenvalues of $H - (L + \epsilon)I$ become strictly negative, so the matrix itself is negative definite since it's also symmetric (or Hermitian, depending on the underlying field).