[Math] Second-smallest eigenvalue as $\min_x \frac{x^TAx}{x^Tx}$

eigenvalues-eigenvectors, linear-algebra, optimization

In Mining Massive Datasets, page 365, the following theorem is stated without proof:

Let $A$ be a symmetric matrix. Then the second-smallest eigenvalue of $A$ is equal to
$\displaystyle \min_{x} {x^TAx}$, where the minimum is taken over all unit vectors $x$ that are orthogonal to the eigenvector $v_1$ associated with the smallest eigenvalue.

This phrasing seems somewhat sloppy. Consider the case where the smallest eigenvalue has an eigenspace of dimension $>1$: I think we will not really get the second-smallest eigenvalue $\lambda _2$ but the smallest eigenvalue $\lambda _1$ itself, because we can take another unit eigenvector of $\lambda _1$ that is orthogonal to $v_1$; it satisfies the constraints of the minimization problem and gives $x^TAx=x^T \lambda x =\lambda$.
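To make this concrete, here is a minimal numerical sketch (the diagonal matrix is just a hypothetical example of my own) in which the smallest eigenvalue $1$ has a two-dimensional eigenspace, so the constrained minimum returns $1$ again rather than the distinct second-smallest value $3$:

```python
import numpy as np

# Hypothetical example: the smallest eigenvalue 1 has a 2-dimensional eigenspace,
# and the distinct second-smallest eigenvalue is 3.
A = np.diag([1.0, 1.0, 3.0])

v1 = np.array([1.0, 0.0, 0.0])   # one unit eigenvector for the smallest eigenvalue
x  = np.array([0.0, 1.0, 0.0])   # another unit eigenvector for the same eigenvalue

print(x @ v1)       # 0.0 -> x satisfies the orthogonality constraint
print(x @ A @ x)    # 1.0 -> the constrained minimum is lambda_1 again, not 3
```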

So probably the right two theorems are:

  1. The $k^{th}$ smallest (distinct) eigenvalue of $A$ is equal to
    $\displaystyle \min_{x} {x^TAx}$, where the minimum is taken over all unit vectors $x$ that are orthogonal to all of the eigenvectors associated with the $k-1$ smallest eigenvalues.

  2. The second-smallest eigenvalue of $A$, counted with multiplicity according to the dimensions of the eigenspaces, is equal to
    $\displaystyle \min_{x} {x^TAx}$, where the minimum is taken over all unit vectors $x$ that are orthogonal to an eigenvector associated with the smallest eigenvalue.

For the $k^{th}$ smallest eigenvalue in the sense of the second proposition, we would ask for orthogonality to orthonormal eigenvectors $v_1,\dots,v_{k-1}$ associated with the $k-1$ smaller eigenvalues (see the numerical sketch below).
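Here is a small numerical sketch of this constrained minimization (the helper `constrained_min`, the random test matrix, and the seed are my own illustrative choices, not from the book): restricting $A$ to the orthogonal complement of the first $k-1$ eigenvectors and taking the smallest eigenvalue of the restriction recovers $\lambda_k$ counted with multiplicity.

```python
import numpy as np

def constrained_min(A, W):
    """Minimize x^T A x over unit x orthogonal to the columns of W,
    by restricting A to the orthogonal complement of span(W)."""
    _, _, Vt = np.linalg.svd(W.T)           # null space of W^T = complement of span(W)
    B = Vt[W.shape[1]:].T                    # orthonormal basis of that complement
    return np.linalg.eigvalsh(B.T @ A @ B).min()

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                            # a random symmetric test matrix
lam, V = np.linalg.eigh(A)                   # eigenvalues ascending, with multiplicity

for k in range(2, 6):
    # orthogonality to eigenvectors of the k-1 smallest eigenvalues gives lambda_k
    print(k, constrained_min(A, V[:, :k-1]), lam[k-1])
```

Since a generic random symmetric matrix has distinct eigenvalues, this also illustrates the first proposition at the same time.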

Is this right? And how do I prove, in both cases, that $\displaystyle \min_{x} {x^TAx} \geq \lambda _k$?

This answer shows that without the orthogonality constraint we get the smallest eigenvalue, because an $x$ attaining the minimum is an eigenvector of $\frac{A+A^T}{2}=A$, and hence the value of the quadratic form at it is its associated eigenvalue. The answer there also seems to hint that an assumption of positive semi-definiteness of $A$ is necessary; is it needed? The book does not require it.
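As a quick, hedged sanity check on that last point, here is a numerical sketch with an indefinite (hence not positive semi-definite) symmetric matrix of my own choosing; sampling unit vectors suggests the unconstrained minimum of $x^TAx$ is still the smallest (here negative) eigenvalue, so symmetry alone appears to suffice:

```python
import numpy as np

# An indefinite symmetric matrix (not positive semi-definite), chosen for illustration:
A = np.array([[0.0,  2.0],
              [2.0, -1.0]])

lam, _ = np.linalg.eigh(A)                  # ascending; lam[0] is negative here
print(lam)

# Sample many unit vectors; x^T A x never falls below the smallest eigenvalue.
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 10000))
X /= np.linalg.norm(X, axis=0)
vals = np.einsum('ij,ik,kj->j', X, A, X)    # x^T A x for every sampled column x
print(vals.min(), lam[0])                   # sampled minimum approaches lam[0]
```

Sampling is only suggestive, of course, but note that the proof in the accepted answer below also uses nothing beyond symmetry.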

Best Answer

The name "min-max theorem" given in the comments lead to the following very elegant proof.

To repeat in our context: given a symmetric matrix $A$, we construct an orthonormal basis $\{ v_1,b_2,\dots,b_n \}$ of $\mathbb{R}^n$ consisting of eigenvectors of $A$ (which is possible due to the symmetry of $A$), where $v_1$ is the eigenvector from the theorem and the remaining vectors are ordered so that $Ab_i=\lambda_i b_i$ with $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n$. As we only look at the subspace orthogonal to $v_1$ in the minimization, $\{ b_2,\dots,b_n \}$ is an orthonormal basis of that subspace.

We note that the dot product can be computed coordinate-wise in any orthonormal basis, and hence, writing a unit vector $x$ of this orthogonal subspace as $x=\sum_{i=2}^n x_i b_i$: $${x^TAx}=\Big(\sum_{j=2}^n x_j b_j^T\Big)A\Big(\sum_{i=2}^n x_ib_i\Big)=\sum _{i,j}x_i x_j\, b_j^T A b_i=\sum _{i,j}x_i x_j\, b_j^T \lambda _i b_i,$$ since $Ab_i=\lambda_i b_i$. Now we use $b_j^T b_i=\delta_{ij}$, as they form an orthonormal basis, to get: $$x^TAx=\sum _{i=2}^n\lambda_i x_i^2 \geq \sum _{i=2}^n\lambda_2 x_i^2=\lambda_2.$$

Since $x$ is of unit length, $\sum_{i=2}^n x_i^2=1$, which gives the last equality. We have thus proved that for every $x$ over which we minimize, ${x^TAx} \geq \lambda_2$, and hence $\min{x^TAx} \geq \lambda_2$. The other direction is easy and is written in the question itself. $\square$
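For what it's worth, the key identity $x^TAx=\sum_i \lambda_i x_i^2$ (with $x_i$ the coordinates of $x$ in an orthonormal eigenbasis) is easy to check numerically; the random symmetric matrix and seed below are just illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                 # a random symmetric test matrix

lam, B = np.linalg.eigh(A)        # columns of B: an orthonormal eigenbasis of A
x = rng.standard_normal(4)
x /= np.linalg.norm(x)            # a unit vector

coords = B.T @ x                  # coordinates x_i of x in the eigenbasis
print(np.isclose(x @ A @ x, np.sum(lam * coords**2)))   # True
```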

Indeed, the proof assumed that we count the eigenvalues according to their algebraic multiplicity, $\lambda_1 \leq \dots \leq \lambda _n$.
