Relationship between minimization of a quadratic form in $\mathbb{R}^2$ and the second eigenvalue/eigenvector

eigenvalues-eigenvectors, quadratic-forms

I have come across the following statement in https://www.sciencedirect.com/science/article/abs/pii/0031320394001256

The statement (in the right middle of page 785) is, essentially, the following:

The vector $\boldsymbol{x}\in\mathbb{R}^2$ minimizing $\boldsymbol{x}^\top\boldsymbol{A}\boldsymbol{x}$, where $\boldsymbol{A}$ is a symmetric positive definite $2\times 2$ matrix, is the eigenvector corresponding to the smaller eigenvalue of $\boldsymbol{A}$.

I understand that the eigenvector corresponding to the larger eigenvalue maximizes the quadratic form. The eigenvector corresponding to the second (here, the smaller) eigenvalue, I thought, was the maximizer of the quadratic form among unit vectors orthogonal to the eigenvector corresponding to the first eigenvalue. Is it also the global minimizer? I am looking for a formal proof/explanation.

Best Answer

You seem to be very distracted by the $\mathbb{R}^2$, so let's generalize to $\mathbb{R}^n$. Write a unit vector $x = \sum_{i=1}^n c_i v_i$ as a linear combination of orthonormal eigenvectors $v_1, \dots, v_n$ of $A$, so that $\| x \|^2 = \sum_i c_i^2 = 1$. We assume the eigenvalues have been sorted so that $\lambda_1 \ge \dots \ge \lambda_n$. Now write

$$\langle x, Ax \rangle = \left\langle \sum_{i=1}^n c_i v_i, \sum_{i=1}^n c_i \lambda_i v_i \right\rangle = \sum_{i=1}^n \lambda_i c_i^2.$$
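As a quick numerical sanity check of this identity, here is a minimal numpy sketch (the matrix $A$ below is an arbitrary randomly generated symmetric positive definite matrix, chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary symmetric positive definite matrix (B B^T is PSD; adding I makes it PD).
B = rng.standard_normal((4, 4))
A = B @ B.T + np.eye(4)

# eigh returns eigenvalues w (ascending) and orthonormal eigenvectors as columns of V.
w, V = np.linalg.eigh(A)

# A random unit vector x, expanded in the eigenbasis: c_i = <v_i, x>.
x = rng.standard_normal(4)
x /= np.linalg.norm(x)
c = V.T @ x

print(x @ A @ x)         # the quadratic form directly
print(np.sum(w * c**2))  # sum_i lambda_i c_i^2 -- the two agree
```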

Now the minimization problem is: which unit vector $x$ (you need this condition, or else the minimum is attained trivially at $x = 0$) minimizes this sum? Equivalently, we want to minimize $\sum_i \lambda_i c_i^2$ subject to the constraint $\sum_i c_i^2 = 1$. This is a very easy Lagrange multiplier calculation; alternatively, since $\lambda_i \ge \lambda_n$ for every $i$, we have $\sum_i \lambda_i c_i^2 \ge \lambda_n \sum_i c_i^2 = \lambda_n$ directly. Either way, the answer is that we should take $c_n^2 = 1$ and all other $c_i = 0$, with minimum value $\lambda_n$, the smallest eigenvalue.
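For concreteness, a short numpy sketch (again with an arbitrary random positive definite $A$) confirming that the unit eigenvector for the smallest eigenvalue attains the minimum $\lambda_n$, and that random unit vectors do no better:

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary symmetric positive definite matrix.
B = rng.standard_normal((5, 5))
A = B @ B.T + np.eye(5)

w, V = np.linalg.eigh(A)  # eigenvalues ascending, so w[0] = lambda_n
v_min = V[:, 0]           # unit eigenvector for the smallest eigenvalue

# The claimed minimizer attains the value lambda_n ...
print(v_min @ A @ v_min, w[0])

# ... and no random unit vector does better (up to floating point slack).
xs = rng.standard_normal((10000, 5))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
q = np.einsum('ij,jk,ik->i', xs, A, xs)  # q[i] = xs[i]^T A xs[i]
print(q.min() >= w[0] - 1e-9)            # True
```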

When $n = 2$, this smallest eigenvalue is $\lambda_2$, which also happens to be the second largest eigenvalue, and hence it also solves a maximization problem.

It's not that "the maximization problem becomes a minimization problem"; rather, there are two problems, a maximization problem and a minimization problem, and $\lambda_2$ happens to be the answer to both of them, because it happens to be both the smallest and the second largest eigenvalue in this case. The distinction becomes clear for larger values of $n$, where $\lambda_2 \neq \lambda_n$ in general.
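To see the distinction concretely for $n = 3$, here is a sketch (arbitrary random $A$ again) estimating the maximum of the quadratic form over unit vectors orthogonal to the top eigenvector, which is $\lambda_2$, while the global minimum over unit vectors is $\lambda_3 \ne \lambda_2$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary symmetric positive definite 3x3 matrix.
B = rng.standard_normal((3, 3))
A = B @ B.T + np.eye(3)

w, V = np.linalg.eigh(A)   # ascending order
lam_n, lam_2 = w[0], w[1]  # smallest and second largest eigenvalue
v1 = V[:, 2]               # eigenvector for the largest eigenvalue

# Unit vectors orthogonal to v1: project out the v1 component, then normalize.
xs = rng.standard_normal((100000, 3))
xs -= np.outer(xs @ v1, v1)
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
q = np.einsum('ij,jk,ik->i', xs, A, xs)

print(q.max(), lam_2)  # constrained maximum is ~ lambda_2 ...
print(lam_n)           # ... which differs from the global minimum lambda_n
```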

More generally, every singular value can be characterized both as the solution to a maximization problem and as the solution to a minimization problem; see e.g. this blog post for one way to do this (search for "variational").
