Eigenvalue bound for quadratic maximization with linear constraint

eigenvalues-eigenvectors, linear algebra, matrices, positive definite, qcqp

This builds on my earlier questions here and here.


Let $B$ be a symmetric positive definite matrix in $\mathbb{R}^{k\times k}$ and consider the problem

$$\begin{array}{ll} \text{maximize} & x^\top B x\\ \text{subject to} & \|x\|=1 \\ & b^\top x = a\end{array}$$

where $b$ is an arbitrary unit vector and $a > 0$ is a small positive number. Let $$\lambda_1 > \lambda_2 \geq \cdots \geq \lambda_k > 0$$ be the eigenvalues of $B$ with corresponding eigenvectors $z_1,…,z_k$. I conjecture that the optimal value of the problem is bounded below by $a^2 \lambda_1 + \left(1-a^2\right)\lambda_2$, at least if $a$ is small enough.
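(Not part of the original question, just a quick experiment.) The conjecture is easy to probe numerically: draw a random symmetric positive definite $B$ and a random unit vector $b$, maximize $x^\top Bx$ under the two constraints with a general-purpose solver, and compare the result with $a^2\lambda_1 + (1-a^2)\lambda_2$. The sketch below assumes NumPy/SciPy and uses SLSQP with several random restarts, which is only a heuristic for the global maximum.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
k, a = 5, 0.1

# random symmetric positive definite B and random unit vector b
A = rng.standard_normal((k, k))
B = A @ A.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)

lam = np.sort(np.linalg.eigvalsh(B))[::-1]        # lam[0] >= lam[1] >= ...
bound = a**2 * lam[0] + (1 - a**2) * lam[1]       # conjectured lower bound

# maximize x^T B x  subject to  ||x|| = 1  and  b^T x = a
cons = [{'type': 'eq', 'fun': lambda x: x @ x - 1.0},
        {'type': 'eq', 'fun': lambda x: b @ x - a}]
best = -np.inf
for _ in range(20):                               # random restarts (heuristic)
    x0 = rng.standard_normal(k)
    x0 /= np.linalg.norm(x0)
    res = minimize(lambda x: -(x @ B @ x), x0, constraints=cons)
    if res.success:
        best = max(best, -res.fun)

print(f"optimal value ~= {best:.6f}   conjectured bound = {bound:.6f}")
```

In the instances I tried the computed optimum sits at or above the printed bound for small $a$, but since the solver only finds local maxima, this is evidence rather than a proof.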


To motivate this conjecture, let us consider two special cases. First, suppose that $a= 0$. Then, as was explained to me in one of my previous posts, the optimal value is between $\lambda_1$ and $\lambda_2$ by the Courant-Fischer theorem. Thus, $\lambda_2$ is a lower bound, and it also coincides with my conjectured lower bound in this special case.

Second, let $a > 0$ but suppose that $b = z_i$ for some $i = 1,…,k$. Any feasible $x$ can be written as

$$x = ab + \sqrt{1-a^2} \cdot \hat{b}$$

where $\hat{b}$ is a unit vector with $\hat{b}\perp b$. If $b = z_1$, I can take $\hat{b} = z_2$, which gives $x^\top Bx = a^2 \lambda_1 + \left(1-a^2\right)\lambda_2$ exactly. If $b = z_i$ for $i \neq 1$, I can take $\hat{b} = z_1$, which gives $x^\top Bx = a^2 \lambda_i + \left(1-a^2\right)\lambda_1$; this is at least $a^2 \lambda_1 + \left(1-a^2\right)\lambda_2$ whenever $a^2\left(\lambda_1-\lambda_i\right) \leq \left(1-a^2\right)\left(\lambda_1-\lambda_2\right)$, which holds for all sufficiently small $a$ (note that this requires $\lambda_1 > \lambda_2$). Either way, the objective value of $x$ is bounded below by $a^2 \lambda_1 + \left(1-a^2\right)\lambda_2$ as long as $a$ is small enough.

The difficulty is showing that it holds in the case where $b$ is not one of the eigenvectors of $B$ (perhaps with additional restrictions on how large $a$ can be). My intuition is that, if $b$ is not required to be orthogonal to $x$, but only "almost" orthogonal (meaning that $a$ may be required to be sufficiently small), you should be able to go a bit further in the direction of the principal eigenvector than in the case where $a = 0$.


Here is the most up-to-date work on this problem. In the answer below, it was found that the optimal value $v$ of the problem is a generalized eigenvalue of the system

$$PBx = vPx,$$

which in turn was derived from the system

$$PBPy + aPBb = v Py.$$

Any pair $\left(y,v\right)$ that solves these equations then leads to a feasible $x = ab+Py$, with $v$ being the objective value.

We can write

$$\left(vI - PB\right)Py = aPBb.$$

Note that, for any $v$ that is not an eigenvalue of $PB$, the matrix $vI-PB$ is invertible, whence

$$Py = a\left(vI-PB\right)^{-1}PBb.$$

The normalization $x^\top x = 1$ then becomes $y^\top P y = 1-a^2$, leading to the equation

$$\frac{1-a^2}{a^2} = b^\top BP\left(vI-PB\right)^{-2} PBb.$$

The largest root of this equation is the optimal value of the problem. Perhaps, as suggested, it can be found numerically.
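Here is a sketch of that numerical route (my own addition, using randomly generated data rather than anything from the original problem). For a generic $b$ (in particular $PBb \neq 0$), the right-hand side blows up just above the largest eigenvalue of $PBP$ and decays to zero as $v\to\infty$, so the largest root can be bracketed there and handed to a standard root finder; the corresponding $x = ab + Py$ is then feasible and attains the value $v$.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
k, a = 5, 0.1

A = rng.standard_normal((k, k))
B = A @ A.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)

P = np.eye(k) - np.outer(b, b)           # projector onto the complement of b
PBb = P @ B @ b                          # zero only if b is an eigenvector of B

def rhs(v):
    # (vI - PB)^{-1} PBb lies in range(P), so this squared norm
    # equals the right-hand side  b^T BP (vI - PB)^{-2} PBb
    w = np.linalg.solve(v * np.eye(k) - P @ B, PBb)
    return w @ w

target = (1 - a**2) / a**2

# generically rhs(v) -> +inf as v approaches the top eigenvalue of PBP from above,
# and rhs(v) -> 0 as v -> inf, so the largest root lives in this bracket
mu1 = np.linalg.eigvalsh(P @ B @ P)[-1]
lo, hi = mu1 + 1e-9, np.linalg.eigvalsh(B)[-1] + 1.0
while rhs(hi) > target:                  # widen the bracket if necessary
    hi *= 2.0
v_opt = brentq(lambda v: rhs(v) - target, lo, hi)

# recover the maximizer x = a b + Py and compare with the conjectured bound
Py = a * np.linalg.solve(v_opt * np.eye(k) - P @ B, PBb)
x = a * b + Py
lam = np.sort(np.linalg.eigvalsh(B))[::-1]
print("v_opt =", v_opt, " x'Bx =", x @ B @ x)
print("||x|| =", np.linalg.norm(x), " b'x =", b @ x)
print("conjectured bound =", a**2 * lam[0] + (1 - a**2) * lam[1])
```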

Best Answer

The following analysis explores various approaches to the problem, but ultimately fails to produce a satisfactory solution.

One of the constraints can be rewritten using the nullspace projector of $b$
$$\eqalign{ P &= \Big(I-(b^T)^+b^T\Big) = \left(I-\frac{bb^T}{b^Tb}\right) \;=\; I-\beta bb^T \\ Pb &= 0,\qquad P^2=P=P^T \\ }$$
and the introduction of an unconstrained vector $y$
$$\eqalign{ b^Tx &= a \\ x &= Py + (b^T)^+a \\ &= Py + a\beta b \\ &= Py + \alpha_0 b \\ }$$
The remaining constraint can be absorbed into the definition of the objective function itself
$$\lambda = \frac{x^TBx}{x^Tx} \;=\; \frac{y^TPBPy +2\alpha_0y^TPBb +\alpha_0^2\,b^TBb}{y^TPy +\alpha_0^2\,b^Tb} \;=\; \frac{\theta_1}{\theta_2} \tag{0}$$
The gradient can be calculated by a straightforward (if tedious) application of the quotient rule as
$$\frac{\partial\lambda}{\partial y} = \frac{2\theta_2(PBPy +\alpha_0PBb)-2\theta_1Py}{\theta_2^2}$$
Setting the gradient to zero yields
$$PBPy +\alpha_0PBb = \lambda Py \tag{1}$$
which can be rearranged into a generalized eigenvalue equation
$$\eqalign{ PB\left(Py+\alpha_0b\right) &= \lambda Py \\ PBx &= \lambda Px \tag{2} \\ }$$
Note that multiplying the standard eigenvalue equation
$$Bx = \lambda x \tag{3}$$
by $P$ reproduces equation $(2)$. So both standard and generalized eigenvalues are potential solutions.
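A quick numerical sanity check of these identities may be useful (my own addition; any random symmetric positive definite $B$ and unit $b$ will do). It verifies the projector properties, that $x = Py + \alpha_0 b$ satisfies $b^Tx = a$ for every $y$, and that the ratio $(0)$ agrees with the Rayleigh quotient $x^TBx/x^Tx$.

```python
import numpy as np

rng = np.random.default_rng(2)
k, a = 5, 0.1

A = rng.standard_normal((k, k))
B = A @ A.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)

beta = 1.0 / (b @ b)                    # = 1 for a unit b
P = np.eye(k) - beta * np.outer(b, b)   # nullspace projector of b

# projector identities:  Pb = 0,  P^2 = P = P^T
assert np.allclose(P @ b, 0)
assert np.allclose(P @ P, P) and np.allclose(P, P.T)

# the parametrization x = Py + a*beta*b meets the linear constraint for any y
y = rng.standard_normal(k)
alpha0 = a * beta
x = P @ y + alpha0 * b
assert np.isclose(b @ x, a)

# the ratio (0) agrees with the Rayleigh quotient x^T B x / x^T x
theta1 = y @ P @ B @ P @ y + 2 * alpha0 * (y @ P @ B @ b) + alpha0**2 * (b @ B @ b)
theta2 = y @ P @ y + alpha0**2 * (b @ b)
assert np.isclose(theta1 / theta2, (x @ B @ x) / (x @ x))
```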

Unlike the discrete $\lambda$ values yielded by the eigenvalue methods, equation $({1})$ is solvable for a continuous range of $\lambda$
$$\eqalign{ y &= \alpha_0(\lambda P-PBP)^+PBb \\ }$$ and produces a $y$ vector which satisfies the zero gradient condition $({1})$.
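For what it is worth, here is a small sketch (again with assumed random data) confirming that picking an arbitrary $\lambda$ above the spectrum of $B$ and forming $y$ from this pseudoinverse formula does satisfy the zero-gradient condition $(1)$, even though nothing forces the resulting $x = Py + \alpha_0 b$ to have unit norm:

```python
import numpy as np

rng = np.random.default_rng(3)
k, a = 5, 0.1

A = rng.standard_normal((k, k))
B = A @ A.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)

P = np.eye(k) - np.outer(b, b)
alpha0 = a                                   # a*beta, with beta = 1 for a unit b

lam = np.linalg.eigvalsh(B)[-1] + 0.5        # any lambda outside the spectrum of PBP
y = alpha0 * np.linalg.pinv(lam * P - P @ B @ P) @ (P @ B @ b)

# zero-gradient condition (1):  PBPy + alpha0*PBb = lambda*Py
lhs = P @ B @ P @ y + alpha0 * P @ B @ b
rhs = lam * (P @ y)
print("condition (1) holds:", np.allclose(lhs, rhs))

# ...but the norm constraint is generally violated
x = P @ y + alpha0 * b
print("||x|| =", np.linalg.norm(x), "  b'x =", b @ x)
```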

Unfortunately, none of these approaches yields a solution which satisfies all of the constraints.

But solving equation $(0)$ for an optimal $y$ vector is still the appropriate goal, and requires a numerical rather than an analytical approach.
