[Math] Steepest descent with quadratic form converges in 1 iteration

linear-algebra, numerical-methods, optimization, quadratic-forms

I'm stuck on the following exercise:

The steepest descent method is applied to the quadratic form

$$Q(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}^T A\mathbf{x} - \mathbf{b}^T\mathbf{x} + c$$
where $A$, $\mathbf{b}$, and $c$ are a constant matrix, vector, and scalar, respectively. Under what condition on the matrix $A$ does the steepest descent method converge to the exact minimum in one iteration, from any initial condition $\mathbf{x}_0$? [Hint: If the initial search line $\mathbf{x}_0 + \alpha \mathbf{d}_0$ includes the exact minimum of $Q(\mathbf{x})$, then the method will converge in one iteration.]
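For concreteness, here is a minimal NumPy sketch of one steepest-descent step with exact line search (assuming symmetric positive definite $A$; the step size $\alpha = \mathbf{r}^T\mathbf{r}/(\mathbf{r}^T A\mathbf{r})$, with residual $\mathbf{r} = \mathbf{b} - A\mathbf{x}_0$, is the standard exact minimizer of $Q$ along the search line; the function name is mine):

```python
import numpy as np

def steepest_descent_step(A, b, x0):
    """One steepest-descent step on Q(x) = 0.5 x^T A x - b^T x + c,
    with the step size alpha chosen by exact line search."""
    r = b - A @ x0                    # search direction: negative gradient of Q
    alpha = (r @ r) / (r @ (A @ r))   # exact minimizer of Q along x0 + alpha*r
    return x0 + alpha * r
```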

More specifically, the question asks for a condition on the matrix $A$. Since this is a multiple-choice question, the options are:

  1. $A$ is a multiple of the identity matrix
  2. $A$ is diagonal
  3. $A$ is symmetric
  4. $A$ is positive definite
  5. $A$ has only positive eigenvalues
  6. $A$ is equal to $\mathbf{b}\mathbf{b}^T$
  7. It never converges in 1 iteration
  8. It always converges in 1 iteration

Now, following the hint, the exact solution is found by solving $Q'(\mathbf{x}) = 0$ (taking $A$ to be symmetric, so that $Q'(\mathbf{x}) = A\mathbf{x} - \mathbf{b}$):
$$A\mathbf{\tilde{x}} - \mathbf{b} = 0$$
$$\mathbf{\tilde{x}} = A^{-1} \mathbf{b}$$

From the definition of the steepest descent method ($\mathbf{x}_{n+1} = \mathbf{x}_n - \alpha f'(\mathbf{x}_n)$):

$$\mathbf{x}_1 = \mathbf{x}_0 - \alpha (A\mathbf{x}_0-\mathbf{b})$$

Now $\mathbf{x}_1 = \mathbf{\tilde{x}}$ has to hold for every starting point, with $\alpha$ the step size chosen by the line search. Written out:

$$\mathbf{x}_0 - \alpha (A\mathbf{x}_0-\mathbf{b}) = A^{-1} \mathbf{b} \qquad \text{for all } \mathbf{x}_0 \in \mathbb{R}^n$$

Now I'm stuck: what can be said about $A$? Geometrically it feels like the solution should be easy to find; a "cone-like" surface is one example.
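As a sanity check (a numerical sketch using the `steepest_descent_step` helper above, with values chosen by me, not part of the exercise), a generic diagonal $A$ and a generic $\mathbf{x}_0$ do not give one-step convergence, so some condition on $A$ really is needed:

```python
A = np.diag([1.0, 4.0])          # diagonal, with distinct eigenvalues
b = np.array([1.0, 1.0])
x0 = np.array([3.0, -2.0])       # generic starting point

x1 = steepest_descent_step(A, b, x0)
print(np.allclose(x1, np.linalg.solve(A, b)))  # False: one step is not enough
```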

Best Answer

For a minimizer to exist at all, $A$ must be positive semi-definite and the equation $Ax=b$ must be solvable.

Assume the steepest descent converges in one step, i.e. $x_1$ is the solution. Then $$ 0 = Ax_1 - b = A\bigl(x_0 - \alpha(Ax_0-b)\bigr) - b = (I-\alpha A)(Ax_0-b), $$ hence $Ax_0-b$ has to be an eigenvector of $A$ (or $Ax_0-b=0$). The converse also holds: one-step convergence requires that $Ax_0-b$ be an eigenvector of $A$ associated with a positive eigenvalue.
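To illustrate this condition numerically (a sketch reusing the `steepest_descent_step` helper from the question, with values chosen by me): take a diagonal $A$ and pick $x_0$ so that $Ax_0-b$ is an eigenvector of $A$; one exact-line-search step then lands exactly on the solution:

```python
A = np.diag([1.0, 4.0])
b = np.array([1.0, 1.0])
x0 = np.array([3.0, 0.25])   # A @ x0 - b = (2, 0): an eigenvector of A (eigenvalue 1)

x1 = steepest_descent_step(A, b, x0)
print(np.allclose(x1, np.linalg.solve(A, b)))  # True: convergence in one step
```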

Can you now determine which of the given choices imply one-step convergence?