[Math] Solving a matrix equation using numerical optimization

linear algebra, matrices, numerical linear algebra, numerical optimization, optimization

To my knowledge, if $A \in \mathbf{S}^n_{++}$, then given any $b \in \mathbb{R}^n$, the system of linear equations $Ax = b$ has a unique solution $x^* \in \mathbb{R}^n$. Moreover, this solution $x^*$ minimizes the objective function
$$f(x) = \frac{1}{2}x^TAx - b^Tx$$
It is known that $f$ is a smooth convex function. The gradient and the Hessian of $f$ are, respectively,
$$\nabla f(x) = Ax - b$$
$$\nabla^2 f(x) = A \succ 0$$
So in fact the unique solution $x^*$ of $Ax = b$ satisfies the optimality conditions. Although it is wasteful, we can apply any convex optimization method to $f$ in order to find the solution $x^*$.
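To illustrate, here is a minimal sketch (with a made-up small SPD system) showing that even plain gradient descent on $f$ recovers the solution of $Ax = b$:

```python
import numpy as np

# Hypothetical small SPD system: A = M^T M + I is symmetric positive definite.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M.T @ M + np.eye(4)
b = rng.standard_normal(4)

# Plain gradient descent on f(x) = 1/2 x^T A x - b^T x.
# grad f(x) = A x - b, so the fixed point is exactly the solution of A x = b.
x = np.zeros(4)
step = 1.0 / np.linalg.norm(A, 2)   # step < 2/lambda_max guarantees convergence
for _ in range(10000):
    x = x - step * (A @ x - b)

print(np.allclose(x, np.linalg.solve(A, b), atol=1e-8))  # True
```

Conjugate gradient would of course converge far faster here; the point is only that any descent method on $f$ works for SPD $A$.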


Now my question is: what if $A$ is a general nonsingular square matrix (not necessarily SPD) and we follow the same idea to define an objective function
$$f(x) = \frac{1}{2}x^TAx - b^Tx$$
where $b \in R(A) = \mathbb{R}^n$. Then

  1. Is the unique solution $x^*$ of $Ax = b$ an optimal point of $f$?
  2. What kind of optimization methods would converge to $x^*$ given any initial guess (with the fastest rate)?
  3. Any improvements if $A$ has a positive real symmetric part?

My guess for the first question is that $x^*$ may be a saddle point of $f$, satisfying the first-order optimality condition but not the second-order one. I don't know anything about non-convex optimization, so I have no ideas for the second and third questions. I know there are numerical linear algebra methods like conjugate gradient, which converges for SPD matrices, and GMRES, which converges for any nonsingular matrix, but I don't have much knowledge about the underlying principles or which methods are best.
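A quick numeric check of the saddle-point guess (a sketch with a made-up $2\times 2$ example): take a nonsingular $A$ whose symmetric part is indefinite.

```python
import numpy as np

# Hypothetical example: A is nonsingular but its symmetric part is indefinite.
A = np.array([[1.0, 4.0],
              [0.0, -1.0]])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)          # unique solution of A x = b

# The Hessian of f(x) = 1/2 x^T A x - b^T x is the symmetric part (A + A^T)/2;
# its eigenvalues have both signs here, so any critical point is a saddle point.
H = 0.5 * (A + A.T)
print(np.linalg.eigvalsh(H))            # one negative, one positive eigenvalue

# Worse: x_star need not even be a critical point of f, because for
# nonsymmetric A the gradient is (A + A^T)/2 x - b, not A x - b.
print(H @ x_star - b)                   # nonzero gradient at x_star
```

So for nonsymmetric $A$ the minimization reformulation breaks in two ways: the critical-point equation changes, and the stationary point can be a saddle.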

Any ideas and references are welcome. Thanks in advance. 🙂

Best Answer

There are several mistakes in the body of the question.

  1. For a general (nonsymmetric) $A$, the gradient is $\nabla f(x)=\frac{1}{2}(A+A^T)x-b$, not $Ax-b$.

Assume that $A$ is a square matrix. Then $x^*$ is a critical point of $f$ iff $\frac{1}{2}(A+A^T)x^*=b$; as Thoth wrote, putting $A:=\frac{1}{2}(A+A^T)$, we may assume that $A$ is symmetric, $f(x)=\frac{1}{2}x^TAx-b^Tx$, and the required equation becomes (*) $Ax^*=b$.
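This gradient formula can be sanity-checked numerically (a sketch on a random nonsymmetric $A$; the finite-difference gradient is compared against the closed form):

```python
import numpy as np

# Check the gradient formula for nonsymmetric A:
# grad f(x) = (A + A^T)/2 x - b, so critical points solve the *symmetrized*
# system, not A x = b.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
b = rng.standard_normal(3)
x = rng.standard_normal(3)

f = lambda x: 0.5 * x @ A @ x - b @ x

# Central finite-difference gradient vs the closed form.
eps = 1e-6
g_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                 for e in np.eye(3)])
g = 0.5 * (A + A.T) @ x - b
print(np.allclose(g_fd, g, atol=1e-6))   # True
```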

  2. $A$ (after symmetrization) must be invertible; otherwise (*) has either no solution or infinitely many solutions $x^*$.

Your equation (*) is equivalent to $Bx^*=c$, where $B=A^2$ (a symmetric positive definite matrix, since $A$ is symmetric and invertible) and $c=Ab$. In this way, the second problem is reduced to the first one: find the minimizer of $\frac{1}{2}x^TBx-c^Tx$.
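A sketch of this reduction, assuming $A$ is symmetric and invertible but possibly indefinite: conjugate gradient is run on $B = A^2$ (which equals $A^TA$ here, so it is SPD), with a textbook hand-rolled CG for clarity. For nonsymmetric $A$, the analogous trick is the normal equations $A^TAx = A^Tb$.

```python
import numpy as np

def cg(B, c, tol=1e-10, maxiter=1000):
    """Plain conjugate gradient for an SPD matrix B (textbook version)."""
    x = np.zeros_like(c)
    r = c - B @ x
    p = r.copy()
    rr = r @ r
    for _ in range(maxiter):
        Bp = B @ p
        alpha = rr / (p @ Bp)
        x += alpha * p
        r -= alpha * Bp
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# Made-up symmetric (generically indefinite) test matrix.
rng = np.random.default_rng(2)
S = rng.standard_normal((5, 5))
A = S + S.T                 # symmetric, generically indefinite
b = rng.standard_normal(5)

B = A @ A                   # SPD since A is symmetric and invertible
x = cg(B, A @ b)
print(np.allclose(A @ x, b, atol=1e-6))  # True
```

The price of this squaring is that the condition number of $B$ is the square of that of $A$, which slows CG; this is why methods like MINRES (for symmetric indefinite systems) or GMRES (for general systems) are usually preferred in practice.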
