Simple question about the infeasible start Newton method for equality constrained problems.

algorithms · convex optimization · lagrange multiplier · optimization

I am reading about the infeasible start Newton method for equality constrained problems in the book Convex Optimization by Stephen Boyd and Lieven Vandenberghe. The algorithm is stated as follows:

[Image: statement of the infeasible start Newton method algorithm]
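Paraphrasing the statement from the book for reference: given a starting point $x \in \text{dom } f$, a dual vector $\nu$, a tolerance $\epsilon > 0$, and backtracking parameters $\alpha \in (0, 1/2)$ and $\beta \in (0, 1)$, the method repeats the following until $Ax = b$ and $\|r(x,\nu)\|_2 \le \epsilon$: compute the primal and dual Newton steps $\Delta x_{\text{nt}}$ and $\Delta \nu_{\text{nt}}$; backtrack on the residual norm, starting from $t := 1$ and setting $t := \beta t$ while $\|r(x + t\Delta x_{\text{nt}}, \nu + t\Delta \nu_{\text{nt}})\|_2 > (1 - \alpha t)\|r(x,\nu)\|_2$; then update $x := x + t\Delta x_{\text{nt}}$ and $\nu := \nu + t\Delta \nu_{\text{nt}}$. Here $r(x,\nu)$ denotes the primal-dual residual.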

My concern. I know that when the statement says "given $x \in \text{dom } f$", for example, it means that we can pick any vector $x$ to start the method, provided it belongs to the domain of the function we want to minimize. The same logic applies to the choice of $\epsilon > 0$ and of the constants $\alpha, \beta$.

On the other hand, I am not so sure this is true for $v$. I understand this $v$ is the Lagrange multiplier of the underlying optimization problem. Consider a generic problem of the form
$$ \begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & Ax = b. \end{array} $$
Then the Lagrangian is given by
$$ L(x,v) = f(x) + v^\top(Ax-b). $$
Differentiating with respect to $x$ and setting the gradient equal to zero, it follows that
$$ \nabla f(x) + A^\top v = 0. $$
So, shouldn't the element $v$ satisfy this condition? The way the algorithm is stated makes me think that I can pick an arbitrary $v$.
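For instance, in the quadratic case $f(x) = \tfrac{1}{2}x^\top P x + q^\top x$ (notation of my own choosing), this condition together with the constraint reads
$$ \begin{pmatrix} P & A^\top \\ A & 0 \end{pmatrix} \begin{pmatrix} x \\ v \end{pmatrix} = \begin{pmatrix} -q \\ b \end{pmatrix}, $$
so $v$ seems to be pinned down jointly with $x$ rather than freely chosen.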

Furthermore, is this $v$ just a number or a vector?

Thanks for any help in advance.

Best Answer

As Elias mentioned in a comment, $\nu$ is indeed a vector in $\mathbb{R}^p$, where $x\in\mathbb{R}^n$ and $A\in\mathbb{R}^{p\times n}$. To answer your primary question: no, the starting point $\nu_0$ does not have to satisfy the optimality condition. The reason is that Newton's method is used to iteratively update both $x$ and $\nu$ to minimize the combined primal and dual residual. That is, $x$ and $\nu$ are updated to successively decrease the norm of the following function:

$$ r(x,\nu) = \begin{pmatrix}\nabla f(x) + A^\top\nu\\ Ax-b\end{pmatrix}. $$

If the Newton iteration converges, the final values of $x$ and $\nu$ satisfy the primal and dual optimality conditions up to its tolerance. However, the initial guesses $x_0$ and $\nu_0$ cannot be completely arbitrary for a general $f$: convergence is only guaranteed locally, so one must typically start close enough to the true solution, where "close enough" is determined by the Lipschitz constant of the Hessian of $f$.
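For concreteness, here is a minimal NumPy sketch of this primal-dual iteration. The helper names (`f_grad`, `f_hess`) and the simplified stopping rule (a single check on $\|r\|_2$ rather than the book's two-part test) are my own choices, so read it as an illustration of the idea rather than a reference implementation.

```python
import numpy as np

def infeasible_start_newton(f_grad, f_hess, A, b, x, nu,
                            eps=1e-8, alpha=0.25, beta=0.5, max_iter=50):
    """Minimize f(x) subject to A x = b, starting from a possibly
    infeasible x and an arbitrary dual vector nu."""
    p, n = A.shape

    def residual(x, nu):
        # Stacked dual and primal residuals r(x, nu).
        return np.concatenate([f_grad(x) + A.T @ nu, A @ x - b])

    for _ in range(max_iter):
        r = residual(x, nu)
        if np.linalg.norm(r) <= eps:
            break
        # Newton step: linearize r around (x, nu) and solve the KKT system.
        KKT = np.block([[f_hess(x), A.T],
                        [A, np.zeros((p, p))]])
        step = np.linalg.solve(KKT, -r)
        dx, dnu = step[:n], step[n:]
        # Backtracking line search on the residual norm.
        t = 1.0
        while (np.linalg.norm(residual(x + t * dx, nu + t * dnu))
               > (1 - alpha * t) * np.linalg.norm(r)):
            t *= beta
        x, nu = x + t * dx, nu + t * dnu
    return x, nu
```

For example, for $f(x) = \tfrac{1}{2}x^\top x$ one can pass `f_grad=lambda x: x` and `f_hess=lambda x: np.eye(len(x))` with any starting $x_0$ and $\nu_0 = 0$; since $r$ is then affine in $(x, \nu)$, a single full Newton step solves the KKT system exactly.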
