Newton’s Method convergence

Tags: convergence-divergence, newton-raphson, numerical-optimization, optimization

Let's say I have the function

$f(x) = x^\top Ax + b^\top x + c$

Using Newton's method to find a minimum, I get $$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = x_0 - \frac{x_0^\top Ax_0 + b^\top x_0 + c}{f'(x_0)}.$$

How would I prove that this converges in one iteration, regardless of the initial point $x_0$? Obviously the exact solution is given by the quadratic formula, but I can't see why, after just one iteration, the method has already converged.

Best Answer

Note that the version you wrote is Newton's method for finding the zero of a function $f$. To find the minimum of a function, we look for zeros of $\nabla f$ (the gradient of $f$). If $A$ is symmetric and positive definite, then your function $f$ is a strictly convex function and therefore a zero of $\nabla f$ will be the unique global minimizer of $f$. Therefore, if we use Newton's method to find a zero of $\nabla f$ and the scheme converges, then we have successfully minimized $f$. Now, writing Newton's method to find a zero of $\nabla f$, we obtain \begin{equation*} x_{k+1} = x_k - \nabla^2 f(x_k)^{-1} \nabla f(x_k), \end{equation*} where $\nabla^2 f$ is the Hessian of $f$. Of course, this scheme relies on the assumption that $f$ is twice differentiable and that the Hessian is nonsingular at all points $x_k$ in our sequence of iterates. For $f(x) = x^\top Ax + b^\top x + c$ with $A\succ 0$, both of these assumptions are satisfied.
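For concreteness, here is a minimal Python/NumPy sketch of this Newton update for the quadratic $f$ (the helper name `newton_step` is my own choice; it solves the linear system $\nabla^2 f(x_k)\,d = \nabla f(x_k)$ with `np.linalg.solve` instead of forming the inverse explicitly, which is the standard numerically preferable route):

```python
import numpy as np

def newton_step(x, A, b):
    """One Newton step for f(x) = x^T A x + b^T x + c, with A symmetric.

    Uses grad f(x) = 2 A x + b and Hessian(f) = 2 A, and solves the
    Newton system rather than inverting the Hessian.
    """
    grad = 2 * A @ x + b
    hess = 2 * A
    return x - np.linalg.solve(hess, grad)
```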

Now let's prove the one-step convergence. Note that, for symmetric $A$, it holds that $\nabla f(x) = 2Ax + b$ and $\nabla^2 f(x) = 2A$. Let $x_0\in\mathbb{R}^n$. Then our first iterate becomes \begin{align*} x_1 ={}& x_0 - (2A)^{-1}(2Ax_0 + b) \\ ={}& x_0 - \frac{1}{2}A^{-1}(2Ax_0+b) \\ ={}& x_0 - x_0 - \frac{1}{2}A^{-1}b \\ ={}& -\frac{1}{2}A^{-1}b. \end{align*} We can verify that this indeed is the global minimizer by analytically minimizing $f$: \begin{equation*} \nabla f(x^*) = 2Ax^* + b = 0 \end{equation*} implies that the unique global minimizer is \begin{equation*} x^* = -\frac{1}{2}A^{-1}b = x_1. \end{equation*} Hence, the scheme converges to the global minimizer in one step.
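To see this one-step convergence numerically, here is a quick sanity check with random data, using the `newton_step` sketch above (the construction $A = MM^\top + nI$ guarantees that $A$ is symmetric positive definite):

```python
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)           # symmetric positive definite
b = rng.standard_normal(n)

x0 = rng.standard_normal(n)           # arbitrary starting point
x1 = newton_step(x0, A, b)            # one Newton step
x_star = np.linalg.solve(2 * A, -b)   # analytic minimizer -(1/2) A^{-1} b

print(np.allclose(x1, x_star))        # True: converged in one step
```

Starting from any $x_0$, the single step `x1` matches the analytic minimizer $-\frac{1}{2}A^{-1}b$ to machine precision, exactly as the derivation predicts.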