[Math] Easy conditions that guarantee the convergence of Newton’s method

calculus, newton-raphson, numerical-methods

I know that Newton's method has been discussed often on this forum, but I am still looking for an easy sufficient condition for its convergence that does not involve phrases like "initial guess close enough".

The idea of Newton's method is to use the iteration

$$x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}$$ to find a root of the function $f(x).$
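For reference, the iteration can be written as a short routine (a minimal sketch; the function name, tolerance, and iteration cap are my own choices, not part of the question):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Newton's method: iterate x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        dfx = fprime(x)
        if dfx == 0:
            # one of the failure modes discussed below: f'(x_n) = 0
            raise ZeroDivisionError("f'(x) vanished at x = %g" % x)
        x = x - fx / dfx
    return x

# Example: root of f(x) = x**2 - 2 (i.e. sqrt(2)), starting at x0 = 1
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, 1.0)
```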

What are sufficient conditions that guarantee the convergence of this method?

Wikipedia shows several issues that can occur.

I found some conditions here on the forum and I noticed Kantorovich's theorem. But it is not easy to understand what this means in practice.

I only want conditions for the one-dimensional case, and the convergence need not be quadratic. It also does not matter to which root the method converges (I only want to be sure that it converges at all).
It can also be assumed that $f(x)$ is defined on an interval $[a,b]$ and that this interval contains at least one root of $f(x).$

Some special cases :

  • Is it enough that $f'(x)\ne 0$ for all $x\in [a,b]$, and that either $f''(x)>0$ for all $x\in [a,b]$ or $f''(x)<0$ for all $x\in [a,b]$, to guarantee convergence for every starting point in $[a,b]$?

  • Is it enough that either $f'(x)>0$ for all $x\in [a,b]$ or $f'(x)<0$ for all $x\in [a,b]$, and that $f'(x)$ is bounded on $[a,b]$, to guarantee convergence for every starting point in $[a,b]$?

If not, how can I modify these conditions so that they become sufficient?

Best Answer

Partial answer: neither condition is sufficient. Let $f:\mathbb R \to \mathbb R$ be twice differentiable. Let $a\in (0,\pi /6)$ and $b=\pi /4.$ Let $f(x)=-\sqrt 3 /2+\cos x$ for $x\in [a,b].$ If $a$ is close enough to $0$ and the starting point $x_1$ is close enough to $a$, then the slope $f'(x_1)=-\sin x_1$ will be so close to $0$ that $x_2=x_1-f(x_1)/f'(x_1)>b.$ The behavior of $f$ on $(b,\infty)$ is unspecified (except that $f$ is twice differentiable on $\mathbb R$). For example, we could have $f'(x_2)=0\ne f(x_2),$ in which case the next iterate $x_3$ does not exist. If some iterate $x_n$ can fail to lie in $[a,b],$ then conditions on $f(x),f'(x),$ and $f''(x)$ for $x\in [a,b]$ alone will not suffice.
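The escape from $[a,b]$ is easy to check numerically (a sketch; the particular values $a=0.05$ and $x_1=a$ are my own sample choices):

```python
import math

# Counterexample: f(x) = -sqrt(3)/2 + cos(x) on [a, b], a in (0, pi/6), b = pi/4
f = lambda x: -math.sqrt(3) / 2 + math.cos(x)
fp = lambda x: -math.sin(x)       # f'(x), small and negative near 0

a, b = 0.05, math.pi / 4          # a close to 0
x1 = a                            # starting point close to a
x2 = x1 - f(x1) / fp(x1)          # first Newton iterate
print(x2 > b)                     # True: x2 already lies outside [a, b]
```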

For differentiable $f$:

One well-known sufficient set of conditions is (i) $f(a)<0<f(b);$ (ii) $f'(a)>0$ and $f'$ is increasing (not necessarily strictly) on $[a,b];$ (iii) $a-f(a)/f'(a)\leq b.$ Then there is a unique $x_0\in (a,b)$ such that $f(x_0)=0.$ Write $x^*=x-f(x)/f'(x)$ for the Newton step. If $x\in [a,x_0)$ then $x^*\in [x_0,b].$ If $x\in (x_0,b]$ then $x^*\in [x_0,x).$
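As an illustration (my own example, not from the answer), $f(x)=x^2-2$ on $[a,b]=[1,2]$ satisfies (i)-(iii), and Newton's method converges from any starting point in the interval:

```python
import math

# f(x) = x**2 - 2 on [a, b] = [1, 2] satisfies the three conditions:
#   (i)   f(1) = -1 < 0 < 2 = f(2)
#   (ii)  f'(1) = 2 > 0 and f'(x) = 2x is increasing on [1, 2]
#   (iii) a - f(a)/f'(a) = 1 - (-1)/2 = 1.5 <= 2 = b
f = lambda x: x**2 - 2
fp = lambda x: 2 * x

x = 1.0                       # any starting point in [1, 2] works
for _ in range(50):
    x = x - f(x) / fp(x)
print(abs(x - math.sqrt(2)))  # tiny: converges to the unique root sqrt(2)
```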

Another well-known sufficient set of conditions is (i) $f(a)<0<f(b);$ (ii) $f$ is monotonic on $[a,b];$ (iii) $f$ is concave on $[a,x_0)$ and convex on $(x'_0,b],$ where $x_0=\min \{x\in [a,b]: f(x)=0\}$ and $x'_0=\max \{x\in [a,b]:f(x)=0\}.$ If $x\in [a,x_0)$ then $x^*\in (x,x_0].$ If $x\in (x'_0,b]$ then $x^*\in [x'_0,x).$ (With $x^*$ as in the previous paragraph.)
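A simple instance of this second set of conditions (again my own example) is $f(x)=x^3$ on $[-1,1]$: it is monotonic with $f(-1)<0<f(1),$ concave to the left of the root $x_0=x'_0=0$ and convex to the right, and the Newton step contracts by a factor $2/3$:

```python
# f(x) = x**3 on [a, b] = [-1, 1]: f(-1) = -1 < 0 < 1 = f(1), f is
# monotonic, concave on [-1, 0) and convex on (0, 1], with x_0 = x'_0 = 0.
# The Newton step is x* = x - x**3 / (3*x**2) = 2*x / 3.
x = -1.0
for _ in range(60):
    x = x - x**3 / (3 * x**2)
print(abs(x))   # the iterates shrink geometrically toward the root 0
```

Note that convergence here is only linear (ratio $2/3$) because the root is a triple root; the question explicitly allows non-quadratic convergence.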

Proof for the first set of conditions:

  1. $f'>0$ on $[a,b]$ so $f$ is strictly increasing and continuous on $[a,b]$ with $f(a)<0<f(b).$ This implies the existence and uniqueness of $x_0.$
  2. If $x\in (a,x_0)$ then $f(a)<f(x)<0$ and $f'(x)\geq f'(a)>0.$ So the line through the point $(x,f(x))$ with slope $f'(x)$ intersects the $x$-axis at $(x^*,0),$ while the line through $(a,f(a))$ with slope $f'(a)$ intersects the $x$-axis at $(a^*,0),$ where $x<x^*\leq a^*\leq b.$ (Here $x^*\leq a^*$ because the tangent at $x$ has the larger slope and lies above the tangent at $a$ for $u\geq x,$ so its root comes no later.)
  3. To show that $x\in [a,x_0)\implies f(x^*)\geq 0$: Suppose instead that $f(x^*)<0.$ By the mean value theorem, $(f(x^*)-f(x))/(x^*-x)=f'(y)$ for some $y\in (x,x^*).$ Since $x<x^*$ and $f$ is increasing, $f(x)<f(x^*),$ and we have $$f(x)<f(x^*)<0\implies 0<\frac {f(x^*)-f(x)}{-f(x)}<1 . $$ But then, since $x^*-x=-f(x)/f'(x),$ $$f'(y)= \frac {f(x^*)-f(x)}{x^*-x}=\frac {f(x^*)-f(x)}{-f(x)/f'(x)}= \frac {f(x^*)-f(x)}{-f(x)} \cdot f'(x)<f'(x),$$ which, together with $y>x,$ contradicts the fact that $f'$ is increasing on $[a,b].$ So $f(x^*)\geq 0,$ i.e. $x^*\geq x_0.$
  4. To show that $x\in (x_0,b]\implies x^*\in [x_0,x)$: The slope $S$ of the line joining $(x_0,f(x_0))=(x_0,0)$ to $(x,f(x))$ equals $f'(y)$ for some $y\in (x_0,x),$ so $0<S\leq f'(x)$ because $f'$ is increasing. Hence the line through $(x,f(x))$ with slope $f'(x)$ intersects the $x$-axis at $(x^*,0),$ with $x_0\leq x^*<x.$
  5. Finally, if $x_1\in [a,b]$ and $x_{n+1}=x_n^*,$ then $x_0\leq x_{n+1}\leq x_n\leq b$ for $n\geq 2.$ So $(x_n)_{n\in \mathbb N}$ converges to some limit $y\in [a,b].$ Since $f'$ is increasing, $0<f'(x)\leq f'(b)$ for all $x\in [a,b],$ so $$0=\lim_{n\to \infty} |x_{n+1}-x_n|= \lim_{n\to \infty}|f(x_n)/f'(x_n)|\geq \lim_{n\to \infty}|f(x_n)|/f'(b)=|f(y)|/f'(b).$$ So $f(y)=0.$
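The behavior proved in steps 2-5 can be observed numerically (again with my sample $f(x)=x^2-2$ on $[1,2]$): after one step the iterate lands at or above the root, and from then on the sequence decreases monotonically toward it:

```python
f = lambda x: x**2 - 2
fp = lambda x: 2 * x

xs = [1.0]                              # x_1 = a, to the left of the root
for _ in range(5):
    xs.append(xs[-1] - f(xs[-1]) / fp(xs[-1]))

root = 2 ** 0.5
print(xs[1] >= root)                                  # x_2 jumps past the root
print(all(xs[n + 1] <= xs[n] for n in range(1, 5)))   # then decreases monotonically
```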