[Math] What does the derivative have to do with slow convergence in Newton’s Method

calculus, convergence-divergence, newton-raphson

I am looking for an intuition for the following behavior:

Let $f(x)=x^2$

Apparently the Newton's Method iterations to find the root (at $x=0$) converge relatively slowly in this case: $x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}$. I was told this happens because the derivative at the root is also zero: $f'(0)=0$.

However, I don't understand why the derivative would matter:

$$x_{n+1}=x_n-\frac{x_n^2}{2x_n}=\frac{x_n}{2}$$ I would say that because the root is $0$, $x_n$ becomes smaller and smaller, and therefore the steps towards the 'right' answer become smaller and smaller as well.
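The halving can be checked numerically; here is a short sketch (starting from $x_0=2$, the same starting point as in the plots below) iterating the general Newton step, which for $f(x)=x^2$ simplifies to $x/2$:

```python
# Newton iteration for f(x) = x^2: the step x - f(x)/f'(x) reduces to x / 2.
x = 2.0
for n in range(6):
    x = x - x**2 / (2 * x)  # general Newton step; simplifies to x / 2 here
    print(n + 1, x)
# x shrinks geometrically: 1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125
```

Each step gains only one binary digit of accuracy (linear convergence with rate $\frac{1}{2}$), instead of the digit-doubling quadratic convergence Newton's method shows at a simple root.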

a) Is this reasoning correct?

b) What does all this have to do with the derivative?


I created some graphics:

First $f(x)=x^2$ with $x_0=2$

And $f(x)=x^2-1$ with $x_0=3$


Indeed, with the double root the convergence is slower. Still, I do not really have an intuition for this behavior. I can see that the calculations confirm the result, but why is it also intuitively true?

Best Answer

Newton's method uses a first-degree (linear) approximation of a function to find its roots.

Let's look at the Taylor series:

$f(x) = f(a) + f'(a)(x-a) + \frac{1}{2} f''(a)(x-a)^2 + O((x-a)^3)$

If $f'(a)$ is large relative to the higher derivatives of $f$ near $a$, then the linear approximation $f(x) = f(a) + f'(a)(x-a) + \epsilon$ is pretty good. But if $f'(a)$ is small, it isn't.

Newton's method is just the first degree Taylor approximation flipped on its head.

Setting the linear approximation to zero and dropping the error term $\epsilon$:
$$f(a) + f'(a)(x-a) = 0\\ f'(a)\,x = -f(a) + a\,f'(a)\\ x = a - \frac{f(a)}{f'(a)}$$

Perhaps a geometric argument may be more intuitive.

If you start at $(x_0, f(x_0))$ as your initial guess at the root, Newton's method takes you to $x_1$, which is not particularly close to the actual root.
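Worked out for $f(x)=x^2$: the tangent line at $x_n$ is

$$y = x_n^2 + 2x_n(x - x_n),$$

and setting $y=0$ gives

$$x = x_n - \frac{x_n^2}{2x_n} = \frac{x_n}{2}.$$

As $x_n \to 0$ the slope $2x_n$ also goes to $0$, so the tangent becomes nearly flat and its intersection with the axis lands only halfway to the true root. That is the connection to the derivative: a vanishing $f'$ at the root means a flat tangent, and a flat tangent means short steps.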
