[Math] Reason to prefer Newton's method for computing roots

numerical-methods, roots, soft-question

Newton's method for root computation is adored for its "dramatic" convergence speed, which is quadratic (the number of exact decimals doubles at every iteration). It is preferred over the secant method (and variants), which achieve only convergence of order $\varphi \approx 1.618$ (the golden ratio).

But

Newton's method requires one evaluation of the function plus one evaluation of its derivative per iteration. In many cases the derivative is as costly to evaluate as the function itself (often more so), so there are really two evaluations per iteration, and this makes the effective order of convergence, measured per function evaluation, only $\sqrt{2}$.
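To make the comparison concrete (my own added derivation, not part of the original question): the standard accounting here is Ostrowski's efficiency index $p^{1/k}$ for a method of order $p$ that needs $k$ evaluations per iteration, which gives

$$\underbrace{2^{1/2} \approx 1.414}_{\text{Newton: } p = 2,\; k = 2} \;<\; \underbrace{\varphi^{1/1} \approx 1.618}_{\text{secant: } p = \varphi,\; k = 1},$$

so per function evaluation the secant method actually converges faster.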

In addition, Newton gives no guarantee of converging to a root close to the initial guess, whereas the regula falsi variant is guaranteed to converge within the interval where a sign change was observed.

So my question: why is Newton preferred at all?

Best Answer

I've written a couple of root finders which employ Newton's method.

My experience is that if you have no clue where a root is, Newton's method will turn on you. Much of my time writing these methods is spent scouring the literature for asymptotics which bracket the roots. Even with the asymptotics, it's not enough to just blindly apply Newton's method to (say) the midpoint of the bracket; first a few iterations of bisection are required to pin the root down to an accuracy of (say) 1 part in 100.
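As an illustration of that hybrid strategy, here is a minimal Python sketch, assuming a continuous $f$ with a sign change on $[a, b]$ and a derivative `df`; the names, tolerances, and structure are my own, not the answerer's code:

```python
import math

def hybrid_root(f, df, a, b, coarse=1e-2, tol=1e-15, max_newton=50):
    """Bisect [a, b] (which must bracket a sign change) until the
    bracket is narrow, then polish the root with Newton's method."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0.0:
        raise ValueError("f(a) and f(b) must have opposite signs")

    # Phase 1: bisection until the root is located to ~1 part in 100.
    while b - a > coarse * max(abs(a), abs(b), 1.0):
        m = 0.5 * (a + b)
        fm = f(m)
        if fm == 0.0:
            return m
        if fa * fm < 0.0:
            b = m
        else:
            a, fa = m, fm

    # Phase 2: Newton's iteration from the midpoint of the bracket.
    x = 0.5 * (a + b)
    for _ in range(max_newton):
        dfx = df(x)
        if dfx == 0.0:
            break  # give up polishing; x is already inside the coarse bracket
        step = f(x) / dfx
        x -= step
        if abs(step) <= tol * max(abs(x), 1.0):
            break
    return x

# Example: the root of cos(x) - x on [0, 1] (~0.7390851332151607).
root = hybrid_root(lambda x: math.cos(x) - x,
                   lambda x: -math.sin(x) - 1.0, 0.0, 1.0)
```

The point of phase 1 is only to land inside Newton's basin of attraction; once there, each Newton step roughly doubles the number of correct digits.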

As to the second concern about the cost of evaluating $f$ and $f'$: it is not generally the case that you must evaluate the function and its derivative independently. Generally you want a routine that evaluates both at the same time. This is particularly important for the evaluation of (say) a power series, where the coefficients must be transferred from cache (or worse, RAM) to registers. For a power series, it's easy to write a routine that evaluates both $f$ and $f'$ at once with very little overhead relative to evaluating $f$ alone.
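Here is a minimal sketch of that idea for a polynomial in coefficient form (my own illustration, not the answer's code): one Horner pass carries $p'$ along with $p$, so each coefficient is touched exactly once.

```python
def poly_and_derivative(coeffs, x):
    """Evaluate p(x) and p'(x) in one pass over the coefficients
    (Horner's scheme); `coeffs` is ordered from highest degree down."""
    p = 0.0   # running value of p(x)
    dp = 0.0  # running value of p'(x)
    for c in coeffs:
        dp = dp * x + p  # derivative of the Horner recurrence
        p = p * x + c
    return p, dp

# Example: p(x) = x^3 - 2x + 1 at x = 2.0 gives (5.0, 10.0).
p, dp = poly_and_derivative([1.0, 0.0, -2.0, 1.0], 2.0)
```

The marginal cost of $p'$ is one extra multiply-add per coefficient, while the memory traffic (the expensive part, as the answer notes) is unchanged.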

Finally, there is a very good reason to switch from bisection to Newton's method even if you aren't interested in speed. Bisection must evaluate the function very near the root to be accurate, but the condition number of function evaluation is unbounded at the root, leading to large error there. Newton's method suffers from this problem as well, but in my observation it exploits the differentiable structure of the function and recovers every digit correctly, up to the precision of the type. (I cannot prove this statement, but I have unit tests which show its validity in a few cases.)
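To spell out the conditioning remark (a standard fact I am adding for context, not part of the original answer): the relative condition number of evaluating $f$ at $x$ is

$$\kappa_f(x) = \left|\frac{x\, f'(x)}{f(x)}\right|,$$

which blows up as $x$ approaches a simple root, since $f(x) \to 0$ while $x f'(x)$ stays bounded away from zero. Near the root even the computed sign of $f$ becomes unreliable, and the sign is all bisection has to work with.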
