[Math] How to determine which representation of a function to use for Newton’s method

linear algebranumerical methods

Take the equation:

$$x^2 + 2x = \frac{1}{1+x^2}$$

I subtracted the right term over to form $~f_1(x)~$:

$$x^2 + 2x – \frac{1}{1+x^2} = 0$$

I wanted to take the derivative, so I rearranged things to make it a bit easier, call it $~f_2(x)~$:

$$x^4 + 2x^3 + x^2 + 2x – 1 = 0$$

I noticed when I graphed $~f_1(x)~$ and $~f_2(x)~$ that their plots were different $($although they shared the same solution for $~x)~$.

Newton's method iterates down the graph line, so I'd imagine that Newton's method for these two equations are not equivalent. They'd find the same solution, but they would get there different ways. In that case, is there a way to decide which equation to use for Newton's to obtain the best/quickest result?

Best Answer

For your curiosity.

Asking this far-from-innocent question, you are almost asking me what I have been doing during the last sixty years.

My answer is : what is the transform of $f(x)$ which makes it the most linear around the solution ? If you find it, you will save a lot of iterations.

One case I enjoy for demonstration is $$f(x)=\sum_{i=1}^n a_i^x- b^x$$ where $1<a_1<a_2< \cdots < a_n$ and $b > a_n$; on this site, you could find many problems of this kind.

As written, this function is very bad since it varies very fast and it is very nonlinear. Moreover, it goes through a maximum (not easy to identify) and you must start on the right of it to converge.

Now, consider the transform $$g(x)=\log\left(\sum_{i=1}^n a_i^x \right)- x\log(b)$$ It is almost a straight line ! This means that you can start from almost anywhere and converge fast.

For illustration purposes, trying with $n=6$, $a_i=p_i$ and $b=p_{n+1}$ ($p_i$ being the $i^{th}$ prime number). Being very lazy and using $x_0=0$, the iterates would be $$\left( \begin{array}{cc} n & x_n \\ 0 & 0 \\ 1 & 1.607120621 \\ 2 & 2.430003204 \\ 3 & 2.545372693 \\ 4 & 2.546847896 \\ 5 & 2.546848123 \end{array} \right)$$ Using $f(x)$, you must start iterating with $x_0 > 2.14$ to have convergence (big work to know it !). Let us try with $x_0=2.2$ to get as successive iterates

$$\left( \begin{array}{cc} n & x_n \\ 0 & 2.200000000 \\ 1 & 4.561765400 \\ 2 & 4.241750505 \\ 3 & 3.929819520 \\ 4 & 3.629031502 \\ 5 & 3.344096332 \\ 6 & 3.082467015 \\ 7 & 2.856023930 \\ 8 & 2.682559774 \\ 9 & 2.581375720 \\ 10 & 2.549595979 \\ 11 & 2.546866878 \\ 12 & 2.546848124 \\ 13 & 2.546848123 \end{array} \right)$$

The last point I would like to mention : even if you have a good guess $x_0$ of the solution, search for a transform $g(x)$ such that $g(x_0)\,g''(x_0) >0$ (this is Darboux theorem) in order to avoid any overshoot of the solution during the path to it.

Related Question