[Math] Square roots by Newton’s method

Tags: approximation, floating point, newton raphson, numerical methods

The following Python program implements Newton’s method for computing the square root of a number:

def sqrt(x):
    def sqrt_iter(guess):
        # Keep refining the guess until it is good enough.
        return guess if good_enough(guess) else sqrt_iter(improve(guess))
    def good_enough(guess):
        # Absolute tolerance on the residual guess**2 - x.
        tolerance = 1e-3
        return abs(guess**2 - x) < tolerance
    def improve(guess):
        # One Newton step: average the guess with x/guess.
        return (guess + x/guess)/2
    initial_guess = 1.0
    return sqrt_iter(initial_guess)


print(sqrt(0))
print(sqrt(1e-12))
print(sqrt(1e-10))
print(sqrt(1e-8))
print(sqrt(1e-6))
print(sqrt(1e-4))
print(sqrt(1e-2))
print(sqrt(1e0))
print(sqrt(1e2))
print(sqrt(1e4))
print(sqrt(1e6))
print(sqrt(1e8))
print(sqrt(1e10))
print(sqrt(1e12))
print(sqrt(1e13))

Output:

0.03125
0.031250000010656254
0.03125000106562499
0.03125010656242753
0.031260655525445276
0.03230844833048122
0.10032578510960605
1.0
10.000000000139897
100.00000025490743
1000.0000000000118
10000.0
100000.0
1000000.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in sqrt
  File "<stdin>", line 3, in sqrt_iter
  File "<stdin>", line 3, in sqrt_iter
  File "<stdin>", line 3, in sqrt_iter
  [Previous line repeated 993 more times]
  File "<stdin>", line 6, in good_enough
RecursionError: maximum recursion depth exceeded while calling a Python object

As we can see, this naive program does not perform well:

  • for small numbers (from $x = 10^{-4}$ and below), the tolerance $10^{-3}$ is too large, so the result is inaccurate;
  • for large numbers (from $x = 10^{13}$ and above), the tolerance $10^{-3}$ is too small, so the program enters an infinite recursion.

Both problems can be solved by redefining the good_enough procedure like this:

def good_enough(guess):
    return guess == 0 or improve(guess) == guess
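
For clarity, here is a sketch of how that redefinition slots into the original program; the reference to improve inside good_enough resolves through the enclosing scope of sqrt:

def sqrt(x):
    def sqrt_iter(guess):
        return guess if good_enough(guess) else sqrt_iter(improve(guess))
    def good_enough(guess):
        # Stop at 0, or when a Newton step can no longer change the guess.
        return guess == 0 or improve(guess) == guess
    def improve(guess):
        return (guess + x/guess)/2
    initial_guess = 1.0
    return sqrt_iter(initial_guess)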

But discussing the various solutions is not the point of my post. Instead, I would like to reliably predict, for a given $x$, whether the above naive program will return.

I have not read the IEEE Standard for Floating-Point Arithmetic (IEEE 754), but to my understanding, floating-point numbers are not uniformly distributed on the real line. Their spacing is small for small numbers and large for large numbers (this Wikipedia figure seems to confirm this). In other words, small floating-point numbers are dense and large floating-point numbers are sparse. The consequence is that the naive program will enter an infinite recursion if guess has not yet reached the tolerance range and the improve procedure can no longer improve the guess (meaning that a fixed point of the improve procedure has been reached), because the new guess would differ from the old guess by less than the spacing at the old guess.
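
One way to see this stall concretely is to iterate improve by hand until it revisits a value (a fixed point, or possibly a short cycle between adjacent floats) and then evaluate the good_enough test there; a minimal sketch, using the failing input $x = 10^{13}$:

x = 1e13
tolerance = 1e-3

seen = set()
guess = 1.0
while guess not in seen:              # stop once the iteration revisits a value
    seen.add(guess)
    guess = (guess + x/guess)/2       # one step of improve

print(guess)                          # the guess the iteration keeps revisiting
print(abs(guess**2 - x) < tolerance)  # good_enough is still False, hence the RecursionError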

So to guarantee that the naive program will return for a given $x$, it seems to me intuitively that this predicate must hold:

tolerance > spacing($\sqrt{x}$).

If we choose a tolerance of $10^{-3}$ like in the naive program, that means that the spacing of $\sqrt{x}$ should be less than $10^{-3}$. Consequently, according to the Wikipedia figure above for binary64 floating-point numbers, $\sqrt{x}$ should be less than $10^{13}$ and therefore $x$ should be less than $10^{26}$.
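
The predicate can also be checked directly: math.ulp (available in Python 3.9 and later) returns the spacing of the floating-point grid at a given value, so a quick test of the condition tolerance > spacing($\sqrt{x}$) looks like this:

import math

tolerance = 1e-3

for x in (1e12, 1e13, 1e26):
    spacing = math.ulp(math.sqrt(x))   # spacing of binary64 floats at sqrt(x)
    print(x, spacing, tolerance > spacing)

For $x = 10^{13}$ the spacing at $\sqrt{x} \approx 3.2 \times 10^{6}$ is roughly $5 \times 10^{-10}$, far below $10^{-3}$, so the predicate is satisfied comfortably even though the naive program already fails there.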

Here are my questions:

  1. Is my predicate correct?
  2. Why does the program enter an infinite recursion from $x = 10^{13}$ whereas my predicate guarantees that it cannot happen below $x = 10^{26}$?

Note. — I am using the CPython interpreter on a 64-bit MacBook Pro so the IEEE 754 binary64 format.

Best Answer

Your program stalls because of the limitations of floating-point arithmetic. In general, when computing approximations $x$ of $\sqrt{\alpha}$, the computed value $\hat{y}$ of the residual $y = x^2 - \alpha$ satisfies $$|y - \hat{y}| \leq \gamma_2 (x^2 + \alpha),$$ where $\gamma_k = \frac{ku}{1 - ku}$ and $u$ is the unit roundoff.* When we get very close to the target, i.e. when $x \approx \sqrt{\alpha}$ and $y \approx 0$, the computed value of the residual will satisfy $$|\hat{y}| \lesssim 4u\alpha.$$ For $\alpha = 10^{13}$ and IEEE double precision floating-point arithmetic, the right-hand side is about $4 \times 10^{-3}$, which is larger than the (absolute) tolerance which we are currently using.

We can resolve this issue by terminating the iteration when $|\hat{y}| \leq \tau \alpha$, where $\tau$ is some user-defined tolerance level. We should be able to do $\tau \approx 10^{-15}$ on machines which have IEEE double precision floating-point arithmetic. It is worth recognizing that we obtain an accurate bound for the relative error because $$\frac{x - \sqrt{\alpha}}{\sqrt{\alpha}} = \frac{x^2 - \alpha}{\sqrt{\alpha} (x + \sqrt{\alpha})} \approx \frac{1}{2} \frac{x^2 - \alpha}{\alpha}$$ is a good approximation when $x$ is close to $\sqrt{\alpha}$.
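
A minimal sketch of this relative stopping criterion, written as a loop rather than a recursion to stay clear of Python's recursion limit; the function name sqrt_rel, the default $\tau = 10^{-15}$, and the iteration cap are choices made here for illustration, assuming a nonnegative input and binary64 arithmetic:

import math

def sqrt_rel(a, tau=1e-15, max_iter=200):
    # Newton's method stopped on the relative residual |x*x - a| <= tau*a.
    if a == 0:
        return 0.0
    x = 1.0
    for _ in range(max_iter):          # iteration cap as a safety net
        if abs(x*x - a) <= tau*a:
            return x
        x = (x + a/x)/2
    return x                           # best effort if the cap is hit

print(sqrt_rel(1e13), math.sqrt(1e13))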

Edit. — Let $x$ and $\alpha$ denote nonnegative floating-point numbers. In the absence of floating-point overflow/underflow, the computed value $\hat{y}$ of the residual $y = x^2 - \alpha$ can be written as $$\hat{y} = (x^2(1 + \epsilon) - \alpha) (1 + \delta) = y + x^2 (\epsilon + \delta + \epsilon\delta) - \alpha \delta,$$ where $|\epsilon| \leq u$ and $|\delta| \leq u$. It follows that $$|y - \hat{y}| \leq x^2(2u + u^2) + \alpha u.$$ When we get very close to the target, i.e. when $x \approx \sqrt{\alpha}$ and $y \approx 0$, $$|\hat{y}| \lesssim \alpha(2u + u^2) + \alpha u \approx 3u \alpha.$$ From a purely mathematical point of view, the fact that the right-hand side of this inequality exceeds the threshold does not imply that the left-hand side will also exceed it. Experience, however, suggests that once we no longer have a mathematical guarantee that the machine will behave, it will seize the opportunity to misbehave.

A closer analysis is possible using Sterbenz's lemma on the subtraction of floating-point numbers which are sufficiently close to each other. When $x^2$ is a good approximation of $\alpha$, there is no rounding error in the subtraction when computing the residual. It follows that $$\hat{y} = x^2 (1 + \epsilon) - \alpha = y + x^{2} \epsilon \approx x^2 \epsilon.$$ The very best we can hope for is that $|\epsilon| \leq u$, and while it does happen that $\epsilon = 0$, we have no right to expect that $|\epsilon| \ll u$. In general it is not, so $|\hat{y}| \approx \alpha u$, something which we can easily verify.
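
That last remark invites a quick numerical check; a minimal sketch, assuming binary64 arithmetic and taking $x$ to be the correctly rounded square root near which the iteration stalls:

import math

alpha = 1e13                 # exactly representable in binary64
u = 2.0**-53                 # unit roundoff for binary64
x = math.sqrt(alpha)         # correctly rounded sqrt(alpha)

residual = x*x - alpha       # computed residual \hat{y}
print(residual, alpha*u)     # compare the magnitude of the residual with alpha*u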


* The unit roundoff $u$ is $u = 2^{-53}$ in IEEE double precision arithmetic, and $u = 2^{-24}$ in IEEE single precision arithmetic. Machine epsilon $\text{eps}$ is the distance between $1$ and the next floating-point number $1 + 2u$, i.e. $\text{eps} = 2u$. Some authors will use the terms as if they are identical. They are not. When in doubt, follow the notation of Nicholas J. Higham's textbook Accuracy and Stability of Numerical Algorithms. The $\gamma$-factor which I used earlier is also explained in this textbook. Briefly, it is a convenient tool which allows us to simplify rounding error analysis considerably at the cost of a bit of exactness. It was not strictly necessary here, so I decided not to pursue it in this edit.
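
In Python the distinction can be spelled out directly; sys.float_info.epsilon is machine epsilon for binary64, which is twice the unit roundoff:

import sys

eps = sys.float_info.epsilon     # machine epsilon, 2**-52 for binary64
u = eps / 2                      # unit roundoff, 2**-53
print(eps == 2.0**-52, u == 2.0**-53)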
