I think you can avoid Newton-Raphson altogether, since the cubic is solvable in radicals.
Here is the complete algorithm (working under the constraints outlined in the problem), in Mathematica:
Clear[PositiveCubicRoot];
(* positive real root of x^3 + p x^2 + q x + r == 0 *)
PositiveCubicRoot[p_, q_, r_] :=
 Module[{po3 = p/3, a, b, det, abs, arg},
  (* depress the cubic: x = t - p/3 gives t^3 + 3 a t + 2 b == 0 *)
  b = (po3^3 - po3 q/2 + r/2);
  a = (-po3^2 + q/3);
  det = a^3 + b^2;
  If[det >= 0,
   (* one real root: Cardano's formula *)
   det = Power[Sqrt[det] - b, 1/3];
   -po3 - a/det + det
   ,
   (* three real roots: evaluate real part, imaginary parts cancel anyway *)
   abs = Sqrt[-a^3];
   arg = ArcCos[-b/abs];
   abs = Power[abs, 1/3];
   abs = (abs - a/abs);
   arg = -po3 + abs*Cos[arg/3]
  ]
 ]
Then
In[222]:= PositiveCubicRoot[-2.52111798, -71.424692, -129.51520]
Out[222]= 10.499
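As a quick sanity check (my addition, not part of the original session), plug the returned value back into the cubic; the residual should be negligible:
With[{x = PositiveCubicRoot[-2.52111798, -71.424692, -129.51520]},
 x^3 - 2.52111798 x^2 - 71.424692 x - 129.51520]
(* -> close to 0., as expected for a root of x^3 + p x^2 + q x + r *)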
However, if the Newton-Raphson method must be used, then a good initial guess is imperative, and bisection is a good way to isolate the root in the case at hand.
We start with an arbitrary point $x$ (I chose $x=1$) and double it while the polynomial at that point is negative. Then we bisect a fixed number of times (the code below does it twice) and take a secant-line estimate from the resulting bracket. Ultimately, we polish it off with Newton-Raphson iterations:
In[283]:=
NewtonRaphsonStartingPoint[{p_, q_, r_}] := Module[{x1 = 0, x2 = 1, f1, f2, xm, fm},
 f1 = r + x1 (q + (p + x1) x1);
 (* double x2 until the polynomial turns positive, so [x1, x2] brackets the root *)
 While[(f2 = r + x2 (q + (p + x2) x2)) <= 0,
  x1 = x2; f1 = f2; x2 = 2 x2];
 (* two bisection steps to tighten the bracket *)
 Do[xm = (x1 + x2)/2; fm = r + xm (q + (p + xm) xm);
  If[fm <= 0, f1 = fm; x1 = xm, f2 = fm; x2 = xm], {i, 2}];
 (* secant-line (false position) estimate of the root inside the bracket *)
 (f2 x1 - f1 x2)/(f2 - f1)
 ];
(* Newton map for x^3 + p x^2 + q x + r; FixedPoint iterates until the value stops changing *)
NewtonRaphsonIterate[{p_, q_, r_}, x0_Real] :=
 FixedPoint[
  Function[x, x - (r + x (q + (p + x) x))/(q + (2 p + 3 x) x)], x0]
In[285]:=
NewtonRaphson[p_, q_, r_] :=
NewtonRaphsonIterate[{p, q, r}, NewtonRaphsonStartingPoint[{p, q, r}]]
In[286]:= NewtonRaphson[-2.52111798, -71.424692, -129.51520]
Out[286]= 10.499
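If you want to watch the polish phase (a sketch of mine, not from the original session), NestList exposes the iterates that FixedPoint hides:
With[{p = -2.52111798, q = -71.424692, r = -129.51520},
 NestList[# - (r + # (q + (p + #) #))/(q + (2 p + 3 #) #) &,
  NewtonRaphsonStartingPoint[{p, q, r}], 4]]
(* the iterates settle at 10.499 within three or four steps *)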
Let's look at the iterations you get from each.
With $f(x) = x - e^{-x}$, you get
$$F(x) = x - \frac{f(x)}{f'(x)} = x - \frac{x-e^{-x}}{1+e^{-x}} = \frac{x + xe^{-x} - x + e^{-x}}{1+e^{-x}} = \frac{x+1}{e^x+1}.$$
That's a nice function: it is defined on all of $\mathbb{R}$ and gets you close to the solution very fast when you start at a non-negative $x_0$, but it is not so good for negative $x$, where it approaches the fixed point only slowly at first.
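A quick numerical check (my addition): iterating $F$ with NestList from $x_0 = 0$ shows the rapid approach:
NestList[(# + 1)/(E^# + 1) &, 0., 4]
(* -> {0., 0.5, 0.566311, 0.567143, 0.567143} *)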
With $g(x) = xe^x - 1$, you get
$$G(x) = x - \frac{g(x)}{g'(x)} = x - \frac{xe^x-1}{(x+1)e^x} = \frac{x^2e^x+1}{(x+1)e^x} = \frac{x^2+e^{-x}}{x+1}.$$
That function has a pole at $-1$, and $G(x) < -1$ for $x < -1$, so the iteration never reaches the fixed point from there. For large positive $x$ it doesn't approach the fixed point fast either (basically, it acts like $x \mapsto \frac{x}{x+1}\cdot x$ there).
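The same check for $G$ (again my addition): from $x_0 = 1$ it converges, but starting below the pole it runs away:
NestList[(#^2 + E^-#)/(# + 1) &, 1., 4]
(* -> {1., 0.68394, 0.577455, 0.567229, 0.567143} *)
NestList[(#^2 + E^-#)/(# + 1) &, -2., 2]
(* -> {-2., -11.3891, -8516.56} : escapes towards -infinity *)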
So looking at the global behaviour, your choice behaves much better.
Now, for the local behaviour near the fixed point, the Newton iteration roughly behaves like
$$\alpha + \delta \mapsto \alpha + \frac{f''(\alpha)}{2f'(\alpha)}\delta^2$$
when $f'$ doesn't vanish at the zero of $f$, as is the case here, so let's look at the corresponding quotient for the two candidates.
$$\begin{align}
\frac{g''(\alpha)}{2g'(\alpha)} &= \frac{(\alpha+2)e^\alpha}{2(\alpha+1)e^\alpha} = \frac{\alpha + 2}{2(\alpha+1)} = \frac12 + \frac{1}{2(\alpha+1)}\\
\frac{f''(\alpha)}{2f'(\alpha)} &= \frac{-e^{-\alpha}}{2(1+e^{-\alpha})} = \frac{-\alpha}{2(1+\alpha)} = \frac{1}{2(\alpha+1)} - \frac12
\end{align}$$
using $e^{-\alpha} = \alpha$ for the latter.
So $G$ approaches the fixed point from above, while $F$ approaches it from below ($f''(\alpha) < 0 < f'(\alpha)$), and the factor in front of $\delta^2$ has smaller absolute value for $F$. That means $F$ converges faster near the fixed point $\alpha$ (though against quadratic convergence this matters little).
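Plugging in $\alpha = W(1) \approx 0.567143$ (ProductLog[1] in Mathematica; this numerical check is my addition) makes the comparison concrete:
With[{alpha = N[ProductLog[1]]}, (* alpha == e^-alpha, the fixed point *)
 {1/(2 (alpha + 1)) - 1/2, 1/2 + 1/(2 (alpha + 1))}]
(* -> {-0.180948, 0.819052} : the factor for F is much smaller in magnitude *)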
So, altogether, yours was the better choice globally (converges to the solution from all starting points), and locally (converges faster near the fixed point).
Best Answer
The formulation of the tolerance:
$$ |\text{guess} - c/\text{guess}| > \epsilon \cdot \text{guess} $$
is attractive for two reasons besides the relative precision it imposes as a termination criterion. Bear in mind that the above is the condition to continue iterating. If the tolerance $\epsilon$ is too loose, iteration will terminate prematurely; if it is too tight, iteration may never terminate.
First: If guess is the current (possibly initial) approximation, note that when guess lies above the square root, $c/\text{guess}$ falls below it (in exact arithmetic), and conversely, when guess lies below the square root, $c/\text{guess}$ lies above it. The true square root is therefore between the two, and the difference of these two computed values overestimates the actual error (but only by a constant factor $< 2$ as the iteration converges).
Second: The Newton iteration can be expressed:
$$ \text{guess} \gets (\text{guess} + c/\text{guess})/2 $$
So we haven't wasted a division by computing $c/\text{guess}$ when the outcome of the test is to continue the iteration: that quotient is used directly in forming the updated approximation to the square root.
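Putting the two points together, here is a minimal sketch (the name SqrtNewton and the default $\epsilon$ are mine, not from the original): one division per pass feeds both the continuation test, in its multiplication form, and the update:
SqrtNewton[c_?Positive, eps_ : 1.*^-12] :=
 Module[{guess = N[c], quot},
  While[
   quot = c/guess;                  (* the single division per pass *)
   Abs[guess - quot] > eps guess,   (* relative-error test, multiplication only *)
   guess = (guess + quot)/2         (* Newton update reuses quot *)
  ];
  guess]
SqrtNewton[16664444.]  (* -> 4082.21 *)
SqrtNewton[1.*^-50]    (* -> 1.*^-25 *)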
Now we come to the application of relative precision in deciding to continue or terminate iteration. Any computation of a positive square root is liable to contain a rounding error resulting from normal floating point operations. The absolute size of this error depends (because of the nature of the floating point format) on the size of the value being computed.
If you take the square root of $16664444$, the result is slightly over $4000$, and the rounding error will accordingly be many orders of magnitude greater than that in the square root of $10^{-50}$, which is $10^{-25}$. So if you impose a test on the absolute value of the error, as proposed with:
$$ |\text{guess} - c/\text{guess}| > \epsilon $$
then the same value of $\epsilon$ that does a reasonable job for $16664444$ will be far too loose for $10^{-50}$, and a value that is right for $10^{-50}$ will generally be impossible to meet for arguments as large as $16664444$.
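To put numbers on that (my illustration, assuming IEEE machine doubles), the spacing of representable numbers near $x$ is roughly $|x|$ times the machine epsilon:
{4082.21 $MachineEpsilon, 1.*^-25 $MachineEpsilon}
(* -> {~9.1*10^-13, ~2.2*10^-41} : no single absolute epsilon suits both scales *)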
Therefore testing the relative value of the error as shown at top is the way to balance accuracy over a wide range of floating point arguments. Note that this test is equivalent (in exact arithmetic) to:
$$ \frac{|\text{guess} - c/\text{guess}|}{\text{guess}} > \epsilon $$
(which should make it apparent that we are comparing the relative error in guess to $\epsilon$), but the form shown at top trades that division for a multiplication. This would typically be faster, if it matters (and it might, since the check runs once per iteration).