If you are using IEEE 754 compliant floating-point numbers, square root operations may be faster than you suppose: the square root operation is directly supported by the standard, compliant hardware is required to round it correctly (unlike sine and cosine), and it is often implemented in hardware using the Newton-Raphson method Alijah described.
That said, the algorithm you linked to uses only the fractional part of the square root to calculate pixel opacity, so the final opacity value ranges from 0 to 255. With such a small range, floating-point numbers may be overkill, and a fixed-point integer representation might work better. If the output range is truly only a byte and the maximum size of your radii isn't too large, you can use a look-up table indexed by a fixed-point value to skip the expensive square root calculation entirely. A 16-bit fixed-point index gives a 64 KB look-up table, which isn't too bad.
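As a sketch of the look-up-table idea (the 8.8 fixed-point split, the names, and the table layout are my assumptions, not taken from the linked algorithm):

```python
import math

FRAC_BITS = 8                  # assumed 8.8 fixed-point split (16-bit index)

# 64 KB table: the index is a squared distance in 8.8 fixed point; the
# entry is the fractional part of its square root scaled to 0..255.
TABLE = bytearray(1 << 16)
for i in range(1 << 16):
    value = i / (1 << FRAC_BITS)        # decode fixed point to a real value
    TABLE[i] = int((math.sqrt(value) % 1.0) * 255)

def opacity(d2_fixed: int) -> int:
    """Opacity byte for a squared distance already in 8.8 fixed point."""
    return TABLE[d2_fixed & 0xFFFF]
```

The table is built once up front; each per-pixel lookup is then a single indexed load with no square root at all.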
You might also be able to avoid a division and a square root operation when calculating the $45^\circ$ value (ffd in the algorithm) by using the Fast Inverse Square Root hack.
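For reference, here is a Python port of that classic bit trick (the function name is mine; note that on modern hardware a native reciprocal-square-root instruction may well beat it):

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) via the famous 0x5F3759DF bit trick."""
    # Reinterpret the 32-bit float's bits as an integer...
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    # ...shift and subtract from the magic constant to get a first guess...
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    # ...and refine with one Newton-Raphson step (error under ~0.2%).
    return y * (1.5 - 0.5 * x * y * y)
```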
As for whether there is a method to calculate the fractional part of a square root knowing only the integer part: there is one approach that iteratively calculates a square root while minimizing divisions, namely continued fractions. The simplest variant I know of with fast convergence is detailed here and works as follows:
$a_0 = x - \lfloor\sqrt x\rfloor^2$
$b_0 = 2\lfloor\sqrt x\rfloor$
$a_n = a_0b_{n-1}$
$b_n = b_0b_{n-1} + a_{n-1}$
which gives you successively better rational approximations to the fractional part of $\sqrt x$; you perform the division $\frac{a_n}{b_n}$ only once you've done enough iterations to ensure accuracy. If you ultimately need only byte-sized opacity values, 3 or 4 iterations should suffice. Deferring the single division to the end is the only significant difference between this and the Newton-Raphson method, apart from the fact that it gives you the fractional part directly.
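A minimal integer-only sketch of that recurrence, scaling the result to a byte (the function name, iteration count, and scaling are my assumptions):

```python
import math

def sqrt_frac_byte(x: int, iterations: int = 4) -> int:
    """Fractional part of sqrt(x) scaled to 0..255, integer math only."""
    m = math.isqrt(x)          # integer part of the root
    a0 = x - m * m             # a_0
    b0 = 2 * m                 # b_0
    if a0 == 0:
        return 0               # x is a perfect square, no fractional part
    a, b = a0, b0
    for _ in range(iterations - 1):
        # a_n = a_0 * b_{n-1},   b_n = b_0 * b_{n-1} + a_{n-1}
        a, b = a0 * b, b0 * b + a
    return (a * 255) // b      # the single division, deferred to the end
```

For example, `sqrt_frac_byte(2)` approximates $0.41421\ldots \times 255$, matching the floating-point result to the byte.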
If you really want to push incremental calculation as far as it can go, you can use Gosper's continued fraction algorithms (see especially the section on square roots) and compute every variable involved as a continued fraction, one term at a time. This lets you avoid square roots and all divisions other than bit shifts, and abort as soon as you know the correct pixel opacity (or whatever else you're calculating) without wasting time on digits you don't need. However, it requires a serious overhaul of the algorithm you linked, and if I went into detail this answer would turn into a book.
So, in short: if you have memory to spare, the maximum length of your radii isn't huge, and your output range is small, go with a fixed-point look-up table. If you want simplicity of implementation, go with plain floating-point or fixed-point arithmetic. If you want to avoid square root calculation entirely without a look-up table, go with Newton-Raphson or the continued fraction variant of it. If you want to minimize wasted computation at the cost of some up-front overhead, go with Gosper's continued fraction algorithms.
In order to show that $99^{99}$ has $198$ digits, we need to show that
$$10^{197}\le99^{99}\lt10^{198}$$
The second inequality is obvious, since $99^{99}\lt100^{99}=10^{198}$. So it remains to prove the first inequality.
Toward this end, recall that
$$\left(1+{1\over n}\right)^n\lt e$$
for any $n\ge1$. We'll also take for granted the (generous!) inequality $e\lt10$. Thus
$$\left(100\over99\right)^{99}=\left(1+{1\over99}\right)^{99}\lt e\lt10$$
so $100^{99}\lt10\cdot99^{99}$. Since $100^{99}=10^{198}=10\cdot10^{197}$, the inequality $10^{197}\lt99^{99}$ now follows.
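If you have an arbitrary-precision calculator at hand, the claim is also easy to verify directly; for instance, in Python:

```python
# Direct check of the bound 10^197 <= 99^99 < 10^198 with exact integers.
n = 99 ** 99
assert 10 ** 197 <= n < 10 ** 198
print(len(str(n)))   # prints 198, the number of decimal digits
```

Of course, the point of the proof above is that no such computation is needed.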
Best Answer
For the solution of $x!=n$, @Gary proposed a superb approximation (have a look here)
$$x = \frac{\log\left(\frac{n}{\sqrt{2\pi}}\right)}{W\left(\frac{1}{e}\log\left(\frac{n}{\sqrt{2\pi}}\right)\right)} - \frac{1}{2}$$ where $W(\cdot)$ is the Lambert $W$ function.
Let us rewrite it as $$x=e \frac t{W(t)}-\frac{1}{2} \qquad \text{with} \qquad t=\frac 1e\log \left( {\frac{n}{{\sqrt {2\pi } }}} \right)$$
Now, for large values of $t$, use the asymptotic expansion given on the Wikipedia page, $$W(t) \sim L_1-L_2+\frac{L_2}{L_1}+\frac{L_2(L_2-2)}{2L_1^2}+\frac{L_2(2L_2^2-9L_2+6)}{6L_1^3}+\cdots$$ where $L_1=\log(t)$ and $L_2=\log(L_1)$.
Using my $50+$ year old non-programmable pocket calculator, for $$n=123456789876543212345678987654321234567898765432123456789$$ I get $$t=47.1756\qquad L_1=3.85388\qquad L_2=1.34908$$ Using only the first three terms of the expansion, $$W(t)\sim3.85388-1.34908+\frac{1.34908}{3.85388}=2.85485$$ $$x\sim e \frac{47.1756}{2.85485}-\frac 12=44.4188$$ while the exact real solution is $45.0083$.
If you want an integer solution, you would of course take $\lceil x \rceil$, and the conclusion is the same. $$45!=119622220865480194561963161495657715064383733760000000000$$ $$46!=5502622159812088949850305428800254892961651752960000000000$$ Using the gamma function, $$44.4188!=13053540092571206188366309387719398631004867852125601792$$ $$45.0083!=123456789876545254267360504092794809714819815437028556800$$
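The hand calculation above is easy to script. Here is a short sketch (the helper name is mine, and it truncates the series after three terms, exactly as in the worked example):

```python
import math

def inverse_factorial(n: int) -> float:
    """Approximate the real x with x! = n, via x = e*t/W(t) - 1/2."""
    t = (math.log(n) - 0.5 * math.log(2 * math.pi)) / math.e
    L1 = math.log(t)
    L2 = math.log(L1)
    w = L1 - L2 + L2 / L1      # first three terms of the series for W(t)
    return math.e * t / w - 0.5
```

Running it on the $n$ above reproduces $x\approx44.4188$; adding further terms of the series for $W$ moves the estimate toward the exact $45.0083$.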