I am stuck with the sieving stage of Quadratic Sieve algorithm. I've read lots of papers to this point but I can't find any guidlines how to choose sieving interval or how sieving is actually done because what I've seen so far is 'we pick a sieving interval' or 'then sieve around this'.
This is what I've understood and have done (following Contini's paper):
-
Say, for N = 90283, I compute bound $B = e^{(\frac{1}{2} + o(1))(\sqrt{\ln n \ln \ln n})} = 44$, where I take $o(1) = 0.22$ (just a random guess, I assume). What is the best way to pick good $o(1)$?
-
Then I take $C = \lceil \sqrt{N}\rceil = 301$
-
Then using Atkin Sieve I get a list of primes $<B$:
$\{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43\}$
-
Then by computing Jabobi (Legendre) symbol for each value in the primes list I pick the first quadratic nonresidues to get factor base:
$\{2, 3, 7, 17, 23, 29, 37, 41\}$
-
Then using Tonelli-Shanks algorithm I compute modular roots $\pm t$, where $t$ is a solution to $t^2 \equiv N \mod{p}$ with $p$ a prime from factor base.
-
Then I get two arrays of solutions $sol1 = t – C \mod p$ and $sol2 = – t – C \mod p$, with p's from factor base. and also get one arrays of logarithms for p's $l_p = \ln p$ rounded up to some precision, say two decimal digits. What is a good precision?
sol1 $= \{0, 0, 2, 13, 11, 26, 10, 28\} $
sol2 $= \{0, 1, 5, 14, 8, 10, 17, 26\} $
$l_p$ $= \{0.69, 1.1, 1.95, 2.83, 3.14, 3.37, 3.61, 3.71\}$
Now as far as I understand I have to initialize 'sieving_array' initialized to 0's, say, to size $ = 60$ elements. Also, how should I choose size of the sieving interval? Is there any formula similar to the bound?
Then I have to add values from $l_p$ to sieving array to positions $sol1[j] + i * $factor_base[j] and $sol1[j] + i *$ factor_base[j], where $ 0 \le i \le $ size and $ 0 \le j \le |$factor_base$|$. And for prime $p=2$ add $l_p$ only to positions with sol1. So I get this list (rounded to two decimal digits):
$\{1.79, 1.1, 2.64, 1.1, 1.79, 1.95, 1.79, 1.1, 3.83, 3.05, 8.77, 3.14, 3.74, 3.93, 3.52, 1.1, 3.74, 3.61, 1.79, 3.05, 0.69, 1.1, 1.79, 1.95, 1.79, 1.1, 9.72, 1.1, 5.5, 0.0, 6.57, 7.07, 0.69, 3.05, 4.93, 0.0, 1.79, 3.05, 0.69, 4.47, 3.74, 0.0, 1.79, 1.1, 2.64, 1.1, 1.79, 8.39, 4.62, 1.1, 0.69, 3.05, 1.79, 0.0, 10.49, 4.47, 0.69, 4.24, 3.74, 0.0\}$
Now, I have to look through values in the list, and if some of them are at least $\log(2x \sqrt N)$ this means polynomial $g(x)$ can be factored over factor base.
My main question is: where does this $x$ come from? I have this list of values, but what do I do with it? If I have an element in this list, for example, $1.79$ – with what $x$ do I have to compare it? How can I pick a good error term? Is $x$ position of $1.79$? If it is, what value indicated by this $1.79$ is factorable over factor base?
Best Answer
You've covered a lot of material in your question but I will focus on your main sticking point: determining a cutoff for use after sieving. There are two issues that make choosing the cutoff problematic:
If you choose a cutoff larger than the square root of the largest number, none of the candidates will pass the test and sieving will be pointless. Also, since you are adding a crude estimate of the log of the factor, need also need some slack to account for, say, a number composed of factors that all round down. Finally, if prime $p$ is in your factor base and you don't sieve for prime powers $p^a$, you need even more lattitude to catch numbers that are not squarefree.
From a theoretical point of view, the sieving is just an optimization to avoid trial dividing all the numbers in the interval. So the choice of the cutoff is to a large extent an implementation decision:
The ideal value of the cutoff therefore depends on exactly how fast sieving is and how fast trial dividing is. The solution is to try a variety of values and see what works best in your particular implementation.