[Math] Erik Westzynthius’s cool upper bound argument: update

nt.number-theoryreference-requestsieve-theory

Version 2 of this writeup is
available, and includes a newer and simple upper bound thanks to
MathOverflow 88777 as
well as indirect references to future writeups. Details of further work
will be found in these writeups. GRP 2014.06.04.


In a paper of Erik Westzynthius,

Ueber die Verteilung der Zahlen, die zu den n ersten Primzahlen teilerfremd sind,
Comm. Phys. Math. Soc. Sci. Fenn., Helsingfors (5) 25 (1931), 1-37

I saw the following upper bound argument. Having never
studied sieve theory, I was quite impressed by it. The goal is to bound from above
the quantity max $(q_{i+1} – q_i)$, where the $q_i$ are the positive integers in
increasing order which are relatively prime to $P_n$, the product of the first
n primes. Here is a sketch of the argument.

Let $a$ and $x$ be real parameters, with $x > 0$ . Consider the
integers in the open interval $(a, a+x)$, and call this set $H$. Let us look
at the subsets of $H$ consisting of those integers which are a multiple of the
positive integer $t$; call the size of each such subset $I_t$. Step 1 is to use
inclusion-exclusion to
estimate $I_0$, the number of integers in the interval $(a, a+x)$ which are relatively
prime to $P_n$. (I.e. count integers, throw out multiples of 2, throw out multiples
of 3, add in multiples of (2*3) to compensate, etc.) We get

$I_0 = \sum_{t \in R} [I_t * (-1)^{\mu(t)}] $.

Here $R$ is the set of positive integers which are of the form
(warning: sloppy notation) $\prod_{J \subset n} p_j$,
that is all integers whose prime factorizations have only primes less than
or equal to the nth prime, and those occuring only to at most the first power.
(More succinctly, $R$ is also the set of positive divisors of $P_n$.) For
such a number $t \in R$, $\mu(t)$ is precisely the number of prime factors in $t$,
and $\mu$ is chosen to suggest the Moebius function whose value at such $t$ is
$(-1)^{\mu(t)}$. The equality is exact.

Step 2 is to replace $I_t$ with a linearized approximation plus an error term which
I will call $E(t)$. This substitution gives:

$I_0 = \sum_{t \in R} [ (E(t) + x/t) * (-1)^{\mu(t)} ]$ .

Since the number of multiples of $t$ in the interval $(a, a + x)$ is roughly $x/t$,
the error term $E(t)$ is bounded in absolute value by $1$. Step 3 will rewrite
the RHS and estimate it pessimistically: $E(t) * (-1)^{\mu(t)}$ will be replaced
by $-1$, and the alternating sum of $x/t$ terms can be rewritten as a product
involving terms of the form $(1 – 1/p_i)$, where $p_i$ is the $i$th prime. There are
$2^n$ terms of the form $E(t)$, so one gets:

$I_0 \geq [x * \prod_{1 <= i <= n} (1 – 1/p_i) ] – 2^n = x/Q – 2^n$ .

Here $Q$ is an abbreviation for $1$ divided by the product of the n terms $(1 – 1/p_i)$.
It is roughly log n for large n. Here comes the kicker. Step 4
notes that steps 1 through 3 are essentially independent of $a$, and if $x$ can be
chosen so that $x/Q – 2^n > 0$, then $I_0 > 0$ which means at least one of the $q_i$
is in the interval $(a, a+x)$ when $a > 0$, and such $x$ would be an upper bound for
$q_{i+1} – q_i$ which is independent of $i$. So choose $x = Q * 2^n$ plus epsilon.

I thought it a neat enough argument (especially the kicker)
that I am sharing it here with other non-students of sieve theory. Now to the questions.

1) Is there any work done which improves the upper bound for $q_{i+1} – q_i$?
The answer to this is yes, since in a footnote Westzynthius shows how to
improve the bound to $Q * 2^{n-1}$ by counting odd multiples. So I really want
to know if there are even better bounds out there, done by additional researchers.
I would expect a provable bound to be $Q * 2^g$, where $g$ is something like a
polynomial in log(n), but even having $g$ be n to a fractional power would be something.

2) Is there work done which uses something like the Bonferroni inequalities to
improve the above argument?

3) Did Westzynthius publish any other work (possibly nonmathematical) besides the
paper that includes the argument above?

Motivation: I am considering improvements to this argument which do establish
better upper bounds, and am wondering how to push the exponent from n – o(1)
down to poly(log(n)). Especially, I want to know if I am rediscovering how
to replace n by cn for some $c < 1$, as opposed to discovering how to do it.

Gerhard "Ask Me About System Design" Paseman, 2010.09.03

Best Answer

So it turns out that sieve theory already has some results that can give bounds for this problem. It took me a while to track down a paper that gives an explicit result, but I finally found this one.

Using the Bonferroni inequalities gives you what is called "Brun's pure sieve". According to Wikipedia, Brun's pure sieve gives a bound of the form $n^{b\log\log(n)}$ for some $b$.

According to the result in the paper, using the full power of Brun's sieve (which involves splitting up the primes we are sifting by into buckets of primes based on their sizes, throwing out terms of the inclusion-exclusion rule which have too many primes from the large buckets and then applying an inequality similar to the Bonferroni inequalities), we get that the number of numbers in an interval of length $x$ that are relatively prime to the first $n$ primes is at least

$x(\prod_{p\le p_n}(1-\frac{1}{p}))(1-2\frac{\lambda^{2b}e^{2\lambda}}{1-(\lambda e^{1+\lambda})^2}(1+o(1)))+O \left(p_n^{2b-1+\frac{2}{e^{2\lambda}-1}+\epsilon}\right)$,

where $b$ is an integer we get to choose, $\lambda$ is a positive real number we get to choose, and $\epsilon$ is greater than $0$.

If we plug in $b = 1$ and $\lambda = 0.2533$, we get that for sufficiently large $n$, any interval of size $O(n^{4.032})$ contains a number relatively prime to the first $n$ primes.

You can probably get better bounds with other sieves.

As far as what the best bound really is, I'm going to conjecture that the longest stretch of numbers that are not relatively prime to the first $n$ primes always has length less than $2p_n$. Computer search shows that this is true for $p_n \le 31$, and the worst case example for primes up to $31$ looks like this:

X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X.X
.X..X..X..X..X..X..X..X..X..X..X..X..X..X..X..X..X..X..X.
...X....X....X....X....X....X....X....X....X....X....X...
X......X......X......X......X......X......X......X......X
......X..........X..........X..........X..........X......
..X............X............X............X............X..
...........X................X................X...........
.........X..................X..................X.........
.....X......................X......................X.....
...........................X............................X
.............................X...........................

where the columns represent numbers, the rows represent primes, and an X means that the number corresponding to the column has been sieved out by the prime corresponding to the row.

Related Question