[Math] Constructing prime numbers

nt.number-theoryprime numbers

The classical proof of the infiniteness of prime numbers is to take the $k$ first prime numbers $p_1,\ldots,p_k$, then to form
$$n_k:=1+p_1\cdots p_k.$$
Then $n_k$ has a prime factor, which is none of the $p_j$.

Notice that one could form instead the number
$$n_{k,I}:=\prod_{i\in I}p_i+\prod_{j\not\in I}p_j$$
where $I$ is any subset of $[1,\ldots,k]$ (we have $n_k=n_{k,\emptyset}$), or even
$$m_{k,I}:=\prod_{i\in I}p_i-\prod_{j\not\in I}p_j,$$
provided this number is not $\pm1$.

This suggests the following construction of prime numbers $q_k$ by induction. Start with $k=1$ and $q_1=2$. At step $k$, form the numbers
$$M_{k,I}:=\left|\prod_{i\in I}q_i-\prod_{j\not\in I}q_j\right|.$$
Then $q_{k+1}$ is the least of the prime factors of all these numbers (that is, of their product as $I$ runs over the subsets of $[1,\ldots,k]$).

Is it true that $q_k=p_k$ for all $k$ ? If not, does the sequence covers all the prime numbers ? If not, how fast does this sequence increases as $k\rightarrow+\infty$ ?

One might enlarge the construction by taking in account the numbers $M_{k,I}$ and
$$N_{k,I}:=\prod_{i\in I}q_i+\prod_{j\not\in I}q_j,$$
but they are way larger.

Best Answer

You can obtain a complete heuristic picture of what is going, by supposing that quadratic residues (as suggested by Laurent) are the only likely obstruction to getting the next prime. More precisely, if you let $Q_n = \prod_{k \le n} q_k$ be the product of all of the primes discovered so far, then you can let $p_{n+1} = p$ be the first remaining prime such that $Q_n$ is a quadratic residue mod $p$, assuming Denis' first proposal. Or, assuming Denis' second proposal, you can let $p$ be the first remaining prime such that either $Q_n$ or $-Q_n$ is a quadratic residue mod $p$. (Thus it will simply be the next prime if that prime happens to be 3 mod 4.) Or, assuming that you only use the sum formula, you would only check $-Q_n$. Then experimentally, $q_{n+1} = p_{n+1}$ in all three cases.

This describes the sequence that Charles obtained, which I also obtained (at least the first half of it) with a Sage program.

qseq = [2]
for z in range(16):
    attain = set(); Q = prod(qseq)
    prediction = [p for p in list(primes(500)) if not p in qseq and
        (legendre_symbol(-1,p) == -1 or legendre_symbol(Q,p) == 1)][0]

    for S in Combinations(qseq):
        A = prod(S); attain.update([abs(A-Q/A),A+Q/A])

    best = 10000
    attain.discard(1)
    best = min([min(factor(m))[0] for m in attain])
    print prediction,':',qseq,'->',best
    qseq.append(best)

Assuming both the sum and the difference are used, the validity of this prediction reduces to the following: If $p$ is this next prime, then the question is whether $1$ appears in the set product of the subsets $\{q_k,1/q_k\}$ and $\{\pm 1\}$ in the multiplicative group $(\mathbb{Z}/p)^*$. This is a question in additive set theory if you take a formal logarithm and pass to the additive group $\mathbb{Z}/(p-1)$. A reasonable conjecture is that you are convolving a lot of subsets that don't come close to lying in a subgroup of $\mathbb{Z}/(p-1)$, so you can expect $0$ to appear in the sum. The heuristic has such a large margin that it may be possible with existing methods to prove that it is always so.

The other part of the question is how much you deviate from seeing the primes in order. Again, using a randomness heuristic in number theory, the "probability" that a prime $p$ will be skipped at a given stage is 1/2 (assuming that it is 1 mod 4 in the second case). So you expect the first skipped prime to eventually be included with a "half life" of one round. Thus you expect the sequence to only ever be barely ahead of the list of primes, and you expect $q_n$ to be the $n$th prime infinitely often.

Related Question