[Math] Find the sum of all primes smaller than a big number

computational complexity, number theory

I need to write a program that calculates the sum of all primes smaller than a given number $N$
($10^{10} \leq N \leq 10^{14} $).

Obviously, the program should run in a reasonable time, so $O(N)$ is not good enough.
I think I should find the sum of all the composite numbers smaller than $N$ and subtract it from $1+2+\dots+N$, but I have been trying that for a long time with no progress.

Best Answer

Lucy_Hedgehog's post on Project Euler forum about problem 10 (visible after logging in):

Here is a solution that is more efficient than the sieve of Eratosthenes. It is derived from similar algorithms for counting primes. The advantage is that there is no need to find all the primes to find their sum.

The main idea is as follows: Let $S(v,m)$ be the sum of the integers in the range $2..v$ that remain after sieving with all primes less than or equal to $m$. That is, $S(v,m)$ is the sum of the integers up to $v$ that are either prime or a product of primes larger than $m$.

$S(v, p)$ is equal to $S(v, p-1)$ if $p$ is not prime or $v$ is smaller than $p \cdot p$.
Otherwise ($p$ prime, $p \cdot p \leq v$) $S(v,p)$ can be computed from $S(v,p-1)$ by finding the sum of integers that are removed while sieving with $p$. An integer is removed in this step if it is the product of $p$ with another integer that has no divisor smaller than $p$. This can be expressed as

$S(v,p) = S(v,p-1) - p\left(S\left(\lfloor \frac{v}{p} \rfloor, p-1\right) - S(p-1,p-1)\right).$
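The update rule can be sanity-checked against the definition of $S(v,m)$ for small arguments. The following is only a brute-force sketch: the helper names `smallest_prime_factor`, `S_brute` and `recurrence_holds` are made up for this illustration, and the division $v/p$ is taken as integer (floor) division, as in the Python code of the answer.

```python
def smallest_prime_factor(k):
    """Smallest divisor of k that is >= 2 (equals k when k is prime)."""
    return next(d for d in range(2, k + 1) if k % d == 0)


def S_brute(v, m):
    """Sum of the integers in 2..v that are prime or have no prime factor <= m."""
    total = 0
    for k in range(2, v + 1):
        spf = smallest_prime_factor(k)
        if spf == k or spf > m:
            total += k
    return total


def recurrence_holds(limit):
    """Check the update rule for every pair 2 <= p <= v <= limit."""
    for v in range(2, limit + 1):
        for p in range(2, v + 1):
            p_is_prime = smallest_prime_factor(p) == p
            if p_is_prime and p * p <= v:
                # Numbers removed by p: p times the survivors of the previous
                # step up to v // p, excluding the primes smaller than p.
                removed = p * (S_brute(v // p, p - 1) - S_brute(p - 1, p - 1))
            else:
                removed = 0
            if S_brute(v, p) != S_brute(v, p - 1) - removed:
                return False
    return True


print(recurrence_holds(40))
```

Here `S_brute(p - 1, p - 1)` is the sum of the primes smaller than $p$, which is the quantity the Python code in the answer stores in `sp`.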

Dynamic programming can be used to implement this. It is sufficient to compute $S(v,p)$ for all positive integers $v$ that are representable as $\lfloor \frac{n}{k} \rfloor$ for some integer $k$ and all $p \leq \sqrt{v}$.
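To see why this keeps the table small, one can enumerate the distinct values of $\lfloor \frac{n}{k} \rfloor$ directly. The sketch below is an illustration only (the function name `floor_quotients` is an assumption, not part of the original post): there are at most $2\sqrt{n}$ such values, and the set is closed under integer division, so every $S(\lfloor v/p \rfloor, \cdot)$ needed by the recurrence is itself in the table.

```python
from math import isqrt


def floor_quotients(n):
    """Distinct values of n // k for k = 1..n, in decreasing order."""
    values = []
    k = 1
    while k <= n:
        q = n // k
        values.append(q)
        k = n // q + 1  # smallest k that gives a quotient below q
    return values


n = 100
qs = floor_quotients(n)
print(qs)
print(len(qs), "values, bound 2*sqrt(n) =", 2 * isqrt(n))
```

The closure property follows from $\lfloor \lfloor n/a \rfloor / b \rfloor = \lfloor n/(ab) \rfloor$ for positive integers $a$ and $b$.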

Python:
def P10(n):
    r = int(n**0.5)
    assert r*r <= n and (r+1)**2 > n
    V = [n//i for i in range(1,r+1)]
    V += list(range(V[-1]-1,0,-1))
    S = {i:i*(i+1)//2-1 for i in V}
    for p in range(2,r+1):
        if S[p] > S[p-1]:  # p is prime
            sp = S[p-1]  # sum of primes smaller than p
            p2 = p*p
            for v in V:
                if v < p2: break
                S[v] -= p*(S[v//p] - sp)
    return S[n]
The complexity of this algorithm is about $O(n^{0.75})$, and it needs 9 ms to find the solution to Problem 10 ($n = 2 \cdot 10^6$). Computing the sum of primes up to different bounds $n$ gives:

n = $2 \cdot 10^7$: 12272577818052 0.04 s
n = $2 \cdot 10^8$: 1075207199997334 0.2 s
n = $2 \cdot 10^9$: 95673602693282040 1 s
n = $2 \cdot 10^{10}$: 8617752113620426559 6.2 s
n = $2 \cdot 10^{11}$: 783964147695858014236 34 s
n = $2 \cdot 10^{12}$: 71904055278788602481894 3 min

I also have a C++ version of this algorithm. This one solves the problem in 700 μs. It needs 10 hours to compute the sum of primes up to $10^{17}$, which is 129408626276669278966252031311350.

It is also possible to improve the complexity of the algorithm to $O(n^{2/3})$, but the code would be more complex.

I have checked this with pen and paper and with Python; as far as I can tell, it works as described:

>>> P10(2*10**10)
8617752113620426559 in 8 s

>>> P10(2*10**11)
783964147695858014236 in 45 s

>>> P10(2*10**12)
71904055278788602481894 in 3 min 19 s
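For small bounds, results like these can be cross-checked independently with a plain sieve of Eratosthenes. This is a minimal sketch for verification only, not the fast method described above; the name `prime_sum_sieve` is made up here, and memory use grows linearly with $n$, so it is not suitable for the range in the question.

```python
def prime_sum_sieve(n):
    """Sum of all primes <= n, using a plain sieve of Eratosthenes."""
    if n < 2:
        return 0
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out the multiples of p, starting at p*p.
            count = len(range(p * p, n + 1, p))
            is_prime[p * p :: p] = bytearray(count)
        p += 1
    return sum(i for i in range(2, n + 1) if is_prime[i])


print(prime_sum_sieve(2 * 10**6))
```

Comparing `prime_sum_sieve(n)` with `P10(n)` over a range of small `n` is a cheap regression test. Note that both include `n` itself when it is prime, whereas the question asks for primes strictly smaller than $N$, which corresponds to calling them with $N-1$.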

Related Solutions

[Math] Prove that every integer $n\geq 7$ can be expressed as a sum of distinct primes.

We shall inductively prove a stronger form, namely that every positive integer $n \ge 7$ can be written as the sum of distinct primes such that the largest is at most $\max(11,n-7)$. It turns out that strengthening makes the induction work!

Take $n \ge 28$.

Let $m = \lceil \frac{n-6}{2} \rceil= \lfloor\frac{n-5}{2} \rfloor$.

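Let $p$ be a prime such that $m+1 \le p \le 2m-3$; one exists by Bertrand's postulate in the form: for every $m > 3$ there is a prime $p$ with $m < p < 2m-2$ (here $m \ge 11$).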

Then $\frac{n-5}{2} \le p \le n-7$.

By the induction hypothesis, $n-p$ (which satisfies $7 \le n-p < n$) can be written as a sum of distinct primes such that the largest is at most $\max(11,n-p-7)$, which is less than $p$ because $p \ge \frac{28-5}{2} > 11$ and $2p \ge n-5 > n-7$.

Thus $n$ can be written as a sum of distinct primes such that the largest, $p$, is at most $n-7 \le \max(11,n-7)$, and the induction holds as long as the claim is true for every $n$ from $7$ to $27$.

7 = 5+2
8 = 5+3
9 = 7+2
10 = 5+3+2
11 = 11
12 = 7+5
13 = 11+2
14 = 7+5+2
15 = 7+5+3
16 = 11+5
17 = 7+5+3+2
18 = 11+7
19 = 11+5+3
20 = 11+7+2
21 = 11+7+3
22 = 13+7+2
23 = 13+7+3
24 = 13+11
25 = 13+7+5
26 = 13+11+2
27 = 13+11+3
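The induction is constructive, so it can be turned into a short program. The sketch below is an illustration under stated assumptions: the names `BASE`, `is_prime` and `distinct_prime_sum` are made up here, the base cases are copied from the list above, and for $n \ge 28$ the code simply takes the largest prime $p$ with $\frac{n-5}{2} \le p \le n-7$ (the proof allows any prime in that range).

```python
BASE = {
    7: [5, 2], 8: [5, 3], 9: [7, 2], 10: [5, 3, 2], 11: [11],
    12: [7, 5], 13: [11, 2], 14: [7, 5, 2], 15: [7, 5, 3],
    16: [11, 5], 17: [7, 5, 3, 2], 18: [11, 7], 19: [11, 5, 3],
    20: [11, 7, 2], 21: [11, 7, 3], 22: [13, 7, 2], 23: [13, 7, 3],
    24: [13, 11], 25: [13, 7, 5], 26: [13, 11, 2], 27: [13, 11, 3],
}


def is_prime(k):
    """Trial division; adequate for the small values used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def distinct_prime_sum(n):
    """Distinct primes summing to n (n >= 7), largest first."""
    if n < 7:
        raise ValueError("n must be at least 7")
    if n <= 27:
        return list(BASE[n])
    lowest = (n - 4) // 2  # ceiling of (n - 5) / 2
    for p in range(n - 7, lowest - 1, -1):
        if is_prime(p):
            # The parts of n - p are all smaller than p by the induction.
            return [p] + distinct_prime_sum(n - p)
    raise AssertionError("no prime found in the expected range")


print(distinct_prime_sum(100))
```

Because $n-p \le \frac{n+5}{2}$, the argument roughly halves at each step, so the recursion depth is logarithmic in $n$.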

Numbers which are sums of all smaller primes

Let $\Psi(n)$ denote the sum of the primes less than $n$. We contend that $n>5$ implies $\Psi(n)>n$. That is clearly stronger than the desired result.

It is easy to confirm that this is true for the first few $n$. We then proceed by induction.

Suppose $N$ is the least counterexample. Then, in particular, the claim must be true for $\Big \lfloor \frac N2\Big \rfloor$ (we may assume that we have checked far enough so that $\frac N2>6$). Since $\Psi$ takes integer values, $\Psi\left(\Big \lfloor \frac N2\Big \rfloor\right) > \Big \lfloor \frac N2\Big \rfloor$ implies $$\Psi\left(\Big \lfloor \frac N2\Big \rfloor\right)>\frac N2$$

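By Bertrand's postulate there is a prime $p$ between $\frac N2$ and $N$. This prime is counted in $\Psi(N)$ but not in $\Psi\left(\Big \lfloor \frac N2\Big \rfloor\right)$, so we have $$\Psi(N)\ge\Psi\left(\Big \lfloor \frac N2\Big \rfloor\right)+p>N,$$ which contradicts the choice of $N$, and we are done.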
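The "first few $n$" that the induction relies on can be confirmed numerically. The following is a small sketch (the function name `psi_table` and the bound of 1000 are arbitrary choices for illustration) that tabulates $\Psi(n)$ with a sieve and checks both the inequality and which $n$ equal the sum of all smaller primes.

```python
from math import isqrt


def psi_table(limit):
    """Return a list psi with psi[n] = sum of the primes < n, for 0 <= n <= limit.

    Assumes limit >= 2.
    """
    is_prime = [False, False] + [True] * (limit - 1)
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    psi = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Primes below n are the primes below n - 1, plus n - 1 if it is prime.
        psi[n] = psi[n - 1] + (n - 1 if is_prime[n - 1] else 0)
    return psi


psi = psi_table(1000)
print(all(psi[n] > n for n in range(6, 1001)))
print([n for n in range(2, 1001) if psi[n] == n])
```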
