Let $a,b,n,q$ all be odd numbers, and let $p_N$ be the $N$th odd prime, throughout.
Formula for the number $H(a,b,n) = \# \{a \lt x \lt b : x^2 – 1 = 0 \pmod n, x \text{ even}\}$. This is true since the pairs are in one-to-one correspondence with the averages of the two pair components.
Therefore, by inclusion-exclusion, we get the number $G(a,b) = \# \{ a \lt x \lt b : x^2 – 1 = 0 \pmod q$ for some $q \in Q\}$, where $Q$ is a finite set of primes. We have:
$$
G(a,b) = \sum_{u \ \mid \ \prod Q \\ u \neq 1} (-1)^{\Omega(u)}H(a,b,u)
$$
where $\Omega$ is prime Omega from number theory or $\Omega(p_1^{k_1} \cdots p_r^{k_r}) = \sum_{i = 1}^r k_i$. But in this case, since $Q$ consists of distinct primes we have $\Omega(u) = \omega(u)$ so you can use either.
And therefore, the number $F(a,b) = \# \{ a \lt x \lt b: x^2 – 1 \text{ is a unit } \pmod q, \forall q \in Q\}$ is given by:
$$
F(a,b) = \dfrac{b – a}{2} + G(a,b)
$$
Since for $a,b$ odd there are $\dfrac{b – a}{2}$ total pairs of consecutive odd numbers $o, o+2 \in [a,b]$.
Now, if we let $a = p_{N}, b = p_{N}^2$, then no prime $q\lt p_N$ divides the end points and so $H(p_N, p_{N+1}^2, u)$ has the surprisingly simple form (for $u$ comprised of primes $q \lt p_N$):
$$
H(a,b,u) = 2 \left( \lfloor\dfrac{b}{u}\rfloor – \lfloor\dfrac{b}{2u} \rfloor -\lfloor\dfrac{a-1}{u}\rfloor + \lfloor\dfrac{a – 1}{2u} \rfloor\right)
$$
Let $Q = $ the first $N-1$ odd primes. Then $F(p_N, p_N^2)$ should count the number of twin prime averages in the interval $(p_N, p_N^2)$. However, here is some code that performs this inclusion-exclusion formula:
#import matplotlib.pyplot as plt
from sympy import primepi, isprime, primerange, divisors, primeomega
from math import floor, sqrt
N = 1000
def H(a,b,u):
res = 2*(floor(b/u) - floor(b/(2*u)) - floor((a-1)/u) + floor((a-1)/(2*u)))
return res
def inclusion_exclusion(F, Q):
prodQ = 1
for q in Q:
prodQ *= q
sum = 0
for u in divisors(prodQ):
if u != 1:
sum += (-1)**primeomega(u) * F(u)
return sum
primes = list(primerange(3, N))
for n in range(1, len(primes) + 1):
Q = primes[:n]
p_n = primes[n]
formula0 = (p_n**2 - p_n)//2 + inclusion_exclusion(lambda u: H(p_n, p_n**2, u), Q)
#formula = (p_N**2 - p_N) - (primepi(p_N**2) - primepi(sqrt(p_N - 1)) - 1)
actual_count = 0
for x in range(p_n, p_n**2):
if isprime(x - 1) and isprime(x + 1):
actual_count += 1
print(formula0 - actual_count)
However, the code is printing a growing error value. It has to be all $0$'s for the formula to be correct.
I cannot figure out where my math is going wrong. I thought the code was simple enough so I've included it here.
So, I need some help with the math of it. Did I correctly apply inclusion-exclusion, for example.
Here is a fully worked example to help us out:
$$
a = 7, b = 7^2, Q = \{3,5\} \\
H(a,b,3) = 2([49/3] -[49/6] – [6/3] + [6/6]) = 14\\
H(a,b,5) = 2([49/5] -[49/10] – [6/5] +[6/10]) = 8 \\
H(a,b,15) = 2([49/15] – [49/30] – [6/15] + [6/30]) = 4 \\
G(a,b) = H(a,b,3) + H(a,b,5) – H(a,b,15) = 18 \\
F(a,b) = \dfrac{b – a}{2} + G(a,b) = 3 \\
$$
This formula is not working, as clearly the number should be $\#\{12,18,30,42\} = 4$. So it is off by $1$ in this case. It is close, but no cigar. I can't see where the problem is in my derivation.
I think a problem can be seen here:
$$
H(5,25,3) = 2([25/3] – [25/6] – [4/3] + [4/6]) = 2(8 – 4 – 1 + 0) = 6
$$
.......
_____ _____ ________ ________ ........
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
_____ _______ ________ _______
...... ........
____'s count the odd pairs divided by $3$. …..'s count those that are not.
True, there are 6 ____'s between 5 and and 25, but 24 is not a twin prime average.
So the formula should be:
$$
H(a, b, u) = 2\left(\lfloor \dfrac{b}{u}\rfloor – \lfloor\dfrac{b}{2u}\rfloor – \lfloor \dfrac{a-1}{u}\rfloor + \lfloor\dfrac{a-1}{2u} \rfloor\right) + \begin{cases} 0 \text{ if }, p_N^2 = 1 \pmod u \\ 1 \text{ otherwise}\end{cases}
$$
whenever $a = p_N, b = p_N^2$, $u$ is comprised of the first $N-1$ odd primes.
Best Answer
I looked at the overcount for the case $a = 7, b = 49$ and it turns out that $H(a,b,15)$ doesn't count odd pairs $(25,27)$ nor $(33,35)$ which explains why the count is off by two when I run the up-to-date code.
The reason is because:
$$ (x^2 = 1 \pmod 3) \wedge (x^2 = 1 \pmod 5) \iff \\ (x = \pm 1 \pmod 3) \wedge(x = \pm 1 \pmod 5) \iff \\ (x^2 = 1 \pmod {15}) \\ \vee (x = -1 \pmod {3} \wedge x = 1 \pmod {5}) \\\vee (x = 1 \pmod {3} \wedge x = -1 \pmod {5}) $$
$H(a,b,15)$ only counts the first part, the parts where $x = -1 \pmod {3}$ and $x = 1 \pmod {5}$ or vise versa (swapping $3, 5$) are not counted and those are precisely the pairs $(25,27)$ and $(33,35)$.
The good news is that those $\vee$'s are exclusive or's so once we come up with a formula for the rest we can add/subtract it in. The bad news is I'm not sure how to count those, for general $q_1\cdots q_n$. It might be easier, knowing each $q_i$ is distinct.