Prime Counting Function – Is $1 = \sum_{n=1}^{\infty} \frac{\pi(p_n^2)-n+2}{p_n^3-p_n}$?

analytic-number-theoryfractalsnt.number-theoryprime numbers

Is
$$1 = \sum_{n=1}^{\infty} \frac{\pi(p_n^2)-n+2}{p_n^3-p_n},$$
where $\pi$ denotes the prime counting function and $p_n$ denotes the $n$-th prime?

Context:

This question came out as a result in trying to answer this previous question by counting the size of the blocks on the main diagonal of the fractal:
How to define a fractal from the lexicographic sorting on the prime factorization of natural numbers?

I have changed the sorting a little bit, without changing the appearance of the fractal:

fractal_on_prime_numbers

The sorting is based on the prime factorization, but now we allow the primes to be counted with multiplicity:

If $m = p_1 p_2 \cdots p_r, n = q_1 q_2 \cdots q_s$ with $p_1 \le p_2 \le \cdots \le p_r, p_i$ prime, and similarily $q_1 \le q_2 \le \cdots \le q_s$ then:

$$m \le n :\iff [p_1,p_2,\cdots,p_r] \le [q_1,q_2,\cdots,q_s]$$

where the right hand side denotes the lexicographic ordering of the two lists.

By sorting the numbers $1,\cdots,n$ and careful counting, we find that the size of the blocks on the main diagonal of the fractal are given by the numbers $a_q(n)$, which are defined as follows, and seem to obey the following formula:

Let $a_q(n):=a_{n,q}$ denote the number of $1 \le k \le n$ such that the minimal prime divisor of $k$ is $q$. Let $\DeclareMathOperator\np{np}\np(n)$ denote the smallest prime $p\ge n+1$.
$$
a_{n, q} =
\begin{cases}
0 & \text{if } 0 \leq n \leq q – 1 \\
1 & \text{if } q \leq n \leq q^2 – 1 \\
2 & \text{if } q^2 \leq n \leq q \cdot \text{np}(q) – 1 \\
\pi(\np(\lfloor n/q \rfloor)) – \pi(q) + 1 & \text{if } q \cdot \np(q) \leq n \leq q^3 – 1 \\
a_{n – q^3 + q, q} + \pi(q^2) – \pi(q) + 2 & \text{otherwise.}
\end{cases}
$$

Using this (half proven) formula, we can guess what the limit is:

$$\lim_{n\rightarrow \infty} \frac{a_p(n)}{n} = \frac{\pi(p^2)-\pi(p)+2}{p^3-p}.$$

From this guessing we deduce the question above.

I am asking if there is some other proof of the formula above, or if the formula is wrong?

Edit:
As pointed out by fedja and GHfromMO in the comments, the reason why the formula above is not correct, is that we have instead:

$$\lim_{n \rightarrow \infty} \frac{a_p(n)}{n} = \frac{1}{p}\prod_{q<p,q\text{ prime}}(1-\frac{1}{q})$$

Since we have (because the set $\{1,\cdots,n\}$ is divided by the primes $p \le n$ in pairwise disjoint sets of size $a_p(n)$):

$$1 = \lim_{n \rightarrow \infty}(\frac{1}{n}+\sum_{p \le n} \frac{a_p(n)}{n})$$

we obtain the following formula instead of the one in the title of the question:

$$1 = \sum_{n=1}^{\infty} \frac{1}{p_n}\prod_{q<p_n,q\text{ prime}}(1-\frac{1}{q})$$

which can also be written as:

$$1 = \frac{1}{2}+(1-\frac{1}{2})\cdot$$
$$(\frac{1}{3}+(1-\frac{1}{3})\cdot$$
$$(\frac{1}{5}+(1-\frac{1}{5})\cdot$$
$$(\frac{1}{7}+(1-\frac{1}{7})\cdot$$
$$(+\ldots+\cdot$$
$$(\frac{1}{p_n}+(1-\frac{1}{p_n})(\frac{1}{p_{n+1}}+(1-\frac{1}{p_{n+1}})(+\ldots)$$

Best Answer

The sum is less than $1$. First of all, Mathematica and SAGE independently tell me that $$\sum_{n=1}^{10000} \frac{\pi(p_n^2)-n+2}{p_n^3-p_n}=0.950344\dots.\tag{1}\label{1}$$ We estimate the tail sum by applying estimates from the classical paper Rosser–Schoenfeld: Approximate formulas for some functions of prime numbers, Illinois J. Math. 6 (1962), 64–94. By (3.2) in this paper, $$\frac{\pi(p_n^2)}{p_n^2}<\frac{1}{2\log p_n}+\frac{3}{8\log^2 p_n}.$$ Using also (3.10) in the paper, that is the bound $$p_n>n(\log n+\log\log n-3/2),$$ we can infer that $$\frac{\pi(p_n^2)}{p_n^2}<\frac{p_n-p_n^{-1}}{2n\log^2 n},\qquad n>10^4,$$ with an extra factor of $0.8272$ on the right-hand side for $n\leq 10^8$. Therefore, $$\sum_{n=10001}^\infty\frac{\pi(p_n^2)-n+2}{p_n^3-p_n} <\sum_{n=10001}^\infty\frac{\pi(p_n^2)}{p_n^3-p_n}<\sum_{n=10001}^{10^8}\frac{0.8272}{2n\log^2 n}+\sum_{n>10^8}\frac{1}{2n\log^2 n}.$$ It follows that $$\sum_{n=10001}^\infty\frac{\pi(p_n^2)-n+2}{p_n^3-p_n} <\int_{10^4}^{10^8}\frac{0.8272}{2x\log^2 x}dx+\int_{10^8}^\infty\frac{1}{2x\log^2 x}dx=0.049596\dots.\tag{2}\label{2}$$ From \eqref{1} and \eqref{2}, we see that the OP's sum is less than $0.999942$.