Why are there more prime factors of the form $4k-1$ than of the form $4k+1$?

analytic-number-theory, elementary-number-theory, limits, number-theory, prime-numbers

Primes of the form $4k-1$ and primes of the form $4k+1$ have equal density, although there is a slight discrepancy in the exact counts, known as Chebyshev's bias. Every odd prime factor of a number is of the form $4k-1$ or $4k+1$. If we take all natural numbers $n \le x$ and look at the prime factors of each of these numbers, the density of prime factors of the form $4k-1$ is significantly higher than that of prime factors of the form $4k+1$, i.e. the bias is much stronger than Chebyshev's bias. Experimental data suggests that the difference between their densities is a constant.

Let $a(n)$ be the number of distinct prime factors of $n$ of the form $4k-1$, and let $b(n)$ be the number of distinct prime factors of $n$ of the form $4k+1$.
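To make the definitions concrete, here is a minimal Python sketch that computes the pair $(a(n), b(n))$ by trial division. The function name `count_a_b` is only illustrative, and this is not necessarily how the data below was produced; note that the prime $2$ is counted in neither $a(n)$ nor $b(n)$.

```python
def count_a_b(n):
    """Return (a, b): the number of distinct prime factors of n that are
    congruent to 3 mod 4 (form 4k-1) and to 1 mod 4 (form 4k+1).
    The prime 2 is counted in neither."""
    a = b = 0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # Strip every copy of p so that it is counted only once.
            while n % p == 0:
                n //= p
            if p % 4 == 3:
                a += 1
            elif p % 4 == 1:
                b += 1
        p += 1
    # Whatever is left (if > 1) is a single remaining prime factor.
    if n > 1:
        if n % 4 == 3:
            a += 1
        elif n % 4 == 1:
            b += 1
    return a, b
```

For example, $1155 = 3 \cdot 5 \cdot 7 \cdot 11$ has three prime factors of the form $4k-1$ (namely $3, 7, 11$) and one of the form $4k+1$ (namely $5$), so `count_a_b(1155)` gives `(3, 1)`.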

Data at every checkpoint from $x = 10^6$ to $x = 10^9$ shows a consistent trend that

$$
\frac{1}{x}\sum_{n \le x} [a(n) - b(n)] \approx 0.83498
$$

At a more granular level, if we consider only odd numbers the above constant is $\approx 0.33498$, and if we consider only even numbers it is $\approx 1.33498$.

Question: What is the source of this bias and is there a closed form of

$$
\lim_{x \to \infty}\frac{1}{x} \sum_{n \le x} [a(n) - b(n)]
$$

Best Answer

This is not a complete and rigorous answer, but I think it is a good start to an explanation.

Up to $x$, you can squeeze in more numbers with $3$ as a factor than you can numbers with $5$ as a factor. Just from $3$ and $5$, you have a difference of something like $\frac{1}{3}-\frac{1}{5}=0.1\overline{3}$.

Trying to measure this, I believe what you have is:

$$f(x)=\left\lfloor\frac{x}{3}\right\rfloor+\left\lfloor\frac{x}{7}\right\rfloor+\left\lfloor\frac{x}{11}\right\rfloor+\cdots$$ $$g(x)=\left\lfloor\frac{x}{5}\right\rfloor+\left\lfloor\frac{x}{13}\right\rfloor+\left\lfloor\frac{x}{17}\right\rfloor+\cdots$$

So $\frac{f(x)-g(x)}{x}$ is something like:

$$\frac13-\frac15+\frac17+\frac1{11}-\frac1{13}-\frac1{17}+\frac1{19}+\cdots$$

This is converging to something that is suspiciously close to your result ($0.3349\ldots$) when only examining odd numbers less than $x$.
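A quick way to check that convergence numerically is to add up the signed reciprocals over all primes below some bound. The following is only a sketch: the helper names `primes_up_to` and `signed_reciprocal_sum` are mine, and a partial sum of course proves nothing about the limit.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return the list of all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = 0
        p += 1
    return [n for n in range(2, limit + 1) if is_prime[n]]


def signed_reciprocal_sum(limit):
    """Partial sum of the series above: +1/p for primes p = 3 (mod 4),
    -1/p for primes p = 1 (mod 4), over all primes p <= limit.
    The prime 2 contributes nothing."""
    total = 0.0
    for p in primes_up_to(limit):
        if p % 4 == 3:
            total += 1.0 / p
        elif p % 4 == 1:
            total -= 1.0 / p
    return total
```

Calling `signed_reciprocal_sum(20)` reproduces the seven terms written out above, and raising the bound (say to `10**6`) lets you watch the partial sums settle near the $0.3349\ldots$ you observed.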

Note that your result for even numbers ($1.3349\ldots$) is this number plus $1$, and the first of your three results ($0.8349\ldots$) is the average of the other two. Of course, the overall figure being the average of the odd and even figures makes sense, but why is the even count $1$ more than the odd count?

Are factors of $2$ being properly omitted from your counts for $f(x)$ and $g(x)$? I notice that $\frac12+\frac14+\frac18+\cdots=1$, and if factors of $2$ (with repetition) were included in the count for $f(x)$, it would account for that additional $1$. But I am just speculating.
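One way to probe this is to recompute the three averages with a small sieve, once with $2$ ignored and once with $2$ deliberately counted. The sketch below is hedged accordingly: `average_bias` and the flag `count_two_in_a` are hypothetical names of mine, and the flag models one specific possible slip (treating $2$ as a single distinct prime counted in $a(n)$), which is a variant of the with-repetition guess above rather than anything confirmed about your code.

```python
def average_bias(x, count_two_in_a=False):
    """Return (all, odd, even): the average of a(n) - b(n) over all n <= x,
    over odd n <= x, and over even n <= x.

    If count_two_in_a is True, the prime 2 is (incorrectly) counted once
    in a(n), to model a possible counting slip."""
    diff = [0] * (x + 1)          # diff[n] accumulates a(n) - b(n)
    is_composite = bytearray(x + 1)
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, x + 1, p):
            is_composite[m] = 1
        if p == 2:
            step = 1 if count_two_in_a else 0
        elif p % 4 == 3:
            step = 1
        else:
            step = -1
        if step:
            # Each multiple of p has p as a distinct prime factor exactly once.
            for m in range(p, x + 1, p):
                diff[m] += step
    total_odd = sum(diff[1::2])
    total_even = sum(diff[2::2])
    n_odd = (x + 1) // 2
    n_even = x // 2
    return ((total_odd + total_even) / x, total_odd / n_odd, total_even / n_even)
```

With the flag on, every even number gains exactly $1$, so the even average rises by $1$, the odd average is unchanged, and the overall average rises by about $\frac12$. Comparing `average_bias(10**6)` with `average_bias(10**6, count_two_in_a=True)` against your three figures should show which convention your counts follow.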