Why is $ab$ likely to have more divisors than $(a-b)(a+b)$

divisibilityelementary-number-theorylimitsnumber theoryprime numbers

Consider the two numbers $ab$ and $(a-b)(a+b), \gcd(a,b) = 1, 1 \le b < a$. On an average, which of these two numbers has more distinct prime factors? All the prime factors of $a$ and $b$ divide $ab$ and similarly all the prime factors of $a-b$ and $a+b$ divide $(a-b)(a+b)$. So one number does not seem to have an obvious advantage over the other. However if we look at the data than we see that $ab$ dominates.

Let $f(x)$ be the average number of distinct prime factors in all such $ab, a \le x$ and
$g(x)$ be the average number of distinct prime factors in all such $(a-b)(a+b), a \le x$.

Update: Experimental data for the first $6.1 \times 10^{9}$ pairs of $(a,b)$ shows that $f(x) – g(x) \sim 0.30318$. Instead of distinct prime factors, if we count the number of divisors than $f(x) – g(x) \sim 0.848$.

Question: Why is $ab$ likely to have more distinct prime factors or divisors than $(a-b)(a+b)$ and what is the limiting value of $f(x) – g(x)$?

Best Answer

On one hand, \begin{align*} \sum_{a,b\le x} \omega(ab) &= \sum_{a,b\le x} \sum_{\substack{p\le x \\ p\mid ab}} 1 = \sum_{p\le x} \sum_{\substack{a,b\le x \\ p\mid ab}} 1 \\ &= \sum_{p\le x} \biggl( \sum_{\substack{a,b\le x \\ p\mid a}} 1 + \sum_{\substack{a,b\le x \\ p\mid b}} 1 - \sum_{\substack{a,b\le x \\ p\mid a,\, p\mid b}} 1 \biggr) \\ &= \sum_{p\le x} \biggl( \bigl( \tfrac xp+O(1) \bigr)(x+O(1)) + (x+O(1))\bigl( \tfrac xp+O(1) \bigr) - \bigl( \tfrac xp+O(1) \bigr)^2 \biggr) \\ &= 2x^2 \sum_{p\le x} \tfrac1p - x^2 \sum_{p\le x} \tfrac1{p^2} + O\biggl( x \sum_{p\le x} \bigl( 1+\tfrac1p \bigr) \bigg) = 2x^2 \sum_{p\le x} \tfrac1p - x^2 \sum_p \tfrac1{p^2} + o(x^2). \end{align*} On the other hand, $p\mid(a+b)$ and $p\mid(a-b)$ simultaneously if and only if either $p\mid a$ and $p\mid b$, or $p=2$ and $a$ and $b$ are both odd. Therefore \begin{align*} \sum_{a,b\le x} \omega\bigl( (a+b)(a-b) \bigr) &= \sum_{a,b\le x} \sum_{\substack{p\le x \\ p\mid (a+b)(a-b)}} 1 = \sum_{p\le x} \sum_{\substack{a,b\le x \\ p\mid (a+b)(a-b)}} 1 \\ &= \sum_{p\le x} \biggl( \sum_{\substack{a,b\le x \\ p\mid (a+b)}} 1 + \sum_{\substack{a,b\le x \\ p\mid (a-b)}} 1 - \sum_{\substack{a,b\le x \\ p\mid (a+b),\, p\mid (a-b)}} 1 \biggr) \\ &= \sum_{p\le x} \biggl( \sum_{a\le x} \sum_{\substack{b\le x \\ b\equiv-a\,(\mathop{\rm mod}\,p)}} 1 + \sum_{a\le x} \sum_{\substack{b\le x \\ b\equiv a\,(\mathop{\rm mod}\,p)}} 1 - \sum_{\substack{a,b\le x \\ p\mid a,\, p\mid b}} 1 \biggr) \\ &\qquad{}- \sum_{\substack{a,b\le x \\ a,b \text{ both odd}}} 1 \\ &= \sum_{p\le x} \biggl( (x+O(1))\bigl( \tfrac xp+O(1) \bigr) + (x+O(1))\bigl( \tfrac xp+O(1) \bigr) - \bigl( \tfrac xp+O(1) \bigr)^2 \biggr) \\ &\qquad{}- \bigl( \tfrac x2+O(1) \bigr)^2 \\ &= 2x^2 \sum_{p\le x} \tfrac1p - x^2 \sum_p \tfrac1{p^2} - \tfrac{x^2}4 + o(x^2). \end{align*} From these two asymptotic formulas it follows that the difference of the two sums is asymptotic to $\frac{x^2}4$, so that the average difference in the number of distinct prime factors is asymptotically $\frac14$.

Heuristically (in hindsight), the difference is entirely caused by the prime $2$: since $a$ and $b$ are even or odd independently, there is a $\frac34$ chance that $p=2$ will contribute to $\omega(ab)$; but since $a+b$ and $a-b$ are both even or both odd, there is only a $\frac12$ chance that $p=2$ will contribute to $\omega\bigl( (a+b)(a-b) \bigr)$.