Conjecture: All but 21 non-square integers are the sum of a square and a prime

asymptoticsgoldbachs-conjecturenumber theoryprime numberssums-of-squares

Update on 6/19/2020. This discussion led to deeper and deeper results on the topic. The last findings are described in my new post (including my two answers), here.

I came up with the following conjecture. All non-square integers $z$ can be represented as $z=x^2 + y$ where $x$ is an integer and $y$ is a prime. The exceptions are

z = 10, 34, 58, 85, 91, 130, 214, 226, 370, 526, 706, 730, 771, 1255, 1351, 1414, 1906, 2986, 3676, 9634, 21679.

Note that this is deeper than Goldbach conjecture (all even numbers are the sum of two primes) because squares are far rarer than primes. Also, few numbers are the sum of two squares, such numbers (sums of two squares) are far more abundant than primes, but their natural density is also zero. But all numbers are the sum of four squares. Surprisingly, all integers can be represented as $z = \lfloor x^c \rfloor +
\lfloor y^c \rfloor$
where $x, y$ are positive integers and $c < \log_{22} 63$ is a positive constant; but this fails at $c = \log_{22} 63$ as $z=73$ becomes an exception. See section 1 in this article for details; this is also a conjecture.

Question: Can you verify if my conjecture is true up to some very large $z$? I tested it only for $0\leq z < 750000$.

Heuristics behind this conjecture

This is by no mean a proof, but rather, I explain here why I think it could be true. Let denote as $r(z)$ the number of solutions to $x^2 +y \leq z$ where $x, y$ are integers and $y$ is prime. For a fixed large $z$, we want to count the number of integer couples $(x, w)$ below the curve $z=x^2+ w\log w$, with $x, w\geq 0$, in order to approximate $r(z)$. The choice of $w \log w$ is a direct consequence of the prime number theorem, replacing primes by their approximation, for large primes. That count $r(z)$ grows faster than $O(z)$. The derivative $dr(z)/dz$ thus grows faster than $O(1)$, and it shows how the number of solutions to $z=x^2+y$ grows on average, faster than $O(1)$ as $z$ increases.

More details about the heuristic approach

Essentially, we are trying to count the number of blue points under the red curve in the plot below (in this example, $z=100$). The equation for the curve is $w \log w = z-x^2$, and $z$ is assumed to be fixed.

enter image description here

The equation can be re-written as $w = (z-x^2)/W(z-x^2)$ where $W$ is the Lambert function, which behaves asymptotically like the $\log$ function. Thus the number of points below the red curve is asymptotically (for large values of $z$) equal to

$$r(z) \sim \int_0^\sqrt{z} \frac{z-x^2}{W(z-x^2)}dz \sim
\int_0^\sqrt{z} \frac{z-x^2}{\log(z-x^2)}dz = \frac{1}{2}\int_0^z \frac{u}{\sqrt{z-u}\cdot\log u}du.$$

Let us denote as $\phi(z)$ the function defined by the rightmost integral. We have $r(z) \sim \phi(z)$. I computed the exact values of $r(z)$ and $\phi(z)$ for various small and large $z$, and clearly, $r(z) \rightarrow C \cdot \phi(z)$, but I am not sure if $C=1$. See WolframAlpha computations here.

The number of solutions to $z=x^2+y$ (with $y$ prime) is thus, on average, as $z$ gets larger and larger, asymptotically equivalent to $d\phi(z) / dz$. Below is a table featuring $r(z)$ and $\phi(z)$.

enter image description here

Good asymptotic approximations for very large $z$ are

$$\phi(z)\approx\frac{2}{3}\cdot \frac{z^{3/2}}{\log z} \mbox{ and }
\frac{d\phi(z)}{dz}\approx \frac{\sqrt{z}}{\log z}.$$

The last result is compatible with the one posted in the answer by Dietrich Burde, confirming that the approach I used here is sound. Note that the same methodology could be applied to sums of squares or sums of primes or any sums of integers. It is pretty generic.

Final comment

The number of solutions to $z = x^2 + y$ (with $y$ prime, $x$ an integer) is equal to $r(z)-r(z-1)$. In all cases, $r(z)$ grows slowly (polynomial at most) and thus $r(z)-r(z-1) \sim dr(z)/dz$. We could get deeper results with second- and third-order approximations in all the asymptotic results used in this article, rather than just first-order approximations.

Below is a chart featuring the distribution for the number of solutions to $z=x^2+y$ [that is, the distribution of $r(z)-r(z-1)$] for $700000\leq z < 740000$.
enter image description here

For instance, there are $441$ different $z$'s between $z = 700000$ and $z = 740000$ for which $z=x^2 + y$ has exactly $50$ solutions. Below is the same chart, but for $100000\leq z < 140000$. The two distributions are strikingly similar in shap2.
enter image description here

Finally, among the first 750,000 $z$'s, we have:

  • $z = 78754$ is the last one to admit only one decomposition as $z =
    x^2+y$
  • $z = 101794$ is the last one to admit exactly two decompositions
  • $z = 339634$ is the last one to admit exactly three decompositions
  • $z = 438166$ is the last one to admit exactly four decompositions
  • $z = 383839$ is the last one to admit exactly five decompositions

The $z$'s that admit only one decomposition are listed below. I searched for this sequence to see if it had been discovered, but could not find any reference.

z = 2, 5, 8, 13, 15, 22, 24, 26, 31, 37, 40, 46, 50, 55, 61, 70, 74, 76, 82, 94, 99, 106, 115, 120, 127, 133, 136, 142, 145, 154, 159, 166, 170, 178, 184, 202, 205, 219, 221, 235, 246, 250, 253, 265, 268, 274, 295, 298, 301, 310, 316, 319, 325, 328, 334, 340, 346, 379, 391, 394, 399, 412, 424, 436, 439, 442, 445, 469, 490, 505, 511, 559, 562, 571, 574, 586, 589, 610, 616, 646, 694, 781, 793, 799, 829, 834, 835, 874, 914, 922, 946, 949, 970, 979, 991, 994, 1030, 1045, 1066, 1090, 1105, 1164, 1204, 1219, 1243, 1324, 1354, 1366, 1384, 1411, 1450, 1501, 1549, 1555, 1642, 1717, 1726, 1765, 1786, 1810, 1885, 1981, 1990, 2041, 2059, 2074, 2146, 2167, 2245, 2266, 2284, 2344, 2410, 2416, 2479, 2650, 2806, 2821, 2854, 2899, 2926, 3004, 3094, 3151, 3166, 3184, 3319, 3418, 3502, 3811, 3859, 3865, 3964, 3991, 4216, 4222, 4279, 4330, 4414, 4504, 4510, 4645, 4654, 4711, 4930, 5482, 5506, 5545, 5986, 6031, 6049, 6274, 6439, 7009, 7081, 7441, 7549, 7954, 8086, 8584, 8824, 9214, 9571, 10165, 10774, 11509, 11806, 13834, 15106, 15334, 15565, 16081, 16186, 23851, 31879, 33205, 44536, 78754

Best Answer

This is Hardy and Littlewood's Conjecture $H$. It says that this sequence $a(n)= 10,34,58,85,\ldots$ is finite and that the number of representations of $n$ as the sum of a prime and a square is asymptotically $$ \frac{\sqrt{n}}{\log (n)} \cdot \prod_{p > 2}\left( 1 - \frac{(n / p)}{p - 1}\right)$$

where $(n / p)$ is the Legendre symbol.

References: Mikawa, Narkiewicz, Wang

The conjecture is tested up to $10^{11}$ so far, i.e., it is known that $a(22) > 10^{11}$, if it exists.

Related Question