Find a prime $p$ from only knowing $x^2\bmod p$ for suitably chosen (small and few) value of $x$

big-listcontest-mathnumber theory

This question is related to the CodeChef problem Guess The Prime of the now closed July 2019 competition. It seems to have caused some stir-up recently during and even shortly after the contest. However, I suppose that a treatment of the mathematical aspects of the problem is appreciated. Additional answers with different approaches or handling different aspects are welcome.


Player $A$ picks a prime number $p$ below some given bound $N$. Player $B$ has the task of finding $p$, for which purpose $B$ may only name an integer $x$ and $A$ will reply with the value $x^2\bmod p$. (Note that $a\bmod b$ is the remainder $\in\{0,\ldots, b-1\}$ not a coset modulo $b\Bbb Z$; so for example $17\bmod 3=17\bmod 5 = 2$ even though $2+3\Bbb Z\ne 2+5\Bbb Z$)

Q1: How many queries are needed for $B$ to succeed with guarantee?

Q2: How many queries are needed if we require $x<N$?

Q3: How large must we allow $x$ to be so that $B$ can succeed with a single query?

Q4: What are the smallest possible queries in order to succeed with two queries?

… and many more questions can be thought of in this context (the second question is the generalized essence of the original CodeChef problem). Even more variations are obtained by considering other (finite, of course) sets of primes instead of all primes below some threshold.

Best Answer

Q1. We have the primes $$p_1<p_2<\ldots <p_n $$ and want to pick $x$ such that $x^2\bmod p_i$ differes for all $i$. For odd $p$, there are $\frac{p+1}2$ quadratic remainders modulo $p$, and we always have $\frac{p_i+1}2>i-1$. Therefore, it is always possible to pick a quadratic remainder modulo $p_i$ that differs from any quadratice remainders modulo $p_j$, $j<i$, picked before. Therefore, it is possible to pick $a_1,\ldots, a_n$ such that $a_i^2\bmod p_i\ne a_j^2\bmod p_j$ for $i\ne j$. The Chinese Remainder Theorem guarantees that there exists $x$ with $0\le x< p_1p_2\cdots p_n-1$ such that $x\equiv a_i\pmod {p_i}$ for all $i$, and hence $x^2\bmod p_i$ determines $i$ uniquely, as desired. Unfortunately, $p_1p_2\cdots p_n$ is an undesirably large bound.

We may ask: How many numbers $x\in\{0,\ldots, p_1\cdots p_n-1\}$ can solve the problem? We can pick $x\bmod p_1$ arbitrarily in $p_1$ ways. This excludes at most two choices of $x\bmod p_2$, so here we still have $p_2-2$ choices. For $x\bmod p_3$, at most four choices are "forbidden", hence we still have $p_3-4$ choices. And so on. We conclude that there are at least $$p_1(p_2-2)(p_3-4)(p_4-6)\cdots (p_n-2n+2) $$ good choices for $x$. Incidentally, if the first primes are $2,3,5$, we fare better by picking $x\equiv 0\pmod {p_1}$ and then there are less forbidden values at each stage, leading to at least $$\tag1 (p_2-1)(p_3-3)(p_4-5)\cdots (p_n-2n+3)$$ good choices for $x$. While this is a substantial number, we can heuristically expect the smallest good $x$ to be in the order of magnitude $p_1\cdots p_n$ divided by $(1)$, i.e., we may expect $$ x\approx p_1\left(1+\frac1{p_2-1}\right)\left(1+\frac3{p_3-3}\right)\left(1+\frac5{p_4-5}\right)\cdots\left(1+\frac{2n-3}{p_n-2n+3}\right)$$ to be achievable (thereby obtaining a heuristic estimate as answer to Q3).. If the $p_i$ are the $25$ primes below $100$, this estimate is $\approx 1.8\cdot 10^9$, for the $168$ primes below $1000$, it is $\approx 2.6\cdot 10^{39}$, so from a practical point of view still unfeasibly large. These heuristic estimates make no claim about the actually smallest possible $x$ for these cases. In fact, by brute force, one finds that that $x=545408$ works for the 25 primes $<100$ and we need to increase $x$ to $62947$ to also work for $p=101$ (and in fact up to $p=137$).

However, in other cases, this estimate may be more promising, namely if we deal with only two or three primes (and not the smallest): Now we can expect some $x\approx p_1\left(1+\frac1{p_2-1}\right)\left(1+\frac3{p_3-3}\right)<p_1+5$ to work (which is of course $\ll p_1p_2$, not to mention $\ll p_1p_2p_3$). In fact, for $n=2$, it is certain that $x=p_1$ works (but is probably not the smallest). It is quite often the case that $x=\lceil\sqrt{p_{n-1}}\rceil$ is worth looking into. The proportion $\ge \left(1-\frac1{p_2}\right)\left(1-\frac1{p_3}\right)$ is typically close enough to $1$ to make it a promising strategy to simply try values for $x$ one by one starting from $\lceil\sqrt{p_{n-1}}\rceil$ upwards and check if the $x^2\bmod {p_i}$ are distinct, rather than following the Chinese Remainder Theorem.


Q2. The estimates made under Q1 make it implausible that requiring $x<N$ allows a solution with a single query. However, we shall see that two queries always suffice. To also deal with Q4 along the way, let us pick the first query $x_1$ such that $x_1^2<N$. Then all $p_i>x_1^2$ will lead to the same reply $r_1=x_1^2\bmod p_i=x_1^2$, and in that case we have to pick $x_2$ such that we can distinguish between all primes between $x_1^2$ and $N$.

So if A's first reply $r_1$ is $x_1^2$, we may pick $x_2=\lceil\sqrt{p_{n-1}}\rceil$ (anything smaller cannot tell $p_{n-1}$ from $p_n$) and will either have $r_2=x_2^2$, in which case $p_n$ must be the answer, or $r_2<x_2^2$ and we know that $p$ divides $x_2^2-r_2$ and is $>r_2$ (and $>x_1^2$). A simple strategy to guarantee that only one such $p$ exists for each valid $r_2$, is to ensure that $x_2^2-r_2$ cannot have two prime divisors $>x_1^2$ (and so our guess will simply be the largest prime divisor of $x_2^2-r_2$). For this, it certainly suffices to pick $$\tag2x_1>\sqrt[4]{p_{n-1}}.$$

But what if $r_2< x_1^2$? Then $p$ must be a prime divisor of $x_1^2-r_2$ and must be $>r_2$. Clearly, the product of theses candidate primes is $<x_1^2$ and so even the conservative CRT estimate tells us that there exists a "good" $x_2$ below $x_1^2$, so if we take $(2)$ into account, $x_2<\sqrt {p_{n-1}}<\sqrt N$. From a practical perspective, we can make the search for $x_2$ simpler by making $x_1$ divisible by many small primes without making it significantly larger. For manageable sizes of $N$, this may help us guarantee that $x_1^2-r_1$ has only very few and not too small prime factors (except when $r_1=0$), thereby making the search for $x_2$ simpler.

For example, with $N=10^9$, we could pick $x_1=210=2\cdot 3\cdot 5\cdot 7$ (the first primorial $>\sqrt[4]{10^9}$). The case $r_1=0$ is handled by $x_2=3$, as $9\bmod2=1$, $9\bmod3=0$, $9\bmod 5=4$, $9\bmod 7=2$ are distinct. The case $r_1=44100$ is handled by $x_2=\lceil\sqrt{10^9}\rceil = 31623$. All other $r_1$ lead to at most three prime factors as already $11\cdot 13\cdot 17\cdot 19>44100$ and are therefore readily dealt with. In fact, for $r_1=1$, we can take $x_2=5$ to tell $11$, $19$, and $247$ apart (please check the factorization of $x_1^2-r_1$ for this case and the cases below); for $r_1=4$, we can pick $x_2=4$ to tell $13$ and $53$ apart; for $r_1=2$ we can take $x_2=5$ to tell $17$ and $1297$ apart; for $r_1=9$ we can take $x_2=5$ to tell $23$ and $71$ apart; for $r_1=20$, it is already clear that $29$ is the solution; for all other remainders $r_1$, there are at most two candidate primes (as $31\cdot 37\cdot 41>44100$) and hence querying the smallest prime factor of $x_1^2-r_1$ exceeding $r_1$ will work.

Related Question