Sums Modulo p – Distribution Analysis

Tags: additive-combinatorics, co.combinatorics, nt.number-theory, pr.probability

Fix a finite set of integers $S$ and a prime number $p$. Let $(a_1, a_2, \dotsc, a_n)$ and $(b_1, b_2, \dotsc, b_n)$ be two sequences of integers where the $a_i$ and $b_i$ are chosen independently and uniformly from the set $S$ (in the case I care about, $S = \{-2, -1, 1, 2\}$). I would like to understand the probability that $$\sum_{i_1 < i_2 < \dotsb < i_k} a_{i_1}a_{i_2}\dotsm a_{i_k} \equiv 0 \pmod p$$ and the probability that $$\sum_{i_1 \leq j_1 < i_2 \leq j_2 < \dotsb < i_k \leq j_k} a_{i_1}b_{j_1}\dotsm a_{i_k}b_{j_k} \equiv 0 \pmod p.$$

I am interested in the asymptotic probabilities as $n$ increases with $k$ fixed (or with $k$ growing much more slowly than $n$), and I expect some type of equidistribution as $n \to \infty$. One particular case is the question "Question about estimating random symmetric sums modulo p", where one particular sum is reduced to the linear case. Can this trick be applied in the general case?
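As a quick numerical sanity check on the expected equidistribution, one can estimate the vanishing probability of the first sum by Monte Carlo simulation. This is a minimal sketch (the parameter values are illustrative); it evaluates the elementary symmetric polynomial with the standard one-pass recurrence:

```python
import random

def esp_mod(vals, k, p):
    """Elementary symmetric polynomial e_k(vals) mod p, via the
    recurrence e_j <- e_j + a * e_{j-1}, one pass over the values."""
    e = [1] + [0] * k
    for a in vals:
        for j in range(k, 0, -1):  # descend so each value is used once
            e[j] = (e[j] + a * e[j - 1]) % p
    return e[k]

random.seed(0)
S, p, n, k, trials = [-2, -1, 1, 2], 7, 12, 3, 20000
hits = sum(esp_mod([random.choice(S) for _ in range(n)], k, p) == 0
           for _ in range(trials))
# if equidistribution holds, the empirical frequency should be close to 1/p
print(hits / trials, 1 / p)
```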

Any suggestion, trick, or vaguely related information is highly appreciated.

P.S. Can the second type of sum (the one involving $a_i$, $b_i$) somehow be related to a sum of the first type? The ordering makes things much trickier, and I am wondering whether there is a procedure for reducing this sum to the first type. (Also, the second sum, even though less natural, is the one that I really need to understand.)

Thanks!

Best Answer

Using the Newton identities, one can (in the high characteristic regime $p>k$) express the elementary symmetric polynomial $\sum_{i_1 < \dots < i_k} a_{i_1} \dots a_{i_k}$ in terms of the power sums $m_j := \sum_{i=1}^n a_i^j$ for $j=1,\dots,k$. The question then boils down to the equidistribution of the vector $(m_j)_{j=1}^k$ in ${\mathbf F}_p^k$. There is however an obstruction to equidistribution: the points $(a^j)_{j=1}^k$ for $a=-2,-1,1,2$ do not span the entirety of ${\mathbf F}_p^k$ once $k \geq 4$, due to the existence of non-trivial polynomials $P$ of degree $4$ or higher that vanish at $-2,-1,1,2$. For instance, because $P(x) = (x+2)(x+1)(x-1)(x-2) = x^4 - 5 x^2 + 4$ vanishes at these points, the vector $(m_j)_{j=1}^k$ is constrained to the hyperplane $m_4 - 5 m_2 + 4n = 0$. This is going to create some distortions to the probability that $\sum_{i_1 < \dots < i_k} a_{i_1} \dots a_{i_k}$ vanishes mod $p$, which can be explicitly calculated for each $p,k$, but the formula is going to be messy (these errors will be of lower order in the limit $p \to \infty$ with $k$ fixed, though, by the Lang–Weil estimates).
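To make the Newton-identity step concrete, here is a small sketch (the example multiset and prime are arbitrary) that recovers $e_1,\dots,e_k$ from the power sums $m_1,\dots,m_k$ mod $p$, assuming $p > k$ so that $1,\dots,k$ are invertible:

```python
def esp_from_power_sums(m, k, p):
    """Recover e_0, ..., e_k mod p from power sums m[1..k], using
    Newton's identity  r*e_r = sum_{j=1}^r (-1)^(j-1) * e_{r-j} * m[j].
    Requires p > k so that each r is invertible mod p."""
    e = [1] + [0] * k
    for r in range(1, k + 1):
        s = sum((-1) ** (j - 1) * e[r - j] * m[j] for j in range(1, r + 1))
        e[r] = (s * pow(r, p - 2, p)) % p  # divide by r via Fermat's little theorem
    return e

# example: the multiset {-2, -1, 1, 2, 2} and p = 11
p, a = 11, [-2, -1, 1, 2, 2]
m = [None] + [sum(x ** j for x in a) % p for j in range(1, 4)]
print(esp_from_power_sums(m, 3, p))  # e_0, e_1, e_2, e_3 mod 11
```

One can check the output directly: for this multiset $e_3 = -10 \equiv 1 \pmod{11}$.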

(To put it another way, one can use the Newton identities to express the elementary symmetric polynomial as some polynomial combination of the frequencies $d_{-2}, d_{-1}, d_1, d_2$ that count how often the random sequence $a_1,\dots,a_n$ attains each of its permitted values $-2, -1, 1, 2$. To count the probability that this polynomial vanishes mod $p$, one has to count the points on some variety over ${\bf F}_p$, which in general is a task for the Lang–Weil estimate.)
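The reduction to frequencies can be seen concretely: only the multiplicities matter, since the elementary symmetric polynomials are the coefficients of $\prod_v (1 + vx)^{d_v}$. A minimal sketch (the helper name is mine) extracting $e_k$ mod $p$ from the counts alone:

```python
def esp_from_counts(d, k, p):
    """e_k mod p of the multiset with multiplicities d = {value: count}:
    the coefficient of x^k in prod_v (1 + v*x)^{d[v]}."""
    coeffs = [1]  # coefficients of the running product, low degree first
    for v, mult in d.items():
        for _ in range(mult):
            nxt = [0] * (len(coeffs) + 1)
            for i, c in enumerate(coeffs):  # multiply by (1 + v*x)
                nxt[i] = (nxt[i] + c) % p
                nxt[i + 1] = (nxt[i + 1] + v * c) % p
            coeffs = nxt
    return coeffs[k] if k < len(coeffs) else 0

# same multiset as {-2, -1, 1, 2, 2}, presented as counts only
print(esp_from_counts({-2: 1, -1: 1, 1: 1, 2: 2}, 3, 11))
```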

Your second sum seems to be expressible in terms of a symmetric polynomial in a suitable matrix algebra. If one had $i_k < j_k$ in place of the constraint $i_k \leq j_k$, then one would just take the degree-$2k$ elementary symmetric polynomial of the matrices $\begin{pmatrix} 0 & b_i \\ a_i & 0 \end{pmatrix}$ and extract the bottom right entry. With $i_k \leq j_k$ the situation is more complicated, but I expect there is still some sort of matrix representation. However, being non-abelian, I doubt there is a reduction to the abelian elementary symmetric polynomial considered earlier, and given how complicated that formula already was, I'm afraid it is not going to be fun to try to control this sum (except possibly in the asymptotic regime where $p$ is somewhat large and $k$ goes to infinity; actually, the non-abelian case might be substantially more "mixing" than the abelian one, and one could conceivably get better asymptotics by using some of the theory of expansion in Cayley graphs, but this looks like a lot of work...).
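The matrix picture can be checked numerically in a small case. In this brute-force sketch all inequalities are taken strict; with the matrix as written, the entry of the ordered product carrying the pattern $a_{t_1} b_{t_2} a_{t_3} b_{t_4} \cdots$ works out to be the bottom-right one:

```python
import itertools
import random

def strict_ordered_sum(a, b, k):
    """Sum over t_1 < t_2 < ... < t_{2k} of a_{t_1} b_{t_2} ... a_{t_{2k-1}} b_{t_{2k}}
    (the second sum of the question with every inequality made strict)."""
    total = 0
    for ts in itertools.combinations(range(len(a)), 2 * k):
        prod = 1
        for r, t in enumerate(ts):
            prod *= a[t] if r % 2 == 0 else b[t]
        total += prod
    return total

def mmul(X, Y):
    """2x2 matrix product."""
    return [[sum(X[i][r] * Y[r][j] for r in range(2)) for j in range(2)]
            for i in range(2)]

def matrix_esp_entry(a, b, k):
    """Degree-2k elementary symmetric polynomial of the (non-commuting)
    matrices [[0, b_t], [a_t, 0]], taken as the sum of ordered products
    over increasing index sets; returns its bottom-right entry."""
    E = [[[1, 0], [0, 1]]] + [[[0, 0], [0, 0]] for _ in range(2 * k)]
    for t in range(len(a)):
        M = [[0, b[t]], [a[t], 0]]
        for j in range(2 * k, 0, -1):  # multiply earlier products on the right
            P = mmul(E[j - 1], M)
            E[j] = [[E[j][i][c] + P[i][c] for c in range(2)] for i in range(2)]
    return E[2 * k][1][1]

random.seed(1)
a = [random.choice([-2, -1, 1, 2]) for _ in range(8)]
b = [random.choice([-2, -1, 1, 2]) for _ in range(8)]
assert strict_ordered_sum(a, b, 2) == matrix_esp_entry(a, b, 2)
print("brute force and matrix ESP agree")
```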