[Math] Number of vectors so that no two subset sums are equal

additive-combinatoricsco.combinatoricssumsets

Consider all $10$-tuple vectors each element of which is either $1$ or $0$. It is very easy to select a set $v_1,\dots,v_{10}= S$ of $10$ such vectors so that no two distinct subsets of vectors $S_1 \subset S$ and $S_2 \subset S$ have the same sum. Here $\sum_{v \in S_i} v$ assumes simple element-wise vector addition where element addition takes place over $\mathbb{R}$. For example, if we take the vectors that are the columns of the identity matrix as $S$ this will do.

What is the maximum number of vectors one can choose that has this
property?

I previously asked this question on MSE . An explicit construction of $17$ vectors was given by Oleg567 using computer search and an upper bound of $45$ was given by jpvee simply using the observation that $\sum_{k=1}^{17} {46 \choose k} > (17+1)^{10}$ implies that $46$ vectors is impossible.


Lower bound improved to $18$ by Oleg567. Upper bound still stuck at $45$ although it seems implausible the true value is far from the current lower bound.


Upper bound of $36$ given by Seva.


Conjecture Feb 24, 2014. I conjecture the optimal solution size is $\lfloor \frac{1}{2} (n+1) \log_2(n+1) \rfloor$. For $n=2\dots 15$ this is $2, 4, 5, 7, 9, 12, 14, 16, 19, 21, 24, 26, 29, 32$.


New lower bound of $19$ by Brendan McKay.


New upper bound of $30$ by Brendan McKay.

Best Answer

Inspired by Seva's probabilistic method, I will show how to improve the upper bound to 30. Imagine we have a 0-1 matrix $A$ of 31 rows and 10 columns. I will show that there are two different subsets of the rows that have the same sum.

Define the 10-dimensional random variable $X=(X_1,\ldots,X_{10})$ whose value is the sum of a random subset of the rows. This is the same random variable defined by Seva. Seva proceeded by showing that each $X_j$ is concentrated on a few values. I will improve the bound by considering the components in pairs.

Write $X$ as $(Y_{12},Y_{34},Y_{56},Y_{78},Y_{9,10})$, where $Y_{12}=(X_1,X_2)$, $Y_{34}=(X_3,X_4)$, and so forth. The distribution of $Y_{12}$ depends only on the parameters $w_{01},w_{10},w_{11}$, which are respectively the number of times that 01, 10, 11 occur in the first two columns of $A$. (And so 00 occurs $31-w_{01}-w_{10}-w_{11}$ times.) The probability generating function for $Y_{12}$ is $$ F_{12}(x_1,x_2) = 2^{-w_{01}-w_{10}-w_{11}}(1+x_1)^{w_{10}} (1+x_2)^{w_{01}} (1+x_1x_2)^{w_{11}}. $$ (The coefficient of $x_1^ax_2^b$ is the probability that $Y_{12}=(a,b)$.) By trying all possible $w_{01},w_{10},w_{11}$, we find that in each case there is some set $K_{12}$ of 55 values such that $$\textrm{Prob}( Y_{12}\notin K_{12}) \le p = \frac{300387}{2097152} \approx 0.1432.$$ This calculation is easy for a computer: just expand $F_{12}$ and sum the largest 55 coefficients. One worst case is $w_{01}=w_{10}=10, w_{11}=11$.

By symmetry, there are also sets $K_{34},\ldots,K_{9,10}$ of size 55 containing at least the fraction $1-p$ of $Y_{34},\ldots,Y_{9,10}$, respectively. By the union bound, at least the fraction $1-5p$ of $Y$ lies in $$K = K_{12}\times \cdots\times K_{9,10}.$$ However, $(1-5p)2^{31} > |K| = 55^5$, Therefore, there are two values the same.

It should be possible to do better by grouping the columns even more. I think a non-trivial but plausible computation could handle the 10 columns in two groups of 5.

Related Question