Minimum number of $k$-partitions of a set of size $n$ to enumerate all $n \choose k$ combinations

combinatoricsdiscrete mathematicsset-partition

Given a set $\mathcal{S}$ of size $n$, let a $k$-partition of $\mathcal{S}$ be a grouping into $k$ disjoint classes $$(S_1 ,S_2,…,S_k)$$

Where the $S_i$ do not necessarily contain the same number of elements.

Let $C(S_1, S_2, … S_k)$ be the set of combinations which can be made by taking exactly one element of each class, (i.e. $C(S_1, S_2, … S_k)$ is the Cartesian Product $S_1 \times S_2 \times …\times S_k$). For a given $(n,k)$, what is the minimum number of k-partitions $N$ such that

$\bigcup\limits_{i=1}^{N} C(S_1^{(i)}, S_2^{(i)}, …,S_3^{(i)}) = $ All Possible Combinations

Example:

for $n=4$ and $k=2$, $\mathcal{S} = \{1,2,3,4\}$ we can choose:

$$ S^{(1)} = \left\{\left\{ 1, 2\right\}, \left\{ 3, 4\right\}\right\}$$
$$ S^{(2)} = \left\{\left\{ 1, 3\right\}, \left\{ 2, 4\right\}\right\}$$

Then $$C_{1} = \left\{\left\{ 1, 3\right\} ,\left\{ 1, 4\right\},\left\{ 2, 3\right\},\left\{ 2, 4\right\}\right\}$$
$$C_{2} = \left\{\left\{ 1, 2\right\} ,\left\{ 1, 4\right\},\left\{ 2, 3\right\},\left\{ 3, 4\right\}\right\}$$

and $C_1 \cup C_2 =$ All possible combinations and $N=2$.

Best Answer

When $k=2$ we have $N = \left\lceil \log_2 n\right\rceil$. We can interpret the problem as covering a complete graph $K_n$ with $n$ bipartite graphs, where the $i^{\text{th}}$ bipartite graph contains all the edges between $S_1^{(i)}$ and $S_2^{(i)}$. This was previously asked on MSE here. The lower bound $N \ge \log_2 n$ is the same as the general one I'll give below. The upper bound is obtained by letting $S_1^{(i)}$ be the set of all elements of $\{1,2,\dots,n\}$ whose binary representation has a $0$ in the $i^{\text{th}}$ position, and letting $S_2^{(i)}$ be the complement. (This takes $\mathcal S = \{1,2,\dots,n\}$.)

For general $k$, we have $N \ge \log_b \frac n{k-1}$ where $b = \frac{k}{k-1}$. To prove this, we induct on $n$. This is true for $n < k$, since at that point the lower bound we get is nonpositive.

No matter how we choose $(S_1^{(1)}, \dots, S_k^{(1)})$, we still have:

  1. Each of the sets $\mathcal S - S_1^{(1)}, \dots, \mathcal S - S_k^{(1)}$ has no $k$-element subsets found in $C(S_1^{(1)}, \dots, S_k^{(1)})$.
  2. One of these sets has size at least $\frac{k-1}{k} \cdot n = \frac n b$, because one of the sets $S_i^{(1)}$ has size at most $n/k$.

So passing to a set $\mathcal S' = \mathcal S - S_i^{(1)}$ with both properties, we see that we've obtained none of the $k$-element subsets of $\mathcal S'$ so far. By the inductive hypothesis, we still need $\log_b \frac{n/b}{k-1} = \log_b \frac{n}{k-1} - 1$ more $k$-partitions to see all the $k$-element subsets of $\mathcal S'$; together with the $k$-partition we already used, we need $\log_b \frac{n}{k-1}$ total, as desired.

This gives us the bound $N \ge \log_2 n$ when $k=2$, but I can't think of a matching construction when $k \ge 3$, so I don't know if the bound is tight in general.


For an upper bound, we can take a random construction. Choose a random $k$-tuple $(S_1, S_2, \dots, S_k)$ of subsets of $\mathcal S$ by adding each element of $\mathcal S$ to a uniformly random $S_i$. If we do this, then the probability that a given $k$-set is contained in $C(S_1, S_2, \dots, S_k)$ is $\frac{k!}{k^k}$.

If we now take $N$ random $k$-tuples of this form, the probability that a given $k$-set is never covered is $(1 - \frac{k!}{k^k})^N$, so the expected number of uncovered $k$-sets is $\binom nk (1 - \frac{k!}{k^k})^N$. All we need to do is choose $N$ so that this expected value is less than $1$, and we can conclude that with positive probability, the random construction covers all $k$-sets.

Because $1 - x \le e^{-x}$ for all $x$, we have $\binom nk (1 - \frac{k!}{k^k})^N \le \binom nk e^{-N k!/k^k}$, so we can choose $N = \frac{k^k}{k!} \ln \binom nk$. For fixed $k$, this gives a logarithmic upper bound (in $n$) to match the logarithmic lower bound - but the lower bound $\log_b \frac{n}{k-1}$ is roughly $k \ln n$, whereas this upper bound is roughly $k e^k \ln n$, so the constants are pretty far apart.

Related Question