Set of combinations that contains all unique pairs

combinatorial-designscombinatorics

Say I have a class of M students. Every week I choose N students from the class to do some group assignment. The rest of the class does not participate. I have only one criterion: every student needs to have been in a group with every other student at least once. The number of times each student has been chosen to participate in an assignment is not important. I want to find a sequence of groupings of students that satisfies this constraint in as few weeks as possible.

An example:

Class: {A, B, C, D}

Week 1: {A, B, C}
Week 2: {B, C, D}
Week 3: {A, B, D}

This satisfies the constraint; every student is paired with every other student in at least one of the weeks.

I guess this can be written as a more general question. Combinatorics is not really my day job, so this is my best attempt at a formal description of the question:

Given a set $A$ of size $M$, find the set of sets $B=\{C_1, C_2, \ldots, C_Q\}$ with $|C_m|=N<M$ for all $m \leq Q$ such that every possible subset $D \subset A$ with $|D|=P<N$ is contained in at least one of the sets $C_m$, minimising $Q$. I'm really only interested in the case $P=2$ (so: pairs) if that helps.

EDIT: This question is different from Efficiently partition a set into all possible unique pair combinations because the size of the groups $N$ is not equal to the size of the pairs $P$. Furthermore, overlap/repetition between the sets $C_m \in B$ is allowed and the number of sets $Q$ is not known beforehand (needs to be minimised).

Best Answer

There are $\binom{m}2$ pairs of students, and each project takes care of $\binom{n}2$ pairs. Therefore, at least $$\binom{m}2\Big/\binom{n}2=\frac{m(m-1)}{n(n-1)}$$ days are necessary.

If this ratio is an integer, then a schedule which attains is called a Steiner system $S(2,n,m)$. We know that $S(2,q,q^2+q)$ and $S(2,q+1,q^2+q+1)$ exist for any prime power $q$. Furthermore, $S(2,3,6k+1)$ and $S(2,3,6k+3)$ exist for all $k\ge 1$. See here for more details on what is known about the existence of Steiner systems.

We can get an upper bound on the length of an optimal schedule using the probabilistic method. Suppose you randomly choose a $d$ day schedule. The probability a particular pair of people is not chosen is $\Big(1-\binom{m-2}{n}/\binom{m}n\Big)^d=\Big(1-\frac{(m-n)(m-n-1)}{m(m-1)}\Big)^d$. Therefore, the probability that at least one pair of students is not chosen is at most $$ \binom{m}2\Big(1-\frac{(m-n)(m-n-1)}{m(m-1)}\Big)^d $$ If you choose $d$ large enough so that quantity is less than $1$, then it means there is a positive probability that a random $d$ day schedule works, so that in particular it can be done in $d$ days. Being very rough, if we let $x=\frac{m(m-1)}{(m-n)(m-n-1)}$, then this is $\binom{m}2 (1-\frac1{x})^{x\cdot d/x}\approx \binom{m}2e^{-d/x}$, so this is satisfied when $$d\gtrsim x\log\binom{m}2\approx \frac{m(m-1)}{(m-n)(m-n-1)}\times 2\log m.$$