[Math] Calculating the number of pairs in a series of combinations

combinatorics

This is a problem related to combination math.

I have x questions I want to ask y people. Each individual will answer z questions.

Assume in this example that:

x = 5 questions
y = 1000 people
z = 3 questions per person.

The questions that each person will be asked will be chosen at random.

Using straightforward combination math I can determine from the above assumptions that there are 10 random combinations of 5 questions when each person is asked 3 questions. Using simple math each of the 10 combinations will be asked of 100 people (1000 people /10 combinations).

What I need to solve for is the number of times that a given pair of questions will be asked together. Doing this by hand I am able to determine that in the 10 combinations a given pair of questions will show up together three times. With 100 people answering each of the 10 combinations of questions, each pair will have been answered together 300 times.

This is fine for the given example where I am able to count the combinations by hand but how can I get the number of pairs when the number of questions (x) and/or number of questions per person changes (z)? I imagine there has to be a formula that will allow me to determine the number of pairs in a series of combinations of any size.

To make it even more complicated I need the formula to also be flexible enough to be able to not only find the number of pairs but to also find the number of triples & quads. E.g. how many times do the same three questions get asked together (in any combination) when I have 20 questions and 1000 people are asked 5 questions each?

Best Answer

Notation: For any non-negative integers $n$ and $r$, with $r\le n$, let $\dbinom{n}{r}$ be the number of ways to choose $r$ objects from $n$. This number is called a binomial coefficient. There are many other notations for the same thing, including $C(n,r)$, $C_r^n$, and others. It turns out that $\dbinom{n}{r}=\dfrac{n!}{r!(n-r)!}$.

Let's solve the simplest generalization of your computations. There are $x$ questions, and each person answers $3$ questions. Then there are $\dbinom{x}{3}$ ways to choose $3$ questions. In general, $\dbinom{x}{3}=\dfrac{x(x-1)(x-2)}{6}$.

We ask how many of these triples of questions contain the specific questions $A$ and $B$. We need to choose a question from the remaining $x-2$ to join $A$ and $B$. There are $\dbinom{x-2}{1}=x-2$ ways to do this. In particular, for $x=5$ we get your $3$.

Let's now generalize a little more, and suppose that $z$ questions are asked of each person. There are $\dbinom{x}{z}$ ways to choose the questions. How many of these $\dbinom{x}{z}$ ways contain the specific questions $A$ and $B$? We need to choose $z-2$ questions from the remaining $x-2$ to join $A$ and $B$. There are $\dbinom{x-2}{z-2}$ ways to do this.

Now let us count triples. So we ask how many of the $\dbinom{x}{z}$ choices contain the specific questions $A$, $B$, and $C$. We then need to choose $z-3$ questions from the remaining $x-3$ to join $A$, $B$, and $C$. There are $\dbinom{x-3}{z-3}$ ways to do this.

For specific quadruples $A$, $B$, $C$, $D$, the same reasoning shows that of the $\dbinom{x}{z}$ choices, there are $\dbinom{x-4}{z-4}$ ways to have a specific quadruple. (Naturally $z$ has to be $\ge 4$.)