Number of Combinations of Items from Sets with Dependencies

algorithmscombinationscombinatoricspermutations

Given a collection of $n$ sets of elements, and choosing exactly 1 element from each set, where $S$ is the size (number of elements) of a set, then the total number $w$ of possible combinations of elements across all sets is simply the product of the sizes of each set:

$$w = \prod_{i=1}^n S_i$$

Simple enough. Now, suppose the sets themselves (not the elements within) are arranged in a specific order, so that we can number them Set 1, Set 2, Set 3, etc. Again choosing exactly 1 element from each set, always choosing in order from Set 1 first, then from Set 2, then Set 3, etc., we still have $w$ possible combinations of elements across all sets. The fact that we always choose from sets in the same order is irrelevant.

Now, suppose we introduce dependencies, called links, between the sets as follows. A whole set can be linked to a specific element in an earlier set, such that the later set only exists (is only chosen from) when the earlier set's linked element was chosen from the earlier set. Example: if there are 3 sets, and if Set 2 is linked to Set 1's second element, then we start by choosing an element from Set 1. If we chose the second element, we choose an element from Set 2, then Set 3, as before. But if we did not choose the second element from Set 1, then we skip Set 2 and move on to choosing an element from Set 3. Combinations of elements that involve skipping over sets that are rendered void by their links are still considered valid combinations; they simply do not contain elements from the skipped sets. It is permissible for a set to link to multiple items and/or across multiple earlier sets; at least 1 of a set's links must be satisfied for the set to exist. It is also permissible for multiple later sets to link to the same earlier value.

Example:

Set 1 = {A,B,C,D}; Set 2 = {A,B,C}; Set 3 = {A,B}
         ^   ^ ^     |                |
         |   | |     |                |
          ---|-|-----                 |
              -+----------------------

S2 linked to S1:A. S3 linked to S1:C and S1:D. The possible combinations of elements across these sets is (with dashes indicating skipped sets):

A,A,-
A,B,-
A,C,-
B,-,-
C,-,A
C,-,B
D,-,A
D,-,B

So there are $8$ possible combinations in this example. I know this from listing them by hand above and counting. My question is: for the general case, how can I calculate the total number of possibilities? I am looking to write an algorithm that takes the sizes of the sets and the the links between them, and calculates the number of possibilities. I want to do it without brute-force enumeration of each combination, and my preference would be to avoid recursion and computational inefficiency. My common use case is large sets (thousands or millions of elements) but only a handful of links. It seems to me that there must be a way to calculate the total number without enumerating them. Any help or tips would be appreciated.

Background info for those interested:

I am designing an algorithm where I am trying to squeeze as much information as I can into as few bits as possible, on a very small scale of a few dozen bits. It works by grouping all data fields (the 'sets' in this question) into a single numeric representation which is the index in the mathematical list of all possible combinations of each field's value (the 'elements' in this question) thereby eliminating any wasted entropy in the binary encoding of each field. However, sometimes data fields are only relevant when other data fields are set to a certain value (the 'links' in this question). I have improved my algorithm to handle these cases and further eliminate wasted entropy, but it hinges on being able to calculate the total number of combinations. This is where I got stuck.

Best Answer

My common use case is large sets (thousands or millions of elements) but only a handful of links.

For me a clustered recursion seems to be a good algorithm. The idea is simple but it is a bit hard for me to explain it clearly. I’ll hope you’ll understand it from the below. An input for a clustered recursion is the following. Assume that we are given sets $S_1,\dots, S_n$. Then for each $i$ and each subset $I$ of $\{i+1,\dots, n\}$ we count a number $(i,I)$ of elements of $S_i$ which linked to exactly members of $I$. Also assume that we are given a subset $Z$ of $\{1,\dots, n\}$ of indices $i$ of sets $S_i$ which do not need links to be listed. Clearly, suffices to consider reduced inputs $(i,I)$ with $I\cap Z\varnothing$, with $(i,I):=\sum \{(i,I’):I\subset I’\subset I\cup Z\}$. For instance, in your example we have $n=3$, $Z=\{1\}$, $(1,\{2\})=1$, $(1,\{3\})=2$, $(1,\varnothing)=1$, $(2,\varnothing)=3$, $(3,\varnothing)=2$, and all other $(i,I)$ are zero. Then using the same arguments as in the question, we can count the total number $N$ of possibilities as follows. For each subset $I$ of $\{2,\dots, n\}$ we recursively calculate the total number $N_I$ of possibilities in which the first element is linked with exactly members of $I$ and then count $N=\sum \{N_I: I\subset \{2,\dots, n\}\}$. For instance, in your example there are

  • $(1,\{2\})=1$ choice to choose an element linked exactly with the set $S_2$. In this case we add $2$ to and remove $1$ from the set $Z$ of indices of sets which do not need links to be listed. This gives us the same problem, but with the smaller number of sets. Since the elements of the set $S_2$ are not linked with other sets, in this case we have only $(2,\varnothing)=3$ choices. So in total we have $N_{\{2\}}=(1,\{2\})(2,\varnothing)=3$.

  • $(1,\{3\})=2$ choices to choose an element linked exactly with the set $S_3$. In each of these cases we add $3$ to and remove $1$ from the set $Z$. This gives us the same problem, but with the smaller number of sets. Since the elements of the set $S_3$ are not linked with other sets, in this case we have only $(3,\varnothing)=2$ choices. So in total we have $N_{\{3\}}=(1,\{3\})(3,\varnothing)=4$.

  • $(1, \varnothing)=1$ choice to choose an element not linked with any set $S_i$. In this case we remove $1$ from the set $Z$, but add to it nothing. So in total we have $N_\varnothing =(1,\varnothing)=1$.

Thus we have $N= N_\varnothing + N_{\{2\}}+ N_{\{3\}}=1+3+4=8$.