Algorithm to find the largest intersection of sets

algorithmscombinatoricsdiscrete optimizationelementary-set-theorygraph theory

Edit:

I've tried to be more precise, clear up my examples, and to clarify the problem. Hopefully the problem makes more sense now.


The problem is this:

I have a list of sets $$S_1, S_2,… S_N$$ where each set contains $m$ elements, all drawn from some larger set $A$. The challenge is to find the largest set $$K\subseteq A$$ such that for every subset $k_i\in K$ of size $|k_i|=m$, there exists a matching $S_j == k_i$.

Put another way: we want to find the largest $K$ for which every subset of size $m$ which can be made from the elements of $K$ matches one of the original sets $S_i$. To make this clearer, here are two examples:

Example 1

If we have a small alphabet $$A = (a, b, c, d)$$
with $m=2$ and sets
$$
\begin{aligned}
S_1 &= (a, b)\\
S_2 &= (a, c)\\
S_3 &= (b, c)\\
S_4 &= (c, d)
\end{aligned}
$$

then $K = (a, b, c)$. All combinations of $K$ which are size $m=2$ appear in our list of sets $S_i$ (specifically $S_1, S_2, S_3$ contain every combination, length 2, of the elements of $K$).

Note that $d$ only appears in $S_4$ with $c$. We cannot include it in $K$ because then there would be pairs of elements of $K$ which do not match any $S$, such as $(a, d)$.

Example 2


If $A$ is a small alphabet
$$ A = (a, b, c, d, e, f)$$
and we have that $m$ = 3 and sets
$$
\begin{aligned}
S_1 &= (a, b, c)\\
S_2 &= (b, c, d)\\
S_3 &= (c, d, e)\\
S_4 &= (a, c, f) \\
S_5 &= (b, d, f) \\
S_6 &= (b, c, e)\\
S_7 &= (c, e, f)\\
S_8 &= (b, c, f)\\
S_9 &=(c, d, f)
\end{aligned}
$$

So, in this example, the largest set would be $K = (b, c, d, f)$, here every triple which we can form from $K$ has a corresponding set (these being $S_2, S_5, S_8, S_9)$.

Task


I'm trying to find an algorithm which can solve this task with minimal scaling in both $|A|$ and $m$. The closest related problem is probably this, though mine is meaningfully different.

My current solution, which I'm sure is terrible, does this:

for i = m to |A|
   outer_sets = make combinatoric sets of size i using A
      for outer_set in outer_sets:
          inner_set = make combinatoric sets of size i using outer_set
             for inner_set in inner_sets:
                 check if there exists an S_i == inner_set
             if all checks positive:
                max = i
                K = outer_set

As you can see, very inelegant and with terrible scaling.

Any help much appreciated!

Best Answer

Your example 1 corresponds to maximum clique in a graph with node set $A$ and a link for each $S_j$. Maximum clique is the same as maximum independent set in the graph complement.

Your example 2 corresponds to maximum hyperclique in a 3-uniform hypergraph with node set $A$ and a hyperlink for each $S_j$.

Related Question