Combinatorics – How to Merge N Companies into One: Bell or Catalan Numbers?

combinatoricselementary-set-theoryset-partition

There's a famous interview question variously credited to Microsoft, Google and Yahoo:

Suppose you have given N companies, and we want to eventually merge
them into one big company. How many ways are there to merge them?

Assuming you can merge as many companies as you like in a single step, I thought this boils down to "find the number of partitions of a set with N elements", in which case the answer is the Bell number $B_{n}$. This can be computed with this handy recursion cribbed shamelessly from Wikipedia:

$B_{n+1}=\sum_{k=0}^{n}{{n \choose k}B_k}$

$1, 1, 2, 5, 15, 52, 203…$

And you have to substract one since you're already starting from one of the possible sets: $B_{2}=2$, but there's only one way to combine A and B into AB.

However, there are a lot of sources on the net which claim that the correct solution is the Catalan number:

$C_n = \frac{1}{n+1}{2n\choose n} = \frac{(2n)!}{(n+1)!\,n!} = \prod\limits_{k=2}^{n}\frac{n+k}{k} \qquad\mbox{ for }n\ge 0$

$1, 1, 2, 5, 14, 42, 132…$

Which is correct, and why? Or are they both correct depending on the assumptions you make about the somewhat vague problem statement?

Best Answer

My first interpretation is that a way of merging the $n$ companies corresponds to a sequence $\langle P_k:k=0,\dots m\rangle$ such that:

  1. $P_0=\{1,\dots,n\}$;
  2. $P_{k+1}$ is a partition of $P_k$ for $k=0,\dots,m-1$;
  3. $|P_{k+1}|<|P_k|$ for $k=0,\dots,m-1$; and
  4. $|P_m|=1$.

These sequences evidently correspond to rooted trees with $n$ labelled leaves. The immediate correspondence between the partition sequences and the trees allows the latter to have internal vertices with only one child, so that each $P_k$ corresponds to a level of the tree, and requires that the leaves all be on the same level, but we can clearly collapse the non-branching branches and instead count rooted trees with $n$ labelled leaves in which every internal vertex has at least two children.

Let $M(n)$ be the number of such partition sequences or trees. Clearly $M(1)=M(2)=1$, and $M(3)=\binom32+\binom33=4$.

Edit2: In fact the correct numbers are $1,1,4,26,236,2752$, OEIS A000311, the number of phylogenetic trees with $n$ vertices. $M(n)$ is the sum over all partitions $\sum_{i=1}^mn_ip_i=n$, where the $p_i$ are distinct positive integers, of the terms $$\frac{n!}{\prod_{i=1}^mp_i!^{n_i}n_i!}\cdot\prod_{i=1}^mM(p_i)^{n_i}=\frac{n!}{n_1!n_2!\dots n_m!}\prod_{i=1}^m\left(\frac{M(p_i)}{p_i!}\right)^{n_i}\;;$$ verifying this is a matter of fairly straightforward counting. OEIS gives the much nicer recurrence $$M(n+1)=(n+2)M(n)+2\sum_{k=2}^{n-1}\binom{n}kM(k)M(n-k+1)\;,$$ among several others.

Restricting to binary mergers yields the sequence $1,1,3,15,105$, which appears to be A001147, $(2n-3)!!$. The OEIS entry says that $a_n$ is the number of distinct products of $n=1$ variables with commutative, non-associative multiplication, and those would appear to correspond to the binary merger trees.

Related Question