Combinatorics – Partitioning a Multiset into Multisets of Fixed Sizes

combinatoricsmultisetsset-partition

Say we have a multiset $S(\mathbf{d}$) where $\mathbf{d}$ is a list of $l$ numbers and the multiplicity of the $i$th element of $S$ is $d_i$. The cardinality $N$ of $S$ is $\sum d_i$.

We want to partition $S$ into $m$ multisets of size $k_i$ respectively, so that $\sum k_i = \sum d_i = N$. How many ways can we do this?

In my mind this is a generalization of the multinomial coefficient $\binom{n}{k_1,k_2,\ldots,k_m}$ representing the number of ways to partition a set of $n=\sum k_i$ objects into $m$ bins of sizes $k_i$, to a sort of number like $\binom{\mathbf{d}}{k_1,k_2,\ldots,k_m}$ or $\binom{\mathbf{d}}{\mathbf{k}}$ representing the number of ways to partition a multiset of $n=\sum k_i = \sum d_i$ into $m$ bins of sizes $k_i$.

There are a few special cases that are simpler to calculate:

  • If $m=1$, then clearly $k_1 = N$ and you're choosing the whole multiset. So $\binom{\mathbf{d}}{(N)} = 1$
  • If $m=2$, then you only have to handle choosing $k_1$ or $k_2$ elements from a multiset, because the rest will be the other set. So, as mentioned below, you can use a generating function and $\binom{\mathbf{d}}{(k_1,k_2)}$ is equal to the coefficient of $x^{k_1}$ or $x^{k_2}$ in $\prod\limits_{i=1}^l 1 + x^2 + \cdots + x^{d_i} = \prod\limits_{i=1}^l \frac{1-x^{d_i – 1}}{1 – x}$. But then you also need to account for the fact that order doesn't matter, which I'm not sure how to do. Like in the first example below, you would find that there are $3$ ways to choose $2$ elements, but there are only $2$ ways to split the multiset because you have to choose 2 of them that are compatible.

Examples

Let's say that $\mathbf{d} = (2, 2)$, so $S(\mathbf{d})$ might be $\{a, a, b, b\}$. Let $k_1 = k_2 = 2$, so we need to find all the ways of splitting $S$ into two sub-multisets of size $2$. There are exactly $2$ ways of doing this: $\{\{a,a\},\{b,b\}\}$ and $\{\{a,b\},\{a,b\}\}$, so $\binom{(2,2)}{(2,2)} = 2$.

Another example: $\mathbf{d} = (2,2)$, so $S(\mathbf{d})$ could be $\{a,a,b,b\}$. Let $k_1 = 1$, $k_2 = 1$, and $k_3 = 2$. There are $3$ ways of doing this: $\{\{a\},\{a\},\{b,b\}\}$, $\{\{b\},\{b\},\{a,a\}\}$, and $\{\{a\},\{b\},\{a,b\}\}$. So $\binom{(2,2)}{(1,1,2)}=3$.

My Attempts

I've tried to figure this out two ways. The first was to find a recurrence relation and some base cases, kind of how Stirling numbers of the second kind can be computed using the identity $S(n,k) = kS(n-1,k) + S(n-1,k-1)$. I tried to think about what happens if you already have a partition and want to add an element to the original multiset, but then you have to decide which bin that element will go into or whether or not to add a new bin.

I also tried to derive it in the way that multinomial coefficients are derived, by counting the number of ways to fill the first bin, and then the second, and so on. The number of ways to choose $k_1$ elements from the multiset to put in the first bin can be computed by finding the coefficient of $x^{k_1}$ in $\prod\limits_{i=1}^l 1+x+x^2+\cdots+x^{d_i}$, which isn't explicit but it's a start. But then, depending on which elements you chose, you don't know how to adjust your multiset to reflect the remaining elements.

Best Answer

It would appear that these are multisets of multisets which may be enumerated using the Polya Enumeration Theorem (PET). Let the multiset that is being drawn have factorization

$$\prod_{k=1}^m B_k^{\sigma_{k}}$$

where $k$ is the value of a term and $\sigma_k$ the number of times it ocurs and recall that we have $l$ types of items forming the source multiset

$$\prod_{k=1}^l A_k^{\tau_{k}}.$$

The answer is then given by

$$\left[\prod_{k=1}^l A_k^{\tau_{k}}\right] \prod_{k=1}^m Z\left(S_{\sigma_k}; Z\left(S_k; \sum_{k'=1}^l A_{k'}\right)\right).$$

In terms of combinatorial classes we have made use of the unlabeled class

$$\def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}} \textsc{MSET}_{=\sigma_k} \left(\textsc{MSET}_{=k} \left(\sum_{k'=1}^l \mathcal{A}_{k'}\right)\right).$$

As an example for ${2,2\choose 1,1,2} = 3$ we get

$$\textsc{MSET}_{=2} (\textsc{MSET}_{=1}(\mathcal{A_1}+\mathcal{A}_2)) \times \textsc{MSET}_{=1} (\textsc{MSET}_{=2}(\mathcal{A_1}+\mathcal{A}_2)).$$

As an additional example we find for ${2,2,4\choose 1,1,3,3} = 16$

$$\textsc{MSET}_{=2} (\textsc{MSET}_{=1}(\mathcal{A_1}+\mathcal{A}_2+\mathcal{A}_3)) \times \textsc{MSET}_{=2} (\textsc{MSET}_{=3}(\mathcal{A_1}+\mathcal{A}_2+\mathcal{A}_3)).$$

Here we have used the cycle index of the symmetric group $Z(S_n)$, which is computed from the recurrence by Lovasz which says that

$$Z(S_n) = \frac{1}{n} \sum_{l=1}^n a_l Z(S_{n-l}) \quad\text{where}\quad Z(S_0) = 1.$$

For this to be effective we need to compute the iterated cycle index when $Z(S_k)$ is substituted into $Z(S_{\sigma_k}).$ This is accomplished with the substitution rule for substitution of the former into the latter:

$$a_q = Z(S_k;\; a_1=a_q, \; a_2=a_{2q}, \; a_3=a_{3q}, \; \ldots).$$

We have used the notation $Z(S_k; F)$ for substitution of a generating function and on the previous line, the notation for substitution into the variables of the cycle index. This is in fact all we need and we can start computing some of these multiset coefficients. For example we find for the example given by OP the cycle index

$$Z(B_1^2 B_2) = 1/4\,{a_{{1}}}^{4}+1/2\,{a_{{1}}}^{2}a_{{2}}+1/4\,{a_{{2}}}^{2}.$$

Continuing with the example we get

$$Z(B_1^2 B_2; A_1+A_2) = 1/4\, \left( A_{{1}}+A_{{2}} \right) ^{4} +1/2\, \left( A_{{1}}+A_{{2}} \right) ^{2} \left( {A_{{1}}}^{2}+{A_{{2}}}^{2} \right) \\ +1/4\, \left( {A_{{1}}}^{2}+{A_{{2}}}^{2} \right) ^{2} \\ = {A_{{1}}}^{4}+2\,{A_{{1}}}^{3}A_{{2}} +3\,{A_{{1}}}^{2}{A_{{2}}}^{2}+2\,A_{{1}}{A_{{2}} }^{3}+{A_{{2}}}^{4}$$

and we confirm the value $3$ obtained by OP. This algorithm will make it possible to compute cycle indices not obtainable by enumeration. As an extra example we find the following excerpt from the cycle index for $[2,2,2,3,5,5]:$

$$Z(B_2^3 B_3 B_5^2) = \ldots +{\frac {11\,{a_{{1}}}^{8}a_{{2}}a_{{4}}a_{{5}}}{7200}} +{\frac {49\,{a_{{1}}}^{7}{a_{{2}}}^{2}a_{{3}}a_{{5}}}{14400}} \\ +{\frac {5\,{a_{{1}}}^{7} a_{{2}}{a_{{3}}}^{2}a_{{4}}}{1152}} +{\frac {1021\,{a_{{1}}}^{6}{a_{{2}}}^{3}a_{{3}}a_{{4}}}{69120}} +{\frac {43\,{a_{{1}}}^{7}a_{{2}}a_{{4}}a_{{6}}}{17280}}+\ldots$$

Here are some additional values that may assist the reader who is investigating this problem to verify the results of their approach:

$${1,3,3\choose 3,4} = 7, \; {2,3,3\choose 4,4} = 5, \; {2,3,3\choose 2,2,4} = 16 \quad\text{and}\quad {1,2,3,3\choose 2,3,4} = 87.$$

The Maple code for this problem was as follows.

with(combinat);


pet_cycleind_symm :=
proc(n)
option remember;

    if n=0 then return 1; fi;

    expand(1/n*
           add(a[l]*pet_cycleind_symm(n-l), l=1..n));
end;

pet_varinto_cind :=
proc(poly, ind)
local subs1, subs2, polyvars, indvars, v, pot, res;

    res := ind;

    polyvars := indets(poly);
    indvars := indets(ind);

    for v in indvars do
        pot := op(1, v);

        subs1 :=
        [seq(polyvars[k]=polyvars[k]^pot,
             k=1..nops(polyvars))];

        subs2 := [v=subs(subs1, poly)];

        res := subs(subs2, res);
    od;

    res;
end;

pet_cycleind_comp :=
proc(idxtrg, idx)
local varstrg, vars, vt, sbstrg, sbs, len;

    varstrg := indets(idxtrg);
    vars := indets(idx);

    sbstrg := [];

    for vt in varstrg do
        len := op(1, vt);

        sbs :=
        [seq(v = a[op(1, v)*len], v in vars)];

        sbstrg :=
        [op(sbstrg),
         a[len] = subs(sbs, idx)];
    od;

    expand(subs(sbstrg, idxtrg));
end;

pet_cycleind_mset :=
proc(msetlst)
option remember;
local mset, res, ent, idxtrg, idx;

    mset := convert(msetlst, `multiset`);

    res := 1;

    for ent in mset do
        idx := pet_cycleind_symm(ent[1]);
        idxtrg := pet_cycleind_symm(ent[2]);

        res := res *
        pet_cycleind_comp(idxtrg, idx);
    od;

    expand(res);
end;


GENF :=
proc(src, msetlst)
local vars, srcp, res, gf, term;

    vars := add(A[q], q=1..nops(src));
    srcp := mul(A[q]^src[q], q=1..nops(src));

    gf := expand(pet_varinto_cind
                 (vars, pet_cycleind_mset(msetlst)));

    if not type(gf, `+`) then
        gf := [gf];
    fi;

    res := 0;

    for term in gf do
        if type(srcp/term, `polynom`) then
            res := res + term;
        fi;
    od;

    res;
end;

The syntax to compute ${\mathrm{A}\choose \mathrm{B}}$ is documented by the following examples:

> GENF([1,2,3,3], [2,3,4]);

                        2     3     3
            87 A[1] A[2]  A[3]  A[4]

> GENF([1,2,3,3], [2,2,5]);

                        2     3     3
            33 A[1] A[2]  A[3]  A[4]

> GENF([1,1,1,1], [2,2]);  

             3 A[1] A[2] A[3] A[4].

The last one is $\frac{1}{2} {4\choose 2}.$

Addendum. There is a slight improvement on this algorithm at the following MSE link.

Related Question