[Math] How is the formula for counting multisets derived

combinatoricsmultisets

I came across a formula that states that the number of ways to make a multiset of cardinality $n$ can be formed from a set of cardinality $k$ is $\binom{n + k – 1}{n}$. How exactly is this derived?

Best Answer

The number of ways we can select $n$ objects from $k$ types of objects is the number of solutions of the equation $$x_1 + x_2 + \ldots + x_k = n \tag{1}$$ in the nonnegative integers. A particular solution of equation 1 corresponds to the insertion of $k - 1$ addition signs in a row of $n$ ones.

For example, suppose $n = 12$ and $k = 6$. Then $$1 1 + 1 1 1 + 1 + + 1 1 1 1 + 1 1$$ corresponds to the solution $x_1 = 2$, $x_2 = 3$, $x_3 = 1$, $x_4 = 0$, $x_5 = 4$, and $x_6 = 2$.

Therefore, the number of solutions of equation 1 is equal to the number of ways we can insert $k - 1$ addition signs in a row of $n$ ones, which is $$\binom{n + k - 1}{k - 1}$$ since we must choose which $k - 1$ of the $n + k - 1$ positions required for $n$ ones and $k - 1$ addition signs will be filled with addition signs. Alternatively, we obtain the formula $$\binom{n + k - 1}{n}$$ by selecting which $n$ of the $n + k - 1$ positions required for $n$ ones and $k - 1$ addition signs will be filled with ones.

Related Solutions

[Math] Permutations of Subsets of a Multiset

Lets say $S$ is the multiset and $m_k$ is the multiplicity of the element $k$ of the multiset $$ S=(m_1,m_2,...,m_n) $$ and we want all the unique permutations of length $l$ of the multiset. For the formula we have to find all the combinations $C=(x_1,x_2,...,x_n)$ where $0 \leq x_k \leq m_k$ such that $$ \sum_{k=1}^n x_k=l $$

What we need is actually the productory of the factorials of the elements of that combination: $$ P(C)=\prod_{k=1}^n x_k! $$ Now lets say that the number of combinations is J, so to answer your question the number of permutation is $$ \sum_{j=1}^J \frac{l!}{P(Cj)} $$ or the closed-form expression, where $\binom{n}{k_1,k_2,...,k_m}=\frac{n!}{k_1!k_2!...k_m!}$ is a multinomial coefficient $$ \sum_{x_1+x_2+...+x_n=l} \binom{l}{x_1,x_2,...,x_n} $$ Lets try this formula with this example: the multiset is $S=(3,2,1)$ and $l=4$. All the combinations are $$ C=((1,2,1),(2,1,1),(2,2,0),(3,0,1),(3,1,0)) $$ so the number of permutation is $$ \frac{4!}{1!2!1!}+\frac{4!}{2!1!1!}+\frac{4!}{2!2!0!}+\frac{4!}{3!0!1!}+\frac{4!}{3!1!0!}=\frac{24}{2}+\frac{24}{2}+\frac{24}{4}+\frac{24}{6}+\frac{24}{6}=38 $$ I'm going to explain the formula using again my example. We want a permutation of length $l$ so it's like having 4 slots to fill up with what we have. In this case we have our multiset $S=(3,2,1)$. We start with the first element $m_1=3$. How many of $m_1$ we can use, remembering that we have $l$ free slots and that after $m_1$ we have $m_2+m_3=3$ to use? The answer is $$ MAX(0,l-m_2-m_3) \leq x_1 \leq MIN(l,m_1) $$ We can't use less or more or we won't reach $l$ at the end, so $x_1=(1,2,3)$. Lets start with $x_1=1$. What is the number of combination with $4$ free slots and $1$ to use? This is the binomial coefficient $$ \binom{l}{x_1}=\binom{4}{1} $$ What I'm going to do is to repeat what I did until I completely fill all the slots. Now we have $l-x_1=3$ free slots and $m_2$ to use. Of $m_2$ I can use no less than $MAX(0,l-x_1-m_3)$ and no more than $MIN(l-x_1,m_2)$, so $x_2=(2)$ as it's the only choice. What is the number of combination with 3 free slots and 2 to use? Like before, it's $$ \binom{l-x_1}{x_2}=\binom{3}{2} $$ Combined with $x_1=1$ it becomes $$ \binom{l}{x_1}\binom{l-x_1}{x_2}=\binom{4}{1}\binom{3}{2} $$ Now we have $l-x_1-x_2=1$ free slots and $m_3$ to use. One slot can be filled in one way only, so $x_3=1$ and the combinations are $$ \binom{l-x_1-x_2}{x_3}=\binom{1}{1} $$ Combined with $x_1=1$ and $x_2=2$ it becomes $$ \binom{l}{x_1}\binom{l-x_1}{x_2}\binom{l-x_1-x_2}{x_3}=\binom{4}{1}\binom{3}{2}\binom{1}{1} $$ We can clearly see a pattern here so we can try to get a nicer formula from that $$ \binom{l}{x_1}\binom{l-x_1}{x_2}\binom{l-x_1-x_2}{x_3}=\frac{l!}{x_1!(l-x_1)!}\frac{(l-x_1)!}{x_2!(l-x_1-x_2)!}\frac{(l-x_1-x_2)!}{x_3!(l-x_1-x_2-x_3)!} $$ We can simplify that and get $$ \frac{l!}{x_1!x_2!x_3!(l-x_1-x_2-x_3)!} $$ Since we know that $x_1+x_2+x_3=l$ we can simplify again and get $$ \frac{l!}{x_1!x_2!x_3!(l-x_1-x_2-x_3)!}=\frac{l!}{x_1!x_2!x_3!0!}=\frac{l!}{x_1!x_2!x_3!} $$ $$ \frac{l!}{x_1!x_2!x_3!}=\frac{4!}{1!2!1!} $$ Now lets get back where we were. Since we filled up all the slot we have to go back and since $x_1=(1)$ and $x_2=(2)$ we have to get back to $x_1=(1,2,3)$ and pick the second, so $x_1=2$. With $x_1=2$ follows up that $x_2=(1,2)$. Since there are more than 1 way to choose $x_2$, we can write it this way $$ \binom{4}{2}(\binom{2}{1}+\binom{2}{2}) $$ If we continue with $x_2=1$ we get $x_3=1$, and if we continue with $x_2=2$ we get $x_3=0$ and get $$ \binom{4}{2}(\binom{2}{1}\binom{1}{1}+\binom{2}{2}\binom{0}{0})=\binom{4}{2}\binom{2}{1}\binom{1}{1}+\binom{4}{2}\binom{2}{2}\binom{0}{0}=\frac{4!}{2!1!1!}+\frac{4!}{2!2!0!} $$ Lets get back the last time to $x_1$ and pick up the last, so $x_1=3$. With $x_1=3$ it follow that $x_2=(0,1)$, so $$ \binom{4}{3}(\binom{1}{0}+\binom{1}{1}) $$ If we continue with $x_2=0$ we get $x_3=1$, and if we continue with $x_2=1$ we get $x_3=0$ and get $$ \binom{4}{3}(\binom{1}{0}\binom{1}{1}+\binom{1}{1}\binom{0}{0})=\binom{4}{3}\binom{1}{0}\binom{1}{1}+\binom{4}{3}\binom{1}{1}\binom{0}{0}=\frac{4!}{3!0!1!}+\frac{4!}{3!1!0!} $$ If we sum up all the results we get the same formula as before.

Combinatorics – Partitioning a Multiset into Multisets of Fixed Sizes

It would appear that these are multisets of multisets which may be enumerated using the Polya Enumeration Theorem (PET). Let the multiset that is being drawn have factorization

$$\prod_{k=1}^m B_k^{\sigma_{k}}$$

where $k$ is the value of a term and $\sigma_k$ the number of times it ocurs and recall that we have $l$ types of items forming the source multiset

$$\prod_{k=1}^l A_k^{\tau_{k}}.$$

The answer is then given by

$$\left[\prod_{k=1}^l A_k^{\tau_{k}}\right] \prod_{k=1}^m Z\left(S_{\sigma_k}; Z\left(S_k; \sum_{k'=1}^l A_{k'}\right)\right).$$

In terms of combinatorial classes we have made use of the unlabeled class

$$\def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}} \textsc{MSET}_{=\sigma_k} \left(\textsc{MSET}_{=k} \left(\sum_{k'=1}^l \mathcal{A}_{k'}\right)\right).$$

As an example for ${2,2\choose 1,1,2} = 3$ we get

$$\textsc{MSET}_{=2} (\textsc{MSET}_{=1}(\mathcal{A_1}+\mathcal{A}_2)) \times \textsc{MSET}_{=1} (\textsc{MSET}_{=2}(\mathcal{A_1}+\mathcal{A}_2)).$$

As an additional example we find for ${2,2,4\choose 1,1,3,3} = 16$

$$\textsc{MSET}_{=2} (\textsc{MSET}_{=1}(\mathcal{A_1}+\mathcal{A}_2+\mathcal{A}_3)) \times \textsc{MSET}_{=2} (\textsc{MSET}_{=3}(\mathcal{A_1}+\mathcal{A}_2+\mathcal{A}_3)).$$

Here we have used the cycle index of the symmetric group $Z(S_n)$, which is computed from the recurrence by Lovasz which says that

$$Z(S_n) = \frac{1}{n} \sum_{l=1}^n a_l Z(S_{n-l}) \quad\text{where}\quad Z(S_0) = 1.$$

For this to be effective we need to compute the iterated cycle index when $Z(S_k)$ is substituted into $Z(S_{\sigma_k}).$ This is accomplished with the substitution rule for substitution of the former into the latter:

$$a_q = Z(S_k;\; a_1=a_q, \; a_2=a_{2q}, \; a_3=a_{3q}, \; \ldots).$$

We have used the notation $Z(S_k; F)$ for substitution of a generating function and on the previous line, the notation for substitution into the variables of the cycle index. This is in fact all we need and we can start computing some of these multiset coefficients. For example we find for the example given by OP the cycle index

$$Z(B_1^2 B_2) = 1/4\,{a_{{1}}}^{4}+1/2\,{a_{{1}}}^{2}a_{{2}}+1/4\,{a_{{2}}}^{2}.$$

Continuing with the example we get

$$Z(B_1^2 B_2; A_1+A_2) = 1/4\, \left( A_{{1}}+A_{{2}} \right) ^{4} +1/2\, \left( A_{{1}}+A_{{2}} \right) ^{2} \left( {A_{{1}}}^{2}+{A_{{2}}}^{2} \right) \\ +1/4\, \left( {A_{{1}}}^{2}+{A_{{2}}}^{2} \right) ^{2} \\ = {A_{{1}}}^{4}+2\,{A_{{1}}}^{3}A_{{2}} +3\,{A_{{1}}}^{2}{A_{{2}}}^{2}+2\,A_{{1}}{A_{{2}} }^{3}+{A_{{2}}}^{4}$$

and we confirm the value $3$ obtained by OP. This algorithm will make it possible to compute cycle indices not obtainable by enumeration. As an extra example we find the following excerpt from the cycle index for $[2,2,2,3,5,5]:$

$$Z(B_2^3 B_3 B_5^2) = \ldots +{\frac {11\,{a_{{1}}}^{8}a_{{2}}a_{{4}}a_{{5}}}{7200}} +{\frac {49\,{a_{{1}}}^{7}{a_{{2}}}^{2}a_{{3}}a_{{5}}}{14400}} \\ +{\frac {5\,{a_{{1}}}^{7} a_{{2}}{a_{{3}}}^{2}a_{{4}}}{1152}} +{\frac {1021\,{a_{{1}}}^{6}{a_{{2}}}^{3}a_{{3}}a_{{4}}}{69120}} +{\frac {43\,{a_{{1}}}^{7}a_{{2}}a_{{4}}a_{{6}}}{17280}}+\ldots$$

Here are some additional values that may assist the reader who is investigating this problem to verify the results of their approach:

$${1,3,3\choose 3,4} = 7, \; {2,3,3\choose 4,4} = 5, \; {2,3,3\choose 2,2,4} = 16 \quad\text{and}\quad {1,2,3,3\choose 2,3,4} = 87.$$

The Maple code for this problem was as follows.

with(combinat);


pet_cycleind_symm :=
proc(n)
option remember;

    if n=0 then return 1; fi;

    expand(1/n*
           add(a[l]*pet_cycleind_symm(n-l), l=1..n));
end;

pet_varinto_cind :=
proc(poly, ind)
local subs1, subs2, polyvars, indvars, v, pot, res;

    res := ind;

    polyvars := indets(poly);
    indvars := indets(ind);

    for v in indvars do
        pot := op(1, v);

        subs1 :=
        [seq(polyvars[k]=polyvars[k]^pot,
             k=1..nops(polyvars))];

        subs2 := [v=subs(subs1, poly)];

        res := subs(subs2, res);
    od;

    res;
end;

pet_cycleind_comp :=
proc(idxtrg, idx)
local varstrg, vars, vt, sbstrg, sbs, len;

    varstrg := indets(idxtrg);
    vars := indets(idx);

    sbstrg := [];

    for vt in varstrg do
        len := op(1, vt);

        sbs :=
        [seq(v = a[op(1, v)*len], v in vars)];

        sbstrg :=
        [op(sbstrg),
         a[len] = subs(sbs, idx)];
    od;

    expand(subs(sbstrg, idxtrg));
end;

pet_cycleind_mset :=
proc(msetlst)
option remember;
local mset, res, ent, idxtrg, idx;

    mset := convert(msetlst, `multiset`);

    res := 1;

    for ent in mset do
        idx := pet_cycleind_symm(ent[1]);
        idxtrg := pet_cycleind_symm(ent[2]);

        res := res *
        pet_cycleind_comp(idxtrg, idx);
    od;

    expand(res);
end;


GENF :=
proc(src, msetlst)
local vars, srcp, res, gf, term;

    vars := add(A[q], q=1..nops(src));
    srcp := mul(A[q]^src[q], q=1..nops(src));

    gf := expand(pet_varinto_cind
                 (vars, pet_cycleind_mset(msetlst)));

    if not type(gf, `+`) then
        gf := [gf];
    fi;

    res := 0;

    for term in gf do
        if type(srcp/term, `polynom`) then
            res := res + term;
        fi;
    od;

    res;
end;

The syntax to compute ${\mathrm{A}\choose \mathrm{B}}$ is documented by the following examples:

> GENF([1,2,3,3], [2,3,4]);

                        2     3     3
            87 A[1] A[2]  A[3]  A[4]

> GENF([1,2,3,3], [2,2,5]);

                        2     3     3
            33 A[1] A[2]  A[3]  A[4]

> GENF([1,1,1,1], [2,2]);  

             3 A[1] A[2] A[3] A[4].

The last one is $\frac{1}{2} {4\choose 2}.$

Addendum. There is a slight improvement on this algorithm at the following MSE link.

Best Answer

Related Solutions

[Math] Permutations of Subsets of a Multiset

Combinatorics – Partitioning a Multiset into Multisets of Fixed Sizes

Related Question