[Math] How to calculate the number of distinct subsets of a set that has repeated elements

combinationscombinatorics

I am familiar with how to calculate the number of subsets of size $k$ from a set with $n$ elements such that order matters, e.g.: both $\{a,b,c\}$ and $\{c,a,b\}$ are counted as separate, using the formula $\frac{n!}{(n-k)!}$. I also know that you can use the formula $\frac{n!}{k!(n-k)!}$ to calculate the number of subsets such that order doesn't matter, e.g.: $\{a,b,c\}$ and $\{c,a,b\}$ are counted as the same.
Despite that, I don't know of any how to calculate the number of distinct subsets (including improper subsets) of a set that has repeated elements and can't find any information online. What I mean specifically by "distinct subsets" are two subsets $T$ and $U$ of a set $S$ where there exist a positive integer $i$ such that $T_i \neq U_i$ where $T_i$ and $U_i$ respectively represent the $i$th elements of $T$ and $U$. For this definition the empty set is not equal to any set except itself. If a set doesn't have any repeated elements I know that $\frac{n!}{(n-k)!}$ correctly calculates the number of distinct subsets.

Looking for a pattern I made the following calculations by hand, the first part of the list represents the base set ($S$ in the definition above) and the numbers that follow represent the number of distinct subsets of of size one, two, et cetera, with zeros omitted. The base sets are sorted in lexicographical (dictionary) order.

  1. $\{a\}$: 1
  2. $\{a,a\}$: 1,1
  3. $\{a,b\}$: 2,2
  4. $\{a,a,a\}$: 1,1,1
  5. $\{a,a,b\}$: 2,3,3
  6. $\{a,b,c\}$: 3,6,6
  7. $\{a,a,a,a\}$: 1,1,1,1
  8. $\{a,a,a,b\}$: 2,3,4,4
  9. $\{a,a,b,b\}$: 2,4,6,6
  10. $\{a,a,b,c\}$: 3,7,12,12

The main pattern of interest that I noticed is that for all of the sets the number of distinct subsets of size equal to the size of the base set and the number of size one less than the size of the base appear to always be the same, for example, for $\{a,a,b,b\}$ there are 6 distinct subsets of size 3 as well as size 4. I tried looking on the OEIS (On-line Encyclopedia of Integer Sequences) for the column sequences haven't found anything and the rows are too short that I can't personally find anything clear on the OEIS from them.

I'm unsure if there is a particular way I should be approaching this problem so any advice, work done by others investigating the problem, and/or resources would be greatly appreciate. As a note, while writing this question the StackExchange software listed the following questions as similar, in my opinion they seem related to me, but how to connect them to my question specifically I am not certain of: How many different permutations are there of the sequence of letters in “MISSISSIPPI”? [specific answer], How to find the number of distinct combinations of a non distinct set of elements? [question], and Permutations with fixed number of distinct elements [question].

Best Answer

Let $E_{m}(x)=\sum_{j=0}^m x^j/j!$ be the $m^{th}$ partial sum of the exponential series. If a multiset $M$ has $r$ distinct elements, where the first element is repeated $n_1$ times, the second $n_2$ times, etc, then the number of ways to choose an ordered list consisting of $k$ elements of $M$ is equal to $$ k![x^k]\prod_{i=1}^rE_{n_i}(x).\tag{*} $$ Here, $[x^k]f(x)$ denotes the coefficient of $x^k$ in the polynomial $f(x)$.

For example, consider the multiset $\{a,a,b,c\}$ from your post. There are $3$ distinct elements, the first, $a$, appearing $n_1=2$ times, and the latter two, $b$ and $c$, appearing $n_2=n_3=1$ time. The product of the partial exponential sums in $(*)$ is therefore \begin{align} E_{2}(x)\cdot E_1(x)\cdot E_1(x) &=(1+x+x^2/2)\cdot(1+x)\cdot(1+x) \\&=1+3x+\frac{7}2x^2+2x^3+\frac12x^4 \\&=1+\frac{\color{red}3}{1!}x+\frac{\color{red}7}{2!}x^2+\frac{\color{red}{12}}{3!}x^3+\frac{\color{red}{12}}{4!}x^4\end{align} Notice that the coefficients of this polynomial correspond to the answer to your combinatorial question $(3,7,12,12)$, divided by an appropriate factorial.


The techniques used in this post are more widely known as exponential generating functions. For more of an explanation on why this works, see generatingfunctionology by Herbert Wilf, especially chapter 3 on exponential generating functions. It is available for free online.

Related Question