Find total number of unique subsets in an array containing duplicates

combinationscombinatorics

We are given an integer array containing duplicates. The array is [2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 7, 7] .

After converting the array into a map of <integer, frequencies>, the question goes like this:

After converting our array into map, our map looks like this:

Integer————Frequency

2———————->5

3———————->6

5———————->4

7———————->2

We have to count the total numbers of subsets that we can form using these integers (subsets – set of unique integers)

For example, one subset can be: [2,3,5]

But since there are 5 occurrences of 2, we can have 4 more subsets [2,3,5].

Note: [2,2] , this is not a valid subset, so we don't have to count it.

NOTE: Much Better explanation of my question

For more clarity, let's say we have a small array = [2,2,3].
So the total number of subsets can be 2^(unique integers) = 2^2 = 4;
This is only when we are not considering each occurrence of 2 as unique.

When we consider each 2 as unique, total subsets we can get are:

{2 (2 at 0th index)}, {2 (2 at 1st index)}, {3}, {2 (2 at 0th index) ,3}, {2 (2 at 1st index),3}, ∅.

Therefore total 7 subsets can be obtained. Note that I have not included {2,2} in total number of subsets because we can't consider a set have repeating integers.

How can we solve this problem using combinatorics?
Any help would be really appreciated!!

Kindly let me know if anything is unclear in the question.

Best Answer

Normally when you count the number of subsets that are possible, each element can either be included or excluded so there are $2$ choices for each element, and you get the answer $2^k$ where $k$ is the number of elements.

In your situation that is no longer the case. Each element could be excluded, or included in several ways depending on which of the available copies of that element you choose. So if the frequency of $2$ was $5$, as in your first example, there would be $6$ choices you could make to deal with element $2$ - exclude it or choose one of the $5$ copies.

To get the total number of these subsets you therefore have to add one to all of the frequencies and multiply those together.

In your first example the frequences of $2,3,5,7$ are $5,6,4,2$ respectively, so the answer is $6\cdot 7\cdot 5\cdot 3=630$.

Note that this is equivalent to counting the number of divisors of $2^5 3^6 5^4 7^2$. Each divisor corresponds to a subset by interpreting any non-zero exponents as the (1-based) index of the element that is chosen.

Related Question