Probability of selecting objects from categories with replacement, no ordering, and probability weights

combinations, combinatorics, permutations, probability, statistics

Let there exist $n$ categories from which objects can be chosen. Let us denote these categories as $c_1, c_2, \ldots, c_n$.

When a single object is drawn at random, the probability that it comes from each category is $p_1, p_2, \ldots, p_n$, respectively, with $\sum_{i = 1}^{n} p_i = 1$.

We will choose $k$ objects from these $n$ categories. The ordering of these $k$ objects does not matter (i.e. $c_1, c_1, c_2$ is equivalent to $c_2, c_1, c_1$), and the category probabilities stay the same from draw to draw (i.e. the draws are with replacement).

How would you find the probability of choosing a specific combination of $k$ items, and is there a general way or formula to do so?

Example 1: Imagine that there are two categories of balls in a bag: red and blue. You have a 40% chance of drawing a red one and 60% chance of drawing a blue one. If you draw 5 balls, what is the chance you get 3 blue balls and 2 red ones?

Example 2: Imagine that there are two categories of balls in a bag: red and blue. You have a 50% chance of drawing either. If you draw 5 balls, what is the chance you get 3 blue balls and 2 red ones? Note that the chance of drawing 3 blue and 2 red should be higher than the chance of drawing all 5 blue in this scenario (if my logic is correct), since order does not matter in this question. The chance of drawing 5 blue balls would be $\frac{\binom{5}{5}}{2^5}$, because only one ordered sequence produces that outcome, whereas 10 ordered sequences produce 3 blue and 2 red (BBBRR, BBRBR, BRBBR, RBBBR, BBRRB, BRBRB, RBBRB, BRRBB, RBRBB, and RRBBB), making that probability $\frac{\binom{5}{2}}{2^5} = \frac{\binom{5}{3}}{2^5}$. And this example doesn't even have different probabilities for each category like Example 1 does.
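The counting argument in Example 2 can be checked by brute force. The snippet below is only an illustrative sketch (the variable names are my own): it lists every ordered sequence of 5 draws from two equally likely colours and counts how many contain exactly 3 blue or exactly 5 blue.

```python
from itertools import product

# Every ordered sequence of 5 draws, each draw being "B" (blue) or "R" (red).
sequences = list(product("BR", repeat=5))

# Sequences with exactly 3 blue (and therefore 2 red), and with all 5 blue.
three_blue = [s for s in sequences if s.count("B") == 3]
five_blue = [s for s in sequences if s.count("B") == 5]

print(len(sequences), len(three_blue), len(five_blue))  # 32 10 1
```

Since all 32 sequences are equally likely when both colours have a 50% chance, the two probabilities are 10/32 and 1/32.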

Best Answer

Thanks to Daniel Muñoz for suggesting the multinomial distribution, which is exactly what I was looking for.

Let $v_1, v_2, \ldots, v_n$ denote the number of objects chosen from categories $c_1, c_2, \ldots, c_n$, respectively, so that $\sum_{i = 1}^{n} v_i = k$ and each $v_i \ge 0$.

Then, the formula that I was looking for would be the following: $$\frac{k!}{\prod_{i = 1}^{n} v_i!} \prod_{i = 1}^{n} p_i^{v_i}$$
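As a minimal sketch of how this formula could be evaluated, here is a small Python function. The name `multinomial_probability` and its argument names are my own choices for illustration; it uses exact fractions so the results can be compared with hand calculations.

```python
from fractions import Fraction
from math import factorial

def multinomial_probability(counts, probs):
    """Probability of drawing exactly counts[i] objects from category i.

    counts: the v_i (non-negative integers summing to k)
    probs:  the p_i (ideally Fractions, summing to 1)
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    k = sum(counts)
    # Multinomial coefficient k! / (v_1! * ... * v_n!); each division is exact.
    coefficient = factorial(k)
    for v in counts:
        coefficient //= factorial(v)
    # Product of p_i ** v_i.
    weight = Fraction(1)
    for v, p in zip(counts, probs):
        weight *= Fraction(p) ** v
    return coefficient * weight

# 3 blue and 2 red with P(blue) = 3/5, P(red) = 2/5.
weighted_case = multinomial_probability([3, 2], [Fraction(3, 5), Fraction(2, 5)])
# 3 blue and 2 red with both colours at 1/2.
even_case = multinomial_probability([3, 2], [Fraction(1, 2), Fraction(1, 2)])
print(weighted_case, even_case)  # 216/625 5/16
```

Passing the probabilities as `Fraction` values rather than floats avoids binary rounding (for example, `Fraction(0.4)` is not exactly 2/5).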

This handles both examples. Since each has only two categories, the formula reduces to the binomial distribution in both cases, as mentioned by Daniel Muñoz.

Example 1: $\frac{5!}{3!2!}\left(\frac{6}{10}\right)^{3}\left(\frac{4}{10}\right)^{2} = \frac{216}{625}$.

Example 2: $\frac{5!}{3!2!}\left(\frac{1}{2}\right)^{3}\left(\frac{1}{2}\right)^{2} = \frac{5}{16}$.
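For anyone who wants an empirical sanity check, the bag with a 60% chance of blue and a 40% chance of red can also be simulated. This is only a rough sketch: the function name, trial count, and seed are arbitrary choices of mine, and the estimate should land near $\frac{216}{625} = 0.3456$ rather than match it exactly.

```python
import random

def estimate_three_blue(trials=100_000, seed=0):
    """Estimate P(exactly 3 blue in 5 draws) when P(blue) = 0.6, P(red) = 0.4."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Five independent draws with replacement, weighted 60/40.
        draws = rng.choices("BR", weights=[0.6, 0.4], k=5)
        if draws.count("B") == 3:
            hits += 1
    return hits / trials

estimate = estimate_three_blue()
print(estimate)  # should be close to 0.3456
```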