[Math] Probability of rolling any set of size k on n dice with m sides

binomial distributionbinomial-coefficientscombinatoricsdiceprobability

I have been trying to find a solution for this problem for some while and after quite some research I got referred here.

I am looking for a mutable way to calculate the odds of rolling any set of any size on a certain number of dice, a set being dice showing the same faces.
For example: Exactly $2$ of $7$ ten-sided dice showing the same face, where the face could be any of the sides.
Or: Exactly $4$ of $9$ ten-sided dice showing the same face.

Many of the solutions I've found so far only work for very specific scenarios (like calculating the odds for zero sets and subtracting from $1$ to get the probability of at least one set of doubles).

I feel my best try so far has been by way of combinatorics. Assuming the first example:
There are $10$ possibilities for the face of the set, ${}^7\mathrm C_2 (=21)$ possibilities to pick for the set and ${}^9\mathrm C_5 (=126)$ possibilities to pick five unique faces for the remaining five dice. Divided by the total number of $10^7$ possible combinations, that gives me $0.002646$, or $2.646\%$ to roll any doubles and five singles on seven dice, which seems a tad low.

Ideally, there would also be a way to easily expand the formula to easily calculate the probability of several sets on one roll.

Best Answer

Suppose you want the probability of rolling exactly $5$ of a kind in $9$ normal dice. There are $6$ ways to choose the number on the set, $\binom 95$ to choose the dice which show that number, and $5^4$ ways to choose the results of the other dice, giving a probability of $\frac{6\binom 95 5^4}{6^9}$.

(Here $\binom nr={}^n\mathrm C_r$ is the binomial coefficient.)

Unfortunately it's not that simple in general. What goes wrong if you try the same approach for $4$ of a kind? It looks like there should be $6\binom 945^5$ rolls which work, but here it's possible that the "other" $5$ dice not in the set will give you another set of $4$. If this happens you've counted the same roll twice: a set of $2$s on dice $1,3,4,7$ with the other dice showing $3,5,3,3,3$ is the same as a set of $3$s on dice $2,6,8,9$ with the rest showing $2,2,2,5,2$. So you need to subtract off combinations with two sets of $4$.

In general you need the inclusion-exclusion formula, and you have to keep going until you hit the maximum number of sets you can fit in. So a general formula for the number of ways to make a set of $k$ from $n$ dice with $r$ sides would be: $$\sum_{i=1}^{\min(r,\lfloor n/k\rfloor)}(-1)^i\frac{n!}{k!^i(n-ik)!}\binom ri(r-i)^{n-ki},$$ and then you divide by $r^k$ to get the probability.

(Here we use the convention that if $r-i=n-ki=0$ then $(r-i)^{n-ki}=1$.)

The $i$th term in the above formula corresponds to counting the number of ways with $i$ sets, not worrying about overcounting (because later terms fix the overcounting). So there are $\binom ri$ ways to choose the numbers on the $i$ sets, $\binom nk$ ways to choose which dice correspond to the lowest numbered set, $\binom{n-k}k$ for the second, and so on (the product of those terms comes out as $\frac{n!}{k!^i(n-ik)!}$), and finally $(r-i)^{n-ki}=1$ possible rolls of the leftover dice, if any.

Related Question