[Math] Probability of rolling a value a certain number of times in a certain number of rolls

binomial-theorem, binomial-coefficients, dice, probability, statistics

I am familiar with the formula to calculate the probability of rolling a certain number at least once given $x$ rolls of a die with $y$ sides, that is:

$$P_1(x, y) = 1 - \left(1 - \frac{1}{y}\right)^x$$

where the subscript $1$ indicates that the number should be rolled at least once. For example, the chance of rolling a 6 in 6 rolls of a D6 would be $P_1(6, 6) = 1 - \left(1 - \frac{1}{6}\right)^6 \approx 0.665$.
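As a quick sanity check of the formula above, here is a minimal Python sketch that evaluates $P_1(x, y)$ exactly and compares it with a brute-force count over all $y^x$ roll sequences. The function names `p_at_least_once` and `p_at_least_once_brute` are made up for illustration, and the brute-force version is only practical for small $x$ and $y$.

```python
from fractions import Fraction
from itertools import product


def p_at_least_once(x: int, y: int) -> Fraction:
    """Closed form: 1 - (1 - 1/y)^x, kept exact with Fraction."""
    return 1 - (1 - Fraction(1, y)) ** x


def p_at_least_once_brute(x: int, y: int, target: int = 1) -> Fraction:
    """Enumerate every sequence of x rolls and count those containing target."""
    faces = range(1, y + 1)
    hits = sum(1 for rolls in product(faces, repeat=x) if target in rolls)
    return Fraction(hits, y ** x)


if __name__ == "__main__":
    exact = p_at_least_once(6, 6)
    print(exact, float(exact))  # 31031/46656, roughly 0.665
    print(exact == p_at_least_once_brute(6, 6, target=6))
```

Using `Fraction` instead of floats avoids rounding noise, so the closed form and the enumeration can be compared with `==`.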

This got me wondering what the probability would be of rolling a number at least $n$ times under the same die conditions. I manually worked out the probability for $n = 2$:

$$P_2(x, y) = \frac{x}{y}-\left(1 - \left(1 - \frac{1}{y}\right)^x\right)$$

This does work even if only two dice are rolled, in which case the probability should simply be $(\frac{1}{y})^2$, and it is. For a D20, $P_2(2, 20) = (\frac{1}{20})^2 = \frac{1}{400}$.

After that I tried to figure out how to represent $P_3(x, y)$, but unfortunately I was unable to do so. Previously, I was effectively considering a binomial coefficient in geometric terms. Using $P_n(3, y)$ as an example and looking for rolls of 20 for simplicity's sake, I considered a cube of side length $y$ divided into $y^3$ unit cubes. For $P_1$ I took the face cubes, subtracted the edge cubes, and added back the corner to give me the number of rolls in which a 20 appeared. For $P_2$, the formula was edge cubes $-$ corner for all rolls in which two 20s appeared. I know this all involves binomial coefficients, but I never really took a proper stats class, so my knowledge of their application is somewhat limited. To find a general case for $P_3$, I would have to consider a 4-cube, which I tried without success.

I'm sure this could be done easily using binomial coefficients, but that's sort of why I'm asking about this. I'm fairly sure that my expression for $P_2$ only has that extra $\frac{x}{y}$ term because $\binom{3}{1}$ and $\binom{3}{2}$ happen to equal 3 and therefore could be algebraically reorganized.

My question is this: Is there a general case formula for $P_n(x, y)$ that represents the probability of rolling a number $n$ times given $x$ rolls of a $y$-sided die? Additionally, would said formula be different if the requirement was to roll one number $n_1$ times and another number $n_2$ times and so on? Is there a different general case for $P_{n_1, n_2, \dots}(x, y)$?

Best Answer

You’ve asked several questions here. I’ll address the first one: finding the probability of getting some number at least $x$ times in $n$ rolls. You’re basically asking about the cumulative distribution function of the Binomial distribution.

For a random variable $X$, the c.d.f. of its distribution gives the probability $\Pr(X\le x)=F(x)$. The parameter $x$ is usually taken to be a real number. For a discrete random variable, the c.d.f. can be expressed simply as a sum of values of its p.d.f. (probability distribution function, a.k.a. probability mass function): assuming that $X$ takes on nonnegative integer values, $$F(x)=\sum_{k=0}^{\lfloor x\rfloor}\Pr(X=k).$$ ($\lfloor x\rfloor$ stands for the floor of $x$, the greatest integer less than or equal to $x$.)

As you suspected, binomial coefficients are involved in die roll probabilities. When looking for some specific single value to come up, each die roll can be viewed as an independent success/failure (“Bernoulli”) trial with a uniform probability $p$ of success. To reduce clutter, it’s common to use $q=1-p$ for the probability of a failure in these calculations. In the following, let $X$ be a random variable that counts the number of successes. The probability of exactly $k$ successes in $n$ trials is pretty easy to compute: the probability that the first $k$ trials are successes and the rest failures is $p^kq^{n-k}$. There are $\binom nk$ possible sequences of $k$ successes and $n-k$ failures, each with that same probability, so $\Pr(X=k)=\binom nkp^kq^{n-k}$, as described in G Cab’s answer.

The probability of at most $k$ successes is then $$\Pr(X\le k)=\sum_{i=0}^{\lfloor k\rfloor}\binom nip^iq^{n-i}.$$ The probability of at least $k$ successes, which is what you asked for, follows from the complement: for integer $k$, $\Pr(X\ge k)=1-\Pr(X\le k-1)$. As G Cab mentions, partial sums of binomial coefficients don’t usually have a nice closed form, but in this case there is an expression of this sum as an integral: $$\Pr(X\le k)=(n-k)\binom nk\int_0^qt^{n-k-1}(1-t)^k\,dt$$ (see the article on the Binomial distribution that I linked at top for more details). For large numbers of dice, a normal approximation to this value can be cheaper/easier to compute.
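To make the binomial sums concrete, here is a small Python sketch that computes them directly. The helper names (`binom_pmf`, `binom_cdf`, `p_at_least`) are invented for this illustration. It uses this answer’s notation: $n$ is the number of rolls, $k$ the number of successes, and $p = 1/y$ for a fair $y$-sided die, so the question’s $P_m(x, y)$ corresponds to `p_at_least(m, x, Fraction(1, y))`.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """Pr(X = k): exactly k successes in n independent trials."""
    q = 1 - p
    return comb(n, k) * p**k * q ** (n - k)


def binom_cdf(k: int, n: int, p: Fraction) -> Fraction:
    """Pr(X <= k): sum of the pmf from 0 through k (0 if k < 0)."""
    return sum((binom_pmf(i, n, p) for i in range(k + 1)), Fraction(0))


def p_at_least(k: int, n: int, p: Fraction) -> Fraction:
    """Pr(X >= k) via the complement 1 - Pr(X <= k - 1)."""
    return 1 - binom_cdf(k - 1, n, p)


if __name__ == "__main__":
    # At least one 6 in six rolls of a D6 (matches the 1 - (5/6)^6 formula).
    print(float(p_at_least(1, 6, Fraction(1, 6))))
    # At least two 20s in two rolls of a D20.
    print(p_at_least(2, 2, Fraction(1, 20)))  # 1/400
```

Exact `Fraction` arithmetic is convenient for small cases like these; for large $n$, floating point or the normal approximation mentioned above would likely be the more practical choice.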

The generalization you ask about at the end of your question is the Multinomial distribution, which I won’t go into here. The Wikipedia article on it is a reasonable place to start reading about that.
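As a starting point only, the multinomial probability mass function can be sketched in Python as follows. The assumptions here go beyond what this answer covers: the function names are hypothetical, the die is fair, and the result is the probability of *exactly* $n_1$ of one face and $n_2$ of another in $x$ rolls, with every other face lumped into a third “anything else” category.

```python
from fractions import Fraction
from math import factorial, prod


def multinomial_pmf(counts, probs) -> Fraction:
    """Pr(category i occurs counts[i] times) over n = sum(counts) trials."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # each division is exact
    return coeff * prod(p**c for p, c in zip(probs, counts))


def exact_two_faces(n1: int, n2: int, x: int, y: int) -> Fraction:
    """Exactly n1 of one face and n2 of another in x rolls of a fair y-sided die."""
    rest = x - n1 - n2
    if rest < 0:
        return Fraction(0)  # more required hits than rolls
    probs = (Fraction(1, y), Fraction(1, y), Fraction(y - 2, y))
    return multinomial_pmf((n1, n2, rest), probs)


if __name__ == "__main__":
    # Exactly one 1 and one 2 in three rolls of a D6.
    print(exact_two_faces(1, 1, 3, 6))  # 1/9
```

“At least” versions would need sums of these terms over all admissible count combinations.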
