[Math] Exploding (a.k.a open-ended) dice pool

dice probability

Say we roll $n$ identical, fair dice, each with $d$ sides (every side comes up with the same probability $\frac{1}{d}$). On each die, the sides are numbered from $1$ to $d$ with no repeating number, as you would expect. So this is an ordinary pool of $d$-sided dice.

Every die in the outcome that shows a number equal to or higher than the threshold number $t$ is said to show a hit. Every die that shows the maximum result of $d$ is rolled again, which we call "exploding". If the re-rolled dice show hits, the number of hits is added to the hit count. Dice that show the maximum after re-rolling are rolled again and their hits counted until none show a maximum result. Given the values of

$$ d\ …\ \text{Number of sides on each die}\ \ d>0 $$
$$ n\ …\ \text{Number of dice rolled}\ \ n\ge 0$$
$$ h\ …\ \text{Number of hits whose probability we want}$$
$$ t\ …\ \text{Threshold value for a die to roll a hit}\ \ 0 < t \le d$$

what is the probability of getting exactly $h$ hits? Let's call it: $$p^\text{exploding}(d,n,t,h) = p_{d,n,t,h}$$
Can you derive a formula for this probability?

Example roll:

We roll 7 six-sided dice and count those as hits that show a 5 or a 6. In this example, $d=6$, $n=7$, $t=5$. The outcome of such a roll may be 6,5,1,2,3,6,1. That's three hits so far, but we have to roll the two sixes again (they explode). This time it's 6, 2. One more hit, and one more die to roll. We are at four hits at this point. The last die to be re-rolled shows 6 again, so that is a fifth hit and we re-roll it yet another time. On the last re-roll it shows a 4, so no more hits. That gives five hits in total and the roll is complete. So, for this roll $h=5$.
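The re-rolling procedure in this example can be written as a short simulation. This is only a minimal sketch: the function name `roll_exploding_pool` is made up for illustration, and it assumes $d \ge 2$ (with $d = 1$ every die would explode forever).

```python
import random

def roll_exploding_pool(n, d, t, rng=random):
    """Simulate one roll of n exploding d-sided dice and return the hit count.

    Assumes d >= 2; with d == 1 every die shows the maximum and the loop never ends.
    """
    hits = 0
    to_roll = n
    while to_roll > 0:
        rolls = [rng.randint(1, d) for _ in range(to_roll)]
        hits += sum(1 for r in rolls if r >= t)       # every result >= t is a hit
        to_roll = sum(1 for r in rolls if r == d)     # only maximum results explode
    return hits

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 20_000
average = sum(roll_exploding_pool(7, 6, 5, rng) for _ in range(trials)) / trials
print(f"average hits for d=6, n=7, t=5: {average:.2f}")
```

With enough trials the average should settle near $n\frac{d+1-t}{d-1} = 2.8$ for these parameters, which gives a quick sanity check for any candidate formula.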

Simple case for just one die $n=1$:

If we roll only one die with the same threshold as above ($d=6$, $n=1$, $t=5$), the probabilities can be easily calculated:

$$ p_{6,1,5,0} = \frac{4}{6} \quad \text{(Probability for exactly 0 hits – roll 1-4 on the first roll, no explosion here)} $$
$$ p_{6,1,5,1} = \frac{1}{6} + \frac{1}{6} \cdot \frac{4}{6} \quad \text{(Probability for exactly 1 hit – roll either a 5 or a result of 1-4 after a 6)} $$
$$ p_{6,1,5,2} = \frac{1}{6} \cdot \frac{1}{6} + \frac{1}{6} \cdot \frac{1}{6} \cdot \frac{4}{6} \quad \text{(Probability for exactly 2 hits – either a 6 and 5 or two sixes and 1-4)} $$
$$ p_{d,1,t,h\ge 1} = \left(\frac{1}{d}\right)^{h-1}\frac{d-t}{d} + \left( \frac{1}{d} \right)^h \cdot \frac{t-1}{d} \quad \text{(Probability for exactly $h\ge 1$ hits – either $h-1$ maximum rolls and non-maximal success or $h$ maximum rolls and a non-success )} $$
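The single-die case is easy to evaluate exactly. The following sketch uses Python's `fractions` module; the helper name `p_single` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def p_single(d, t, h):
    """Exact probability of exactly h hits from one exploding d-sided die."""
    if h == 0:
        # No hit: the first roll is below the threshold.
        return Fraction(t - 1, d)
    # Either h-1 maximum rolls followed by a non-maximal hit,
    # or h maximum rolls followed by a miss.
    return (Fraction(d - t, d) * Fraction(1, d) ** (h - 1)
            + Fraction(t - 1, d) * Fraction(1, d) ** h)

print(p_single(6, 5, 0))  # 2/3
print(p_single(6, 5, 1))  # 5/18
print(p_single(6, 5, 2))  # 5/108

# The probabilities over all h should sum to 1; a partial sum gets very close.
print(float(sum(p_single(6, 5, h) for h in range(30))))
```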

Without Explosion:

For non-exploding dice, the number of hits is simply binomially distributed:

$$ p^\text{non-exploding}_{d,n,t,h} = \binom{n}{h} \left( \frac{d-t+1}{d} \right)^h \left( 1 - \frac{d-t+1}{d} \right)^{n-h} $$

$$ E^\text{non-exploding}_{d,n,t} = n \frac{d-t+1}{d}; \qquad V^\text{non-exploding}_{d,n,t} = n \frac{(t-1)(d-t+1)}{d^2} $$

Where $E_{d,n,t}$ is the expected number of hits, and $V_{d,n,t}$ its variance.
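For reference, the non-exploding case can be tabulated exactly and its moments computed directly from the distribution. This is a small sketch with a made-up function name (`p_non_exploding`); it compares the moments with the standard binomial values $np$ and $np(1-p)$ for $p = \frac{d-t+1}{d}$.

```python
from fractions import Fraction
from math import comb

def p_non_exploding(d, n, t, h):
    """Exact probability of exactly h hits among n ordinary (non-exploding) dice."""
    p = Fraction(d - t + 1, d)          # chance that a single die shows a hit
    return comb(n, h) * p**h * (1 - p) ** (n - h)

d, n, t = 6, 7, 5
pmf = [p_non_exploding(d, n, t, h) for h in range(n + 1)]
mean = sum(h * q for h, q in enumerate(pmf))
var = sum(h * h * q for h, q in enumerate(pmf)) - mean**2
print(sum(pmf), mean, var)  # 1 7/3 14/9
```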


Edit1: In the meantime I found the question "Probability of rolling $n$ successes on an open-ended/exploding dice roll". However, I'm afraid I don't fully understand the answer there. E.g., the author says $s = n^k + r$, which does not hold for his examples. Also, I'm not sure how to get $s$, $k$ and $r$ from my input values stated above (which are $d$, $n$, $t$ and $h$).

Edit2: If one had the probability for $b$ successes via explosions, given that the initial roll had $l$ successes prior to the explosions, one could just subtract all those probabilities for all values of $b$ from the value for the pure binomial distributions with $l$ successes and add the respective value to the pure binomial probability of $b+l$ successes. Just an idea. I suppose this should be something like a combination of geometric and binomial distribution.

Edit3: I accepted Brian Thug's excellent answer, giving the formula:
$$ p^\text{exploding}_{d,n,t,h} = \frac{(t-1)^n}{d^{n+h}}
\sum_{k=0}^{\min\{h, n\}} \binom{n}{k} \binom{n+h-k-1}{h-k}
\left[ \frac{d(d-t)}{t-1} \right]^k $$

$$ E^\text{exploding}_{d,n,t} = n\frac{d+1-t}{d-1}; \qquad V^\text{exploding}_{d,n,t} = E_{d,n,t} - n\frac{(d-t)^2-1}{(d-1)^2} $$

Here is a graph from a simulation (html) that illustrates the whole thing:

[Figure: Comparison between exploding and non-exploding dice pools]

Best Answer

ETA: OK, I think I've fixed the problem. Off-by-one error...

I think this can be done with generating functions. The generating function for a single die is given by

$$ F(z) = \frac{t-1}{d} + \frac{(d-t)z}{d} + \frac{zF(z)}{d} $$

We can interpret this as follows: The probability that there are no hits on the one die is $\frac{t-1}{d}$, so $F(z)$ has that as the coefficient for $z^0 = 1$. The probability that there is one hit and the die doesn't "explode" (repeat) is $\frac{d-t}{d}$, so $F(z)$ has that as the coefficient for $z^1 = z$. In the remaining $\frac{1}{d}$ of the cases, the die explodes and the situation is exactly as it was at the start, except that there is one hit already to our credit, which is why we have $zF(z)$: the $F(z)$ takes us back to the beginning, so to speak, and the multiplication by $z$ takes care of the existing hit.

This expression can be solved for $F(z)$ via simple algebra to yield

$$ F(z) = \frac{t-1+(d-t)z}{d-z} $$

whose $z^h$ coefficient gives the probability for $h$ hits. For example, for the simple case $n = 1, d = 20, t = 11$:

\begin{align} F(z) & = \frac{10+9z}{20-z} \\ & = \frac{10+9z}{20} \left(1+\frac{z}{20}+\frac{z^2}{20^2}+\cdots\right) \\ & = \left( \frac{1}{2} + \frac{9}{20}z \right) \left(1+\frac{z}{20}+\frac{z^2}{20^2}+\cdots\right) \end{align}

and then we obtain the probability that there are $h$ hits from the $z^h$ coefficient of $F(z)$ as

$$ P(H = h) = \frac{1}{2\cdot20^h}+\frac{9}{20^h} = \frac{19}{2\cdot20^h} \qquad h > 0 $$

with the special case

$$ P(H = 0) = \frac{1}{2} $$
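The coefficients can also be generated mechanically, without expanding the geometric series by hand. The sketch below (the function name `single_die_coefficients` is invented for this illustration) multiplies out $(d-z)F(z) = t-1+(d-t)z$ and matches powers of $z$, which yields a simple recurrence.

```python
from fractions import Fraction

def single_die_coefficients(d, t, terms):
    """First `terms` power-series coefficients of F(z) = (t-1+(d-t)z)/(d-z)."""
    # Matching powers of z in (d - z) F(z) = (t-1) + (d-t) z gives
    #   d*f[0] = t-1,  d*f[1] - f[0] = d-t,  d*f[h] - f[h-1] = 0 for h >= 2.
    numerator = [t - 1, d - t]
    coeffs = []
    for h in range(terms):
        rhs = Fraction(numerator[h] if h < 2 else 0)
        if h > 0:
            rhs += coeffs[h - 1]
        coeffs.append(rhs / d)
    return coeffs

for h, c in enumerate(single_die_coefficients(20, 11, 4)):
    print(h, c)
# 0 1/2
# 1 19/40
# 2 19/800
# 3 19/16000
```

For $h > 0$ these values agree with $\frac{19}{2\cdot 20^h}$.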

In general, we can obtain the expectation of the number of hits $\overline{H}$ as

$$ \overline{H} = F'(1) = \frac{d(d-t)+t-1}{(d-1)^2} = \frac{d+1-t}{d-1} $$


Now, for $n$ dice, we have

$$ [F(z)]^n = \left[ \frac{t-1+(d-t)z}{d-z} \right]^n $$

We can write this as $N(z)M(z)$, where

\begin{align} N(z) & = [t-1+(d-t)z]^n \\ & = \sum_{k=0}^n \binom{n}{k} (t-1)^{n-k}(d-t)^kz^k \end{align}

and

\begin{align} M(z) & = \left(\frac{1}{d-z}\right)^n \\ & = \frac{1}{d^n} \left( 1+\frac{z}{d}+\frac{z^2}{d^2}+\cdots \right)^n \\ & = \sum_{j=0}^\infty \binom{n+j-1}{j} \frac{z^j}{d^{n+j}} \end{align}

so we can obtain a closed form for $P(H = h)$ from the $z^h$ coefficient of $[F(z)]^n = N(z)M(z)$ as

\begin{align} P(H = h) & = \sum_{k=0}^{\min\{h, n\}} \binom{n}{k} \binom{n+h-k-1}{h-k} \frac{(t-1)^{n-k}(d-t)^k}{d^{n+h-k}} \\ & = \frac{(t-1)^n}{d^{n+h}} \sum_{k=0}^{\min\{h, n\}} \binom{n}{k} \binom{n+h-k-1}{h-k} \left[ \frac{d(d-t)}{t-1} \right]^k \end{align}
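The closed form translates directly into code. The following is a sketch rather than a definitive implementation: the name `p_exploding` is hypothetical, it evaluates the first form of the sum (which avoids dividing by $t-1$), and it only lets $k$ run up to the smaller of $h$ and $n$, since every term beyond that contains a vanishing binomial coefficient.

```python
from fractions import Fraction
from math import comb

def p_exploding(d, n, t, h):
    """Exact P(H = h) for n exploding d-sided dice with threshold t."""
    if n == 0:
        # No dice: zero hits with certainty (also avoids comb() with a negative argument).
        return Fraction(1 if h == 0 else 0)
    total = Fraction(0)
    for k in range(min(h, n) + 1):
        total += (comb(n, k) * comb(n + h - k - 1, h - k)
                  * Fraction((t - 1) ** (n - k) * (d - t) ** k, d ** (n + h - k)))
    return total

for h in range(3):
    print(h, p_exploding(6, 1, 5, h))
# 0 2/3
# 1 5/18
# 2 5/108

# Probability of exactly five hits for the pool of seven dice from the question.
print(float(p_exploding(6, 7, 5, 5)))
```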

For example, for $n = 1, d = 6, t = 5$ (the example in the OP), the above expression yields

$$ P(H = h) = \frac{5}{3 \cdot 6^h} \qquad h > 0 $$

with the special case

$$ P(H = 0) = \frac{2}{3} $$

which coincides with the conclusions drawn in the comments to the OP.
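As an independent check that does not rely on the closed form, $[F(z)]^n$ can be expanded numerically by multiplying the single-die series with itself $n$ times. This is a minimal sketch with invented helper names (`single_die_pmf`, `pool_pmf`); truncating each product after `terms` coefficients is exact for the coefficients that are kept, because the coefficient of $z^h$ in a product only depends on lower-order terms.

```python
from fractions import Fraction

def single_die_pmf(d, t, terms):
    """Coefficients of z^0 .. z^(terms-1) in F(z) for one exploding die."""
    pmf = [Fraction(t - 1, d)]
    for h in range(1, terms):
        pmf.append(Fraction(d - t, d**h) + Fraction(t - 1, d ** (h + 1)))
    return pmf

def pool_pmf(d, n, t, terms):
    """Coefficients of z^0 .. z^(terms-1) in F(z)**n via repeated convolution."""
    single = single_die_pmf(d, t, terms)
    result = [Fraction(1)] + [Fraction(0)] * (terms - 1)   # F(z)**0 = 1
    for _ in range(n):
        result = [sum(result[i] * single[h - i] for i in range(h + 1))
                  for h in range(terms)]
    return result

print([str(p) for p in pool_pmf(6, 2, 5, 3)])  # ['4/9', '10/27', '5/36']
```

Comparing such a table with the closed-form expression for a few small $n$ and $h$ is a cheap way to catch off-by-one errors.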

The expectation for the number of hits could be obtained by evaluating $\frac{d}{dz} [F(z)]^n$ at $z = 1$, but owing to the linearity of expectation, it is obtained more straightforwardly as $n$ times the expected number of hits for one die, namely

$$ \overline{H} = \frac{n(d+1-t)}{d-1} $$
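The variance quoted in the question can be checked the same way. The following is a sketch that goes beyond the derivation above; it assumes the standard identity $\operatorname{Var}(H) = F''(1) + F'(1) - [F'(1)]^2$ for a probability generating function and writes $H_1$ (notation introduced here) for the number of hits on a single die:

\begin{align} F'(z) & = \frac{(d-1)(d+1-t)}{(d-z)^2}, \qquad F''(z) = \frac{2(d-1)(d+1-t)}{(d-z)^3} \\ \operatorname{Var}(H_1) & = \frac{2(d+1-t)}{(d-1)^2} + \frac{d+1-t}{d-1} - \frac{(d+1-t)^2}{(d-1)^2} \\ & = \frac{d+1-t}{d-1} - \frac{(d-t)^2-1}{(d-1)^2} \end{align}

Since the $n$ dice are independent, their variances add, so the variance for the whole pool is $n$ times this value, in agreement with the $V^\text{exploding}_{d,n,t}$ stated in the question.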

I think this all checks out, but some independent verification (or disproof, as appropriate) would be nice.
