How to calculate the probability of reaching sum $S$ by adding the results of an arbitrary number of rolls of an $n$-sided die

binomial-coefficientscombinatoricsdiceprobability

Suppose we have a fair $n$-sided die, with faces labeled from 1 to $n$ and an equal probability of rolling any available number. Further suppose that we pick a sum $S$ that we wish to reach by rolling the die an arbitrary number of times and adding the result of each roll. I want to calculate the probability of [the sum of the rolls] hitting $S$ for any values of $S$ and $n$. I know that for large $S$ the probability tends towards $\frac{2}{n+1}$, but I want to calculate the exact value for specific inputs, not just the behavior in the limit.

For example, say we have a six-sided die and we want to reach a target sum of 10. We roll once and get a value of 2, initializing the sum at 2. We roll again and get a value of 6, giving a sum of 8. We roll again and get a value of 3, giving a sum of 11, meaning we missed our target $S$.

To calculate the probability $P$ of any value of $S$ being reached, we can sum up the probability of each possible run of rolls that would produce that sum.

This is straightforward for $S$ less than or equal to $n$, as it is just a sum of

  • [probability of rolling $S$ on the first roll, $\frac{1}{n^1}$]
  • plus [number of ordered ways to get a sum of $S$ from two numbers] × $\frac{1}{n^2}$
  • plus [number of ordered ways to get a sum of $S$ from three numbers] × $\frac{1}{n^3}$
  • […et cetera, all the way down to…]
  • plus [probability of rolling $S$ ones all in a row, $\frac{1}{n^S}$]

Enumerating all possible permutations of rolls which sum to $S$ (the compositions of $S$) is simple, but gets long very quickly with increasing $S$, as the total number of compositions is $2^{S – 1}$. The first four sets of compositions are:

  • S = 1
    • 1
  • S = 2
    • 2
    • 1 + 1
  • S = 3
    • 3
    • 2 + 1
    • 1 + 2
    • 1 + 1 + 1
  • S = 4
    • 4
    • 3 + 1
    • 2 + 2
    • 2 + 1 + 1
    • 1 + 3
    • 1 + 2 + 1
    • 1 + 1 + 2
    • 1 + 1 + 1 + 1

It is possible to generate the next set ($S$ + 1) by taking a copy of the set for $S$, incrementing the value of the first number in each line by one, and appending another copy of the set for $S$ with a [1 + ] inserted at the beginning of each line.

Adding up the total number of permutations with length $k$ reveals that the counts follow Pascal's triangle (the pattern of binomial coefficients): there is always one permutation of length 1, one permutation of length $S$, and in between the classic symmetric pattern generated recursively.

Taking advantage of the recursion and patterns, in formal notation this reduces to:

$$
P_S = \sum_{i = 1}^{S} {\left(\binom{S – 1}{i – 1} × \frac{1}{n^i}\right)}
$$

where $\binom{a}{b} = \frac{a!}{b! × (a – b)!}$

Expanding the sum to a continuous function rather than discrete values works out to $p(S, n) = n^{-S} × (n + 1)^{S – 1}$. (Functions like this are my ultimate goal.)

However, when $S$ is greater than $n$, some of those combinations will contain values that cannot be rolled and which thus must be excluded from the total. For example, if $S = 4$ and $n = 2$, we need to subtract all permutations of values that sum to 4 which include either 3 or 4, since those numbers aren't included on the 2-sided die:

  • $\color{red}{4}$
  • $\color{red}{3 + 1}$
  • $2 + 2$
  • $2 + 1 + 1$
  • $\color{red}{1 + 3}$
  • $1 + 2 + 1$
  • $1 + 1 + 2$
  • $1 + 1 + 1 + 1$

That means subtracting $1 × \frac{1}{2^1}$ and $2 × \frac{1}{2^2}$ from $p(4, 2) = 2^{-4} × (2 + 1)^{4 – 1}$.

This is where I've gotten stuck.

I want to find a function or set of functions which automatically subtract the probabilities of all permutations that include invalid values, rather than having to manually (or programmatically, both of which get slow quickly because of the exponential growth) count how many permutations of each length must be excluded. But I haven't been able to figure out a way to do that.

Is there such a set of functions? If not, can it be proved that there isn't?

This question was inspired by Numberphile's recent video "Pi is Evil".

UPDATE 1

I've made some progress. I figured out that the number of compositions of length $i$ that need to be removed when $n$ is less than $S$ is given by the expression $\frac{i × Γ(S – n)}{Γ(i) Γ(-i + S – n + 1)}$, up to a maximum of $\frac{Γ(S)}{Γ(i) × Γ(S + 1 – i)}$. However, for values of $S$ greater than $2 n$, this removes too much, with the ultimate average tending towards zero instead of what I know to be the correct limit of $\frac{2}{n+1}$. I haven't figured out where the problem lies yet.

UPDATE 1

I realized that $\frac{i × Γ(S – n)}{Γ(i) Γ(-i + S – n + 1)}$ counts, for example, how many 3s are in compositions of length $i$ and not how many compositions of length $i$ contain at least one 3. This means that it double-counts compositions that contain two copies of an invalid number, which only becomes a problem when $S$ goes above $2 n$. When $S$ goes above $3 n$, it is possible for a composition to contain up to three copies of an invalid number and thus be triple-counted, and so on. So I need to find some way to count how many compositions of length $i$ contain at least one 3 specifically.

Best Answer

This can be easily computed using recursion.

Let $P_t$ be the probability that our sum hits the value $t$. Then $P_1=\frac{1}{n}$, and in general, $$P_t= \frac{1}{n} \left( P_{t-1} + P_{t-2} + \cdots + P_{t-n}\right)$$

with $P_0=1$ and $P_t=0$ for $t<0$.

A graph for $n=6$

enter image description here

It converges quickly to $P_{\infty}=\frac{2}{n+1}$

Intuitively: each succesive dice hits one value and leaves untouched $0, 1 ... n-1$: in average, $(n-1)/2$. Hence, in average, the proportion of hitted values is $\frac{1}{1+(n-1)/2}=\frac{2}{n+1}$

A more formal proof, here.