Self-Study – Mean Amount by Which $M$ is Exceeded When Rolling a 6-Sided Die Until Total $\geq M$

Tags: dice, self-study, standard deviation

Here is the question:

You roll a fair 6-sided die repeatedly until the sum of the rolls is greater than or equal to $M$. What are the mean and standard deviation of the sum minus $M$ when $M=300$?

Should I write code to answer this kind of question?

Please give me some hints. Thanks!

Best Answer

You can certainly use code, but I wouldn't simulate.

I'm going to ignore the "minus M" part (you can do that easily enough at the end).

You can compute the probabilities recursively very easily, but the actual answer (to a very high degree of accuracy) can be calculated from simple reasoning.

Let the rolls be $X_1, X_2, ...$. Let $S_t=\sum_{i=1}^t X_i$.

Let $\tau$ be the smallest index where $S_\tau\geq M$.

$$\begin{aligned}
P(S_\tau=M) &= P(\text{got to } M-6 \text{ at } \tau-1 \text{ and rolled a } 6)\\
&\quad+ P(\text{got to } M-5 \text{ at } \tau-1 \text{ and rolled a } 5)\\
&\qquad\vdots\\
&\quad+ P(\text{got to } M-1 \text{ at } \tau-1 \text{ and rolled a } 1)\\
&= \frac{1}{6}\sum_{j=1}^6 P(S_{\tau-1}=M-j)
\end{aligned}$$


Similarly,

$P(S_\tau=M+1)=\frac{1}{6}\sum_{j=1}^5 P(S_{\tau-1}=M-j)$

$P(S_\tau=M+2)=\frac{1}{6}\sum_{j=1}^4 P(S_{\tau-1}=M-j)$

$P(S_\tau=M+3)=\frac{1}{6}\sum_{j=1}^3 P(S_{\tau-1}=M-j)$

$P(S_\tau=M+4)=\frac{1}{6}\sum_{j=1}^2 P(S_{\tau-1}=M-j)$

$P(S_\tau=M+5)=\frac{1}{6} P(S_{\tau-1}=M-1)$

Equations like the first one above could, at least in principle, be run backward until you hit the initial conditions, giving an algebraic relationship between the initial conditions and the probabilities we want; that would be tedious and not especially enlightening. Alternatively, you can construct the corresponding forward equations and run them forward from the initial conditions, which is easy to do numerically (and is how I checked my answer). However, we can avoid all that.

The points' probabilities are running weighted averages of earlier probabilities; these smooth out (geometrically quickly) any variation in probability coming from the initial distribution (all probability at point zero in the case of our problem).

To a very accurate approximation, then, the values $M-6$ to $M-1$ are equally probable at time $\tau-1$, so from the equations above the stopping probabilities will be very close to simple ratios; since they must be normalized, we can just write the probabilities down.

Which is to say, we can see that if the probabilities of starting from $M-6$ to $M-1$ were exactly equal, there are 6 equally likely ways of getting to $M$, 5 of getting to $M+1$, and so on down to 1 way of getting to $M+5$.

That is, the probabilities are in the ratio 6:5:4:3:2:1, and sum to 1, so they're trivial to write down.
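As a quick check, the normalized probabilities can be written down exactly (a Python sketch, using exact fractions from the standard library):

```python
from fractions import Fraction

# Overshoot probabilities P(S_tau = M + k), k = 0..5: ratios 6:5:4:3:2:1, total 21
p = [Fraction(6 - k, 21) for k in range(6)]

assert sum(p) == 1
print([str(q) for q in p])  # ['2/7', '5/21', '4/21', '1/7', '2/21', '1/21']
```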

Computing it exactly (up to accumulated numerical round-off error) by running the probability recursions forward from zero (I did it in R) gives differences on the order of .Machine$double.eps ($\approx 2.22\times 10^{-16}$ on my machine) from the above approximation. Which is to say, simple reasoning along the above lines gives effectively exact answers, since they are as close to the recursion's answers as we would expect the exact answers to be.

Here's my code for that (most of it is just initializing variables; the work is all in one line). The code starts after the first roll (to save me putting in a cell 0, which is a small nuisance to deal with in R); at each step it takes the lowest cell that could still be occupied and rolls it forward (spreading that cell's probability over the next 6 cells):

 p = array(data = 0, dim = 305)  # p[k] = probability the running sum ever equals k
 d6 = rep(1/6, 6)                # pmf of a single die roll
 i6 = 1:6
 p[i6] = d6                      # distribution after the first roll
 # roll each cell forward; cells 300..305 are never rolled on, so at the end
 # p[300:305] holds P(S_tau = 300), ..., P(S_tau = 305)
 for (i in 1:299) p[i+i6] = p[i+i6] + p[i]*d6

(We could use rollapply from the zoo package - or a number of other such functions - to do this more efficiently, but it will be easier to translate if I keep it explicit.)
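For readers who'd rather not run R, here is a Python translation of the same forward recursion (the variable names are mine); it also checks the computed tail against the 6:5:4:3:2:1 approximation discussed above:

```python
import numpy as np

M = 300
p = np.zeros(M + 5)         # index k holds the probability for cell k + 1
p[:6] = 1/6                 # distribution after the first roll
for i in range(M - 1):      # roll cells 1..299 forward in turn
    p[i+1:i+7] += p[i] / 6  # spread that cell's probability over the next 6 cells

tail = p[M-1:M+5]                   # P(S_tau = M), ..., P(S_tau = M + 5)
approx = np.arange(6, 0, -1) / 21   # the 6:5:4:3:2:1 approximation
print(np.abs(tail - approx).max())  # on the order of 1e-16
```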

Note that d6 is a discrete probability function over 1 to 6, so the code inside the loop in the last line is constructing running weighted averages of earlier values. It's this relationship that makes the probabilities smooth out (until the last few values we're interested in).

So here are the first 50-odd values (the first 25 marked with circles). At each $t$, the value on the y-axis is the probability that had accumulated in the hindmost cell before we rolled it forward into the next 6 cells.

[Figure: probability in the hindmost cell plotted against $t$; after a brief transient it settles to a constant level.]

As you see it smooths out (to $1/\mu$, the reciprocal of the mean of the number of steps each die roll takes you) quite quickly and stays constant.

And once we hit $M$, those probabilities drop away (because we don't roll the probability for values at $M$ and beyond forward in turn).

[Figure: the same probabilities near $M$; from $M$ to $M+5$ they drop away in the ratio 6:5:4:3:2:1.]

So the idea that the values at $M-6$ to $M-1$ should be equally likely, because the fluctuations from the initial conditions get smoothed out, is clearly seen to be the case.

Since the reasoning doesn't depend on anything but that $M$ is large enough that the initial conditions wash out so that $M-1$ to $M-6$ are nearly equally probable at time $\tau-1$, the distribution will be essentially the same for any large $M$, as Henry suggested in comments.
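That insensitivity to $M$ is easy to check numerically (a sketch; overshoot_dist is my own helper name):

```python
import numpy as np

def overshoot_dist(M):
    """P(S_tau = M + k) for k = 0..5, via the forward recursion."""
    p = np.zeros(M + 5)
    p[:6] = 1/6
    for i in range(M - 1):
        p[i+1:i+7] += p[i] / 6
    return p[M-1:M+5]

# The initial transient has long since died out by M = 100, so the two
# distributions agree to within roughly floating-point precision
print(np.abs(overshoot_dist(100) - overshoot_dist(300)).max())
```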

In retrospect, Henry's hint (which is also in your question) to work with the sum minus $M$ would save a little effort, but the argument would follow very similar lines: let $R_t=S_t-M$ and write similar equations relating $R_\tau$ to the preceding values, and so on.

From the probability distribution, the mean and standard deviation are then simple to compute.

Edit: I suppose I should give the asymptotic mean and standard deviation of the final position minus $M$:

The asymptotic mean excess is $\frac{5}{3}$ and the standard deviation is $\frac{2\sqrt 5}{3}$. At $M=300$ these are accurate to a much greater degree than you're likely to care about.
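Those closed forms follow directly from the 6:5:4:3:2:1 distribution; a quick check in Python:

```python
from math import isclose, sqrt

# P(overshoot = k) = (6 - k)/21 for k = 0..5
p = [(6 - k) / 21 for k in range(6)]

mean = sum(k * pk for k, pk in enumerate(p))
var = sum(k**2 * pk for k, pk in enumerate(p)) - mean**2

assert isclose(mean, 5/3)   # mean excess = 5/3
assert isclose(var, 20/9)   # sd = sqrt(20/9) = 2*sqrt(5)/3
print(mean, sqrt(var))      # approximately 1.667 and 1.491
```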