# Probability – How Many Rolls Ensure Sum Greater Than 100 with 99% Probability?

diceexpected valueprobabilityrandom variablesvariance

I roll a pair of fair dice $$n$$ times, and calculate the sum of all $$2n$$ faces which come up:

• Suppose each roll of each die is independent of other rolls.
• How many rolls are sufficient to ensure, with probability $$99\ \%$$, that the sum is greater than $$100\ ?$$.

I've calculated that the expected value and variance of sum are
$$\mathbb{E}\left(X\right) = 7n\quad\mbox{and}\quad \mathrm{Var}(X)=\frac{35}{6}n$$
where $$X$$ represents the random variable of sum of the $$2n$$ faces.

I think I need to find a lower bound on $$\operatorname{Pr}\left(X > 100\right)$$. I've attempted applying Markov's inequality:
$$\operatorname{Pr}\left(X \geq 100\right)\leq \frac{\mathbb{E}(X)}{100}\,,$$
which isn't anywhere near what I want. Any tips $$?$$. This may be a very basic question, but I'm not good at probability at all.

Let $$X_1,...,X_k$$ be i.i.d. Uniform($$\{1,...,m\}).$$ Then the distribution of $$S_k=X_1+...+X_k$$ is given exactly by a recursion proved in this paper$$^\dagger$$, which expresses this distribution in terms of polynomial coefficients. However, an approximation based on the CLT, in our case of $$m=6$$, is that $$S_k$$ is approximately Normal with mean $$\mu_k={7 k\over 2}$$ and standard deviation $$\sigma_k=\sqrt{35 k\over 12}$$; so $$P\left(S_k\gt 100\right)\approx 1-\Phi\left({100-\mu_k\over \sigma_k }\right).$$

Here's a picture, for $$k=2n$$:

The least $$n$$ that gives $$P\left(S_{2n}\gt 100\right)\ge 0.99$$ is $$n=18$$, according to both approaches.

Here I've posted my Python code implementing both the exact and approximate approaches, with the following numerical output:

    n=15  P(S>100)=0.6840  normal_approx=0.7035
n=16  P(S>100)=0.8825  normal_approx=0.8929
n=17  P(S>100)=0.9685  normal_approx=0.9718
--> n=18  P(S>100)=0.9938  normal_approx=0.9944 <--
n=19  P(S>100)=0.9991  normal_approx=0.9991
n=20  P(S>100)=0.9999  normal_approx=0.9999


$$^\dagger$$ Caiado, Camila & Rathie, Pushpa. (2007). "Polynomial coefficients and distribution of the sum of discrete uniform variables"

Here is a quick summary of how $$P(S_k=s)$$, and hence $$P(S_k>100)$$, is related to polynomial coefficients via probability generating functions (PGFs):

The PGF for each $$X_i\sim$$ Uniform({$$1,...,m$$}) is just $$G(t):=\sum_{x}P(X=x)t^x={1\over m}(t^1+t^2+...+t^m)={1\over m}t(1+t+...+t^{m-1})$$ so, by the product rule for the PGF of a sum of independent r.v.s, the PGF for $$S_k$$ is simply $$\sum_{s}P(S_k=s) t^s = G(t)^k = {1\over m^k}t^k(1+t+...+t^{m-1})^k.$$ Therefore $$P(S_k=s) = {1\over m^k} \times\text{coefficient of t^s in t^k(1+t+...+t^{m-1})^k}\\ ={1\over m^k} \times\text{coefficient of t^{s-k} in (1+t+...+t^{m-1})^k}$$ which are precisely the coefficients for which a recursion is provided in the cited paper.