Solved – Sum of coefficients of multinomial distribution

conditional-probability, distributions, multinomial-distribution, normal-distribution, probability

$\newcommand{\P}{\mathbb{P}}$I'm throwing a fair die. Whenever I get a 1, 2, or 3, I write down a '1'; whenever I get a 4 I write down a '2'; whenever I get a 5 or a 6, I write down a '3.'

Let $N$ be the total number of throws I need for the product of all the numbers I wrote down to be $\geq 100000$. I want to calculate (or approximate) $\P(N\geq 25)$, and an approximation can be given as a function of the Normal distribution.

First, I know that $\P(N\geq 11) = 1$: the largest number I can write down is a 3, and $\log_3 100000 \approx 10.48$, so I need at least 11 numbers. Now, let $a$, $b$, and $c$ be the number of times I wrote down a 1, 2, and 3, respectively. Then:

$$\P(a,b,c\mid n) = \begin{cases}\displaystyle\binom {n}{a, b, c} \left(\frac 1 2\right) ^ a \left(\frac 1 6\right)^b\left(\frac 1 3\right)^c &\text{ if } a + b + c = n \\ 0 &\text{ otherwise}\end{cases}$$

What I want to calculate is:

$$\P(a + b + c \geq 25 \mid 2^b3^c\geq 100000)$$

How do I calculate this?

EDIT:

So it was suggested that I could replace the condition with:

$$\P(a + b + c \geq 25 \mid \alpha a + \beta b + \gamma c \geq \delta)$$

where $\alpha = 0$, $\beta = \log 2$, $\gamma = \log 3$, and $\delta = \log 100000$.

This does look more solvable! I unfortunately still have no idea how to solve it.
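
As a sanity check, I can at least estimate the probability by direct Monte Carlo simulation. Here is a rough Python sketch of the game as described above (the function name and simulation size are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_N(rng, target=100000):
    """Number of throws until the product of the written numbers reaches `target`."""
    log_target = np.log(target)
    log_product, n = 0.0, 0
    while log_product < log_target:
        # write down 1 w.p. 1/2, 2 w.p. 1/6, 3 w.p. 1/3
        value = rng.choice([1, 2, 3], p=[1/2, 1/6, 1/3])
        log_product += np.log(value)
        n += 1
    return n

draws = np.array([simulate_N(rng) for _ in range(50_000)])
print("Monte Carlo estimate of P(N >= 25):", (draws >= 25).mean())
```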

Best Answer

The present question is a specific case of a more general problem: you are dealing with a quantity that is a linear function of a multinomial random vector. It is possible to solve your problem exactly, by enumerating the multinomial combinations that satisfy the required inequality and summing the distribution over that range. When $N$ is large this may become computationally infeasible, in which case an approximate distribution can be obtained from the normal approximation to the multinomial. A generalised version of this approximation is shown below, and then it is applied to your specific example.


General approximation problem: Suppose we have a sequence of exchangeable random variables taking values in $\{1, 2, ..., m\}$. For any $n \in \mathbb{N}$ we can form the count vector $\boldsymbol{X} \equiv \boldsymbol{X} (n) \equiv (X_1, X_2, ..., X_m)$, which counts the number of occurrences of each outcome in the first $n$ values of the sequence. Since the underlying sequence is exchangeable, the count vector is distributed as:

$$\boldsymbol{X} \sim \text{Mu}(n, \boldsymbol{\theta}), \quad \quad \boldsymbol{\theta} = \lim_{n \rightarrow \infty} \boldsymbol{X}(n)/n.$$

Now, suppose we have some vector of non-negative weights $\boldsymbol{w} = (w_1, w_2, ..., w_m)$ and we use these weights to define the linear function:

$$A(n) \equiv \sum_{i=1}^m w_i X_i.$$

Since the weights are non-negative, this new quantity is non-decreasing in $n$. We then define the number $N(a) \equiv \min \{ n \in \mathbb{N} | A(n) \geqslant a \}$, which is the smallest number of observations required to obtain a specified minimum value for our linear function. We want to approximate the distribution of $N(a)$ in the case where this value is (stochastically) large.
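
As a concrete illustration of these definitions, here is a short Python sketch for the simplest exchangeable case, an i.i.d. sequence (the probabilities and weights are taken from the example below; all names are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

theta = np.array([1/2, 1/6, 1/3])          # outcome probabilities
w = np.array([0.0, np.log(2), np.log(3)])  # non-negative weights

# Count vector X(n) for n observations, and the linear function A(n)
n = 24
X = rng.multinomial(n, theta)              # X ~ Mu(n, theta)
A = np.dot(w, X)                           # A(n) = sum_i w_i * X_i
print("X(n) =", X, "  A(n) =", A)

# N(a): smallest n with A(n) >= a, found by accumulating one observation at a time
a, A_run, N = np.log(100000), 0.0, 0
while A_run < a:
    A_run += w[rng.choice(3, p=theta)]     # A(n) is non-decreasing since w >= 0
    N += 1
print("N(a) =", N)
```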


Solving the general approximation problem: Firstly, we note that since $A(n)$ is non-decreasing in $n$ (which holds because we have assumed that all the weights are non-negative), we have:

$$\mathbb{P} (N(a) \geqslant n) = \mathbb{P} (N(a) > n - 1) = \mathbb{P} (A(n-1) < a).$$

Hence, the distribution of $N$ is directly related to the distribution of $A$. Assuming that the former quantity is large, we can approximate the distribution of the latter by replacing the discrete random vector $\boldsymbol{X}$ with a continuous approximation from the multivariate normal distribution. This leads to a normal approximation for the linear quantity $A(n)$, and we can calculate the moments of this quantity directly. To do this, we use the fact that $\mathbb{E}(X_i) = n \theta_i$, $\mathbb{V}(X_i) = n \theta_i (1 - \theta_i)$ and $\mathbb{C}(X_i, X_j) = -n \theta_i \theta_j$ for $i \neq j$. With some basic algebra, this gives us:

$$\mu \equiv \mathbb{E}\left(\frac{1}{n} A(n)\right) = \sum_{i=1}^m w_i \theta_i,$$

$$\sigma^2 \equiv \mathbb{V}\left(\frac{1}{\sqrt{n}} A(n)\right) = \sum_{i=1}^m w_i^2 \theta_i - \left(\sum_{i=1}^m w_i \theta_i\right)^2 = \sum_{i=1}^m w_i^2 \theta_i - \mu^2.$$
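
These moment formulas are easy to check numerically. The following Python sketch (using the probabilities and weights from the example below; purely illustrative) compares them with simulated multinomial draws:

```python
import numpy as np

rng = np.random.default_rng(2)

theta = np.array([1/2, 1/6, 1/3])
w = np.array([0.0, np.log(2), np.log(3)])
n = 24

mu = np.dot(w, theta)                         # E[A(n)] / n
sigma2 = np.dot(w**2, theta) - mu**2          # V[A(n)] / n

X = rng.multinomial(n, theta, size=500_000)   # many draws of the count vector
A = X @ w                                     # corresponding values of A(n)
print("theoretical mean/variance:", n * mu, n * sigma2)
print("simulated   mean/variance:", A.mean(), A.var())
```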

Taking the normal approximation to the multinomial now gives us the approximate distribution $A(n) \sim \text{N} (n \mu, n \sigma^2)$. Applying this approximation yields:

$$\mathbb{P} (N(a) \geqslant n) = \mathbb{P} (A(n-1) < a) \approx \Phi \left(\frac{a - (n-1) \mu}{\sqrt{n-1} \; \sigma}\right).$$

(The symbol $\Phi$ is the standard notation for the standard normal distribution function.) This approximation can be applied to find probabilities pertaining to the quantity $N(a)$ for a specified value of $a$. It is a basic approximation which makes no attempt to apply a continuity correction to the underlying multinomial counts; it is obtained by matching the first two moments of the normal to those of the exact linear function.
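
For convenience, the approximation can be wrapped in a small function. The Python sketch below uses `scipy.stats.norm.cdf` in the role of $\Phi$; the function name is illustrative rather than definitive:

```python
import numpy as np
from scipy.stats import norm

def prob_N_geq(n, a, theta, w):
    """Normal approximation to P(N(a) >= n) = P(A(n-1) < a)."""
    theta, w = np.asarray(theta), np.asarray(w)
    mu = np.dot(w, theta)                  # E[A(k)] / k
    sigma2 = np.dot(w**2, theta) - mu**2   # V[A(k)] / k
    k = n - 1
    return norm.cdf((a - k * mu) / np.sqrt(k * sigma2))

# Usage with the numbers from the application below
print(prob_N_geq(25, np.log(100000), [1/2, 1/6, 1/3], [0.0, np.log(2), np.log(3)]))
```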


Application to your problem: In your problem you have probabilities $\boldsymbol{\theta} = (\tfrac{1}{2}, \tfrac{1}{6}, \tfrac{1}{3})$, weights $\boldsymbol{w} = (0, \ln 2, \ln 3)$, and cut-off value $a = \ln 100000$. You therefore have (rounding to six decimal places) $\mu = \tfrac{1}{6}\ln 2 + \tfrac{1}{3}\ln 3 = 0.481729$ and $\sigma^2 = \tfrac{1}{6}(\ln 2)^2 + \tfrac{1}{3}(\ln 3)^2 - \mu^2 = 0.250329$, so that $\sigma = 0.500329$. Applying the above approximation we have (rounding to six decimal places):

$$\mathbb{P}(N(a) \geqslant 25) \approx \Phi \left(\frac{\ln 100000 - 24 \cdot 0.481729}{\sqrt{24} \cdot 0.500329}\right) =\Phi (-0.019812) = 0.492097.$$

By applying the exact multinomial distribution, summing the probability over all combinations satisfying $A(24) < a$, it can be shown that the exact result is $\mathbb{P}(N(a) \geqslant 25) = 0.483500$. Hence, we can see that the approximation is quite close to the exact answer in the present case.
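
For completeness, the exact figure can be obtained by the enumeration described above. Here is a Python sketch using `scipy.stats.multinomial` (illustrative, assuming the setup of the question):

```python
import numpy as np
from scipy.stats import multinomial

theta = [1/2, 1/6, 1/3]        # probabilities of writing down 1, 2, 3
a, k = np.log(100000), 24      # P(N(a) >= 25) = P(A(24) < a)

# Sum the Mu(24, theta) pmf over all count vectors (x1, x2, x3) with
# x2*ln(2) + x3*ln(3) < a, i.e. 2^x2 * 3^x3 < 100000.
exact = sum(
    multinomial.pmf([k - x2 - x3, x2, x3], n=k, p=theta)
    for x2 in range(k + 1)
    for x3 in range(k + 1 - x2)
    if x2 * np.log(2) + x3 * np.log(3) < a
)
print("exact P(N(a) >= 25):", exact)
```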

Hopefully this gives you an answer to your specific question, while also placing it within a more general framework of probabilistic results that apply to linear functions of multinomial random vectors. The present method should allow you to obtain approximate solutions to problems of the general type you are facing, allowing for variation in the specific numbers in your example.
