Solved – Modelling using the Binomial Distribution Formula

binomial distributionmodelingmultinomial-distributionprobability

So I'm in my first semester at University studying Actuarial Science and one of my classes is Probability. Needless to say, I have fallen in love with the whole topic and given my passion, I constantly try to solve problems on my own. Except that this time, I really am puzzled.

I first started with a typical American Roulette game at the casino (zero and double zero are possible outcomes). I was interested in finding how many times would the casino need to spin the wheel before being almost sure of turning a profit; assuming there is only 1 bet per spin, the player always bets on red and the bet amount is always \$100.

Well, this problem is not too difficult to solve. Intuitively we know that because the casino has an edge (expected value of 5,26 dollars for \$100 spin), it will beat the player over the long haul and turn a profit. But knowing the player could get lucky, and delay the inevitable (going bankrupt), how long could the "long haul" be? Would it be 10 spins? 50 spins? 100 spins?

As it turns out, I did the model for this projection and noticed that after 10 spins, the casino will only turn a profit 44.32% of the time. After 100 spins, it becomes 66.57% and after 1000 spins it is 94.89%. Conclusion, our intuition was confirmed. Over time, chances are the casino will crush the player.

I was able to do this using the Binomial Distribution Formula below:

Binomial Distribution Formula

Now, what if instead, my new game at the casino was to roll a fixed die. Possible outcomes would include (1,2,3,4,5,6) and their corresponding probabilities would be (10%,15%,20%,15%,10%,30%). Each number has the following profits for the casino (-\$200,-\$100,-\$50,-\$100,-\$200,\$500).

As you can see, the casino only makes money when a 6 is rolled (30% of the time), and although it doesn't win as often as with the Roulette game, its expected value is even higher (\$70 per roll).

How would I find out how many events it would take for the casino to be profitable in this game? Intuitively I'm puzzled because the lower chance of winning tells me it would take much longer than the Roulette game but it's expected value being much higher tells me it would take fewer events.

Can I use a Binomial Distribution once again? For instance, after 5 rolls, the results could have been (5,4,6,3,1) and in that case, the casino would have lost 50 dollars (-\$200,-\$100,\$500,-\$50,-\$200).

It's confusing because unlike the roulette game, the casino has 4 distinct outcomes (-\$200,-\$100,-\$50,\$500). Should I run my model with the binomial formula based on winning 150 dollars, 30 percent of the time and losing 80 dollars, 70 percent of the time? Those figures represent the sum of the expected values. I'm not sure how to solve this.

Best Answer

I will generalise your particular problem to show how to deal with this type of problem. Suppose you have a series of bets, each with $m$ possible outcomes. The probability vector for the outcomes, and the profit to the casino for each outcome, are given respectively by the vectors:

$$\mathbf{p} = (p_1,...,p_m) \quad \quad \quad \mathbf{\pi} = (\pi_1,...,\pi_m).$$

After $n$ rounds of play, the counts of outcomes follow a multinomial distribution. The profit to the casino after $n$ rounds of play is a random variable given by:

$$\Pi_n = \sum_{i=1}^m N_i \pi_i \quad \quad \quad \mathbf{N} = (N_1, ..., N_m) \sim \text{Multinomial}(n, \mathbf{p}).$$

The expected value and variance of the profit to the casino after $n$ rounds is:

$$\begin{equation} \begin{aligned} \mu_n \equiv \mathbb{E}(\Pi_n) &= n \sum_{i=1}^m p_i \pi_i, \\[6pt] \sigma_n^2 \equiv \mathbb{V}(\Pi_n) &= n \sum_{i=1}^m p_i (1-p_i) \pi_i^2. \\[6pt] \end{aligned} \end{equation}$$

Probability of profit: The exact probability that the casino has profited after $n$ rounds is:

$$\mathbb{P}(\Pi_n > 0) = \sum_{\mathbf{n} \in \mathcal{S}_n(\mathbf{\pi})} \text{Multinomial}(\mathbf{n}|n, \mathbf{p}),$$

where $\mathcal{S}_n(\mathbf{\pi}) \equiv \{ \mathbf{n} | \sum_i n_i \pi_i > 0, \sum_i n_i = n \}$ is the set of count vectors that yield profit under the vector $\mathbf{\pi}$. For large $n$ it is computationally expensive to find this set, so it would be usual to approximate the multinomial distribution by the normal distribution to yield the approximation $\Pi_n \sim \text{N}(\mu_n, \sigma_n^2)$, which then gives an approximate probability of profit:

$$\mathbb{P}(\Pi_n > 0) \approx 1 - \Phi \Big( - \frac{\mu_n}{\sigma_n} \Big) = 1 - \Phi \Big( - \frac{\mu_1}{\sigma_1} \cdot \sqrt{n} \Big).$$

This can easily be calculated for any $n$ using the standard normal CDF $\Phi$. For large $n$ it will give a good approximation to the exact probability of profit.


Application to your example: In your case you have the vectors:

$$\begin{equation} \begin{aligned} \mathbf{p} &= (0.10, 0.15, 0.20, 0.15, 0.10, 0.30), \\[6pt] \mathbf{\pi} &= (-200, -100, -50, -100, -200, 500). \\[6pt] \end{aligned} \end{equation}$$

This gives moments $\mu_n = 70 n$ and $\sigma_n^2 = 62650 n$. Hence, the approximate probability of profit after $n$ rounds is:

$$\begin{equation} \begin{aligned} \mathbb{P}(\Pi_n > 0) &\approx 1 - \Phi \Big( - \frac{70n}{\sqrt{62650n}} \Big) \\[6pt] &= 1 - \Phi \Big( - \frac{14}{\sqrt{2506}} \cdot \sqrt{n} \Big) \\[6pt] &= 1 - \Phi ( - 0.2796646 \cdot \sqrt{n} ). \\[6pt] \end{aligned} \end{equation}$$


Coding this in R: You can code this in R to show how the probability of profit changes as we increase $n$. Note that the approximation is poor for small values of $n$ and so the early part of the curve is not an accurate representation of the true probability of profit.

#Generate function for approximate probability of profit
PROB <- function(p, pi, N) { R   <- sum(p*pi)/sqrt(sum(p*(1-p)*pi^2));
                             PPP <- rep(0,N);
                             for (n in 1:N) { PPP[n] <- pnorm(-R*sqrt(n), 
                                                        lower.tail = FALSE); }
                             PPP; }

#Generate example
p  <- c(0.10, 0.15, 0.20, 0.15, 0.10, 0.30);
pi <- c(-200, -100, -50, -100, -200, 500);

#Plot probability of profit for n = 1,...,100
library(ggplot2);
DATA <- data.frame(n = 1:100, PROB = PROB(p, pi, 100));
FIGURE <- ggplot(data = DATA, aes(x = n , y = PROB)) +
          geom_line(size = 1, colour = 'blue') +
          expand_limits(y = c(0,1)) +
          theme(plot.title    = element_text(hjust = 0.5, size = 14, face = 'bold'), 
                plot.subtitle = element_text(hjust = 0.5, size = 10, face = 'bold')) +
          ggtitle('Probability of Profit') +
          labs(subtitle = '(Using normal approximation to the multinomial distribution)') +
          xlab('Number of bets') +
          ylab('Probability of profit');
FIGURE;

enter image description here


N.B. My undergraduate degree was also in actuarial studies, and that is how I came to fall in love with probability and statistics. (So much so that I abandoned actuarial mathematics and became a statistician!) Good luck with your program.