Finding expected number of games played by player 1.

expected value

There are 4 players in the game. First player 1 plays with player 2, with loser exiting the game. Next winner of previous game plays with player 3, with loser exiting. Further, winner of previous game plays with player 4, and the winner of this game is tournament victor. Given that a game between players $i$ and $j$ is won by player $i$ with probability ${i \over {i + j}}$.

Find expected number of games played by player 1, [and player 3]. [EDIT]

Approach: Player 1 if exits in first round. He plays 1 match with probability $ {1\over {1+2}}$ = $1 \over 3$. Similarly, if exits in round two, then plays 2 matches with probability, (wins first round and then second round)${ 1 \over 12} $. Now for playing 3 matches, probability = ${1 \over 60}$.
So, weighted mean or expected value is,
$E(player 1)$ = $1 \times {1 \over 3} + 2 \times {1 \over {12}} + 3 \times {1 \over {60}} = {{11} \over {20}}$.
But solution is incorrect.

Best Answer

In order to play exactly one game he needs to loose the first one which happens with probability $\frac{2}{3}$. To play exactly $2$ games he needs to win the first one and loose the second one, so $\frac{1}{3}\cdot\frac{3}{4}$. Now compute the chance for playing exactly $3$ (where it doesn't matter whether he wins or looses the third game) in a similar way and you should get the correct answer.

For player 3 you can consider the tournament as a whole. There are always 3 games played in total. First game is always between 1 and 2, so player 3 will never play it. Second game is the winner from game 1 against player 3 so he will always play it. To play in game 3 he needs to win game 2. This happens either if player 1 wins game 1 and looses in game 2 or if player 2 wins game 1 and looses in game 2. Compute the probability for these two cases and add 1 for playing game 2 will give you the expected number of games played by player 3.

Setting up equations

Let's think about calculating the expected value of (player A's score) / (turns played) conditioned on A winning the game. To do this I'll use dynamic programming. That's basically a fancy programming way to say I'll write down a recurrence relation and then make my computer plug into the recurrence over and over to get the answer.

Let $T$ be the number of turns played in the game. Let $A$ be A's final score. Define a function $f(a,b,t,x)$ to be the probability of taking $t$ more turns to end up with A winning with final score $x$, starting with A having $a$ points and B having $b$ points. Now we can use Wikipedia's formula for expected value conditioned on an event to rewrite the expected value you asked for:

$$\begin{align} E\left[\frac{A}{T} \bigg| A \text{ wins}\right] &= \sum_{r} r \frac{P[\frac A T = r \text{ and } A \text{ wins}]}{P(A)} \\ &= \frac{\sum_{t=0}^{10} \sum_{x=20}^{24} \frac{x}{t} f(0,0,t,x)}{\sum_{t=0}^{10} \sum_{x=20}^{24} f(0,0,t,x)} \end{align}$$

That formula will let us compute the answers you want if we have a way to compute $f(0,0,t,x)$ for various values of $t$ and $x$. With that in mind, let's think about how to compute various values of $f$.

To compute $f(a,b,t,x)$ I'll use recursion, first computing the values with small values of $t$ and then working my way up. The base cases are:

For $a \ge 20$ and $b < 20$, $f(a,b,0,a) = 1$ because the game is already over and it worked out the way we wanted.
All remaining cases with $t=0$ have $f(a,b,t,x) = 0$ because the game is over but we didn't get the result we wanted.
All remaining cases with $a \ge 20$ and/or $b \ge 20$ have $f(a,b,t,x) = 0$ because someone already won but we didn't get A winning with the correct turn count + score.

Now let's build a recursive formula by thinking about what will happen on the next turn. We'll have $$\begin{align} f(a,b,t,x) &= \sum_{c=1}^5 f(a+c, b, t-1, x) P[A \text{ scores } c \text{ on the next turn}] \\ &\quad \quad + \sum_{c=1}^5 f(a, b+c, t-1, x) P[B \text{ scores } c \text{ on the next turn}] \\ &\quad \quad + f(a, b, t-1, x) P[\text{no points scored on the next turn}] \\ \end{align}$$

The final "pencil and paper" step is to compute the probabilities of each outcome for the next turn. Once we have that, the rules above are actually already enough to compute $f(a,b,x,t)$ for all the values we care about.

There are 36 possible combinations of die rolls on the next turn. There's 1 combination where A scores 5 points ((6,1)), 2 combinations where A scores 4 points ((6,2) or (5,1)), and so on; in general, the chance of A scoring $c$ points is $\frac{6-c}{36}$. The chance of B scoring $c$ points is the same. Finally, the chance of a tie is $\frac 1 6$.

Sample implementation in Python

Here is a sample Python implementation of the above formulas. In the interest of making the code readable, I tried to Python code. I hope it's readable; I just directly implemented the formulas I typed above. Remember that the key idea is to first set up the base cases with $t=0$, then use the recursive equation to solve the cases with $t=1$ by plugging in the already-computed $t=0$ answers, then solve the $t=2$ cases by using the $t=1$ answers, and so on. Anyway, feel free to ask questions about this if you have them.

from fractions import Fraction as frac
f = [[[[frac(0) for x in range(25)]
       for t in range(11)]
      for b in range(25)]
     for a in range(25)]

# base cases.
# note that every value we don't set here will be initialized as 0.
for a in range(20, 25):
    for b in range(20):
        f[a][b][0][a] = frac(1)

# now start applying the recurrence relation.
for t in range(1, 11):
    for a in range(20):
        for b in range(20):
            for x in range(20, 25):
                # First group of terms: cases where A scores points.
                for c in range(1, 6):
                    prob = frac(6-c, 36)
                    ff = f[a+c][b][t-1][x]
                    f[a][b][t][x] += prob * ff
                
                # Next group: B scores
                for c in range(1, 6):
                    prob = frac(6-c, 36)
                    ff = f[a][b+c][t-1][x]
                    f[a][b][t][x] += prob * ff
                
                # finally, count the ties.
                prob = frac(6, 36)
                ff = f[a][b][t-1][x]
                f[a][b][t][x] += prob * ff
                
# OK, done applying the recurrence. Now it's time to read off the answer.
final_numerator = frac(0)
final_denominator = frac(0)
for t in range(1, 11):
    for x in range(20, 25):
        final_numerator += frac(x, t) * f[0][0][t][x]
        final_denominator += f[0][0][t][x]

out = final_numerator / final_denominator

print(final_denominator, final_denominator * 1., sep='\t\t')  # probability that A wins
print(out, out * 1., sep='\t\t')  # E[A's score per round | A wins]

I get a final answer of $$\begin{align} E\left[\frac{A}{T} \bigg| A \text{ wins}\right] &= \boxed{\frac{6480877805871607}{2825235479368820}} \approx 2.2939. \end{align}$$ Just for fun, I also had the program print the probability that A wins, which turns out to be $$P[A \text{ wins}] = \frac{20180253424063}{914039610015744} \approx 2.21\%$$ which means that the probability of a draw is $\approx 100\% - 2 \cdot 2.21\% \approx 95.58\%$.

Checking my work

A program like this is error prone. It could be easy to make a mistake in a math formula, and then also easy to make a typo in the implementation. Fortunately there's an easy way to confirm that we're getting the right answer. The expected value is just the long-run average if we played the game infinitely many times... so we can just simulate the game a million times and compute that average result. This doesn't require much math or tricky programming. The downside is that this answer will only be correct to a couple of decimal places, but that should be plenty to confirm whether the exact result above is correct.

Here's the Python code I used to run $10^6$ simulations and average the results:

import random
NUM_TRIALS = 10**6
TARGET_SCORE = 20
wins = 0
results_sum = 0

for _ in range(NUM_TRIALS):
    # set up for a new game.
    score_a = 0
    score_b = 0
    for rd in range(10):  # play up to 10 rounds
        da = random.randrange(1, 7)  # random integer from 1 to 6
        db = random.randrange(1, 7)  # second die roll
        v = abs(da - db)
        if da > db: score_a += v
        else: score_b += v
        m = max(score_a, score_b)
        if m >= TARGET_SCORE:  # did someone win?
            wins += 1
            results_sum += m / (rd+1)
            break  # if someone won, don't play more turns in this game.

print(results_sum / wins)  # E[winner score per round | somebody wins]
print(wins / NUM_TRIALS)  # chance that somebody wins

When I ran this, I got 2.2933288034777175 for the expected value and 0.044134 for the probability that someone wins. Rerunning the code a few times, I can see that the 2.29 seems correct and the digits after that are different each time so I shouldn't trust them. The 2.29 matches the first decimal places of my exact answer above, so I conclude the full implementation is probably correct.

Version 2 of the problem

I can change my Python code to solve this version instead just by changing one x to a 20. Can you see which one I should change? Hint: with that change, the new expected value is $$\begin{align} E\left[\frac{A}{T} \bigg| A \text{ wins}\right] &= \boxed{\frac{311815273904684}{141261773968441}} \approx 2.20736 \end{align}$$ and the probability of A winning vs B winning vs draw is exactly the same as before.

We can do some quick sanity checks. It makes sense that the probability of A winning is unchanged, because the rule change only affects the winner's final score; it can never change who wins. Also, this rule change will have the effect of sometimes giving the winner a lower final score, so it makes sense that the expected value came out a tad lower this time.

Best Answer

Related Solutions

[Math] Dice game – how do find the expected number of rounds

Expected number of points won per round in dice game for player who wins

Setting up equations

Sample implementation in Python

Checking my work

Version 2 of the problem

Related Question