[Math] 30 sided die and 20 sided die

diceprobabilityprobability distributionsprobability theoryproblem solving

Person A has a 30 sided die and person B has a 20 sided die . Both players role and the person with the highest role wins (on a draw B wins). The loser pays the winner the value of the winners die.

  1. Calculate expected value of this game for player A.

  2. How does this expected value change if player B can re-roll and when should he re-roll. Calculate the new expected value for player A if player B can re-roll.

  3. Now calculate how much it's worth for player A to get a re-roll option in this scenario.

  4. Remove player A re-roll. How many re-rolls does player B need in order for him to be a favorite in the game.

For part 1, I did the following: Since Person A will win with 100% probability if he rolls a number greater than 20, with an average payout of 25.5, we get: (10/30)(20/20)(25.5) = 8.5 as the payout in that case. Now, we consider the problem as 2 20-sided die's, as we've already accounted for all scenarios where Player A rolls above a 20. Person A will lose if both Person A and B roll the same number (all other scenarios will cancel out due to symmetry), so the expected payout of Person A in this case is: (20)(1/20)(1/30)(-10.5) = -0.35 (Note: there are 20 different numbers, and the probability that both A and B will roll the same number is (1/20)(1/30), and the average payout for numbers from 1 through 20 is 10.5, which A will have to pay so -10.5).

So, the expected value of this game for player A is 8.5-0.35=8.15

I don't understand how to do part 2, 3 and 4. Answer for part 2 should be 5.4275, and the strategy of player B should be to re-roll if he gets anything less than or equal to an 11. I don't understand why this is the case either, as the expected value of a 20-sided die is 10.5

Best Answer

Here's an outline for part 2, to get you started. I leave the actual calculations up to you for now. In the end, I throw in some ideas for parts 3 and 4.

It is a good idea here to look at the expected pay-off, given that B has rolled a particular number - after all, that's what he's basing his decision on. I assume that he has to decide whether to re-roll while he still doesn't know what A rolls (otherwise the strategy would be simple - re-roll if you have less than A's number).

So, what is $E(X|B = b)$? $X$ here will refer to A's pay-off, as in Jean Marie's comment, and $B$ is the variable that holds the result of B's (first) roll. If B has rolled $b$, then there is $b/30$ probability he will gain the amount $b$ (so then $X = -b$) and $1/30$ probability that he will lose an amount $a$ for every number $a > b$. So $$ E(X|B = b) = \frac{b}{30}\cdot (-b) + \sum_{a = b+1}^{30} \frac{1}{30}\cdot (a) $$ Once you evaluate that, you will have an expression for $E(X|B = b)$ in terms of $b$. If you average those up for $b = 1,2,\dots,20$ you will get the unconditioned expectation $E(X)$ which you already know to be $8.15$. But what we need now is this - which of these are higher than the unconditional? This is how B makes his decision of whether to re-roll. If he re-rolls, he can expect a loss of $E(X)$, the unconditional, because he doesn't know what the re-roll will produce. If he doesn't re-roll, he can expect a loss of $E(X|B=b)$, where $b$ is what his first roll produced. He should re-roll if the former is preferable, i.e. lower since we are talking about losses.

You should find that a re-roll is preferable if $b \leq 11$. Then, to get the new expected loss, with the re-roll option, in the averaging process mentioned above, we should replace the expectation for every outcome of $11$ or less with the unconditional $E(X)$ that the re-roll is equivalent to: $$ E^{(1)}(X) = \frac{1}{20} \left\{\sum_{b=1}^{11}E(X) + \sum_{b=12}^{20}E(X|B=b) \right\} $$ where the superscript $^{(1)}$ indicates one re-roll option.


For part 4 (which is the more straightforward one), you need to just generalize the approach from part 2. After every roll, B has to decide if what he can expect with the score he has from that roll is better than what he can expect if he re-rolls. But note that the re-roll itself now comes with some number of (re-)re-roll options, so the strategy is different on each roll - in deciding whether to re-roll the first, he should be more demanding than on the ones that come after. The next-to-last corresponds to having the only one re-roll option left, so that's exactly like part 2: re-roll on 11 and under.


For part 3, I'm not sure what "in this scenario" refers to. I take it to mean that B has his re-roll as in part 2; then how would the expected pay-off change if player A also gets a re-roll option? But here's an important detail: are the players aware of the fact that the opponent has a re-roll option? This is crucial, because having a re-roll changes the probability distribution of the outcomes for each. Without re-rolls, every number is equally likely; with re-rolls, lower numbers are less likely. Since the probability distribution of the opponent's scores enters into the calculation of each player's conditional and unconditional expected pay-offs (think of how $1/30$ played a part in calculating $E(X|B = b)$ in part 2), it also affects their strategies. So player A's re-roll option causes player B to exercise the re-roll option more often, which changes his outcome distribution, which affects player A's calculations, changing his strategy, which in turn affects player B's calculations, changing his strategy, etc. It becomes a game theory problem... I'm not sure that this was the intention.


UPDATE for part 3:

Suppose first that A is given a re-roll option, but B is not made aware of that. A, however, knows about B's re-roll option. So in calculating his expected gains, he will assume B's optimal strategy discovered in part 2 - which is to re-roll 11s and smaller. This means that the probability distribution for B's outcomes is now $P(B = y \leq 11) = 0.55 \cdot 0.05$ (11 or less on the first roll, then the particular $y$ on the re-roll) and $P(B = y > 11) = 0.05 + 0.55 \cdot 0.05$ ($y$ on the first roll, or anything 11 or less on the first roll and $y$ on the re-roll). More generally, if the cut-off in B's strategy is $b$, then $P(B = y \leq b) = b/20^2$ and $P(B = y > b) = (b+20)/20^2$.

Now from A's point of view, if he rolls $a$, where we'll focus on the case $b < a \leq 20$, $$ E_{0,b}(X|A = a) = b\cdot \frac{b}{20^2} \cdot (a) + (a-1-b)\cdot\frac{b+20}{20^2}\cdot (a) + \sum_{j=a}^{20} \frac{b+20}{20^2}\cdot (-j) $$

(there are $b$ numbers from $1$ to $b$ with probability of $b/20^2$ each, leading to a gain of $a$, $(a-1-b)$ numbers from $b+1$ to $a-1$ with probability of $(b+20)/20^2$ each, also with gain of $a$, and finally for the rest, each with probability of $(b+20)/20^2$, we need to sum losses equal to their values). The subscripts on $E$ in this and subsequent expressions specify the assumed cut-offs in each player's strategies, which define their outcomes' distribution - in this case $b$ for player B, while the $0$ for A means "not applicable" (since we are considering the expectation conditional on $A = a$ so A's distribution is irrelevant); later on, $0$ can also mean "no re-roll option". For this and other sums, Wolfram can be quite handy in saving some time. We want to know for which values of $a$ this is less than the $5.4275$ that A can expect if he chooses to re-roll. Solving that inequality (again, Wolfram) with $b=11$ yields $a < 16.8262$ so A should re-roll on $16$ or less. To find the actual expected value $E_{16,11}(X)$, some more work is needed, because the expression above is only valid for $b < a \leq 20$. See below for an expression for the unconditional $E_{a,b}(X)$ that is valid on the same range. If I substitute $16, 11$ in that, I get $12.9055$.


Now for the game-theory interpretation. We can arrive at the Nash equilibrium iteratively. The "infinite loop" turns out to be not infinite here, because the strategy sets are discrete, and indeed it is quite a short loop: once B is made aware of A's re-roll, he assumes a cut-off of $16$ for A, recalculates his expectations based on that, finds that he should change his cut-off to $12$. Then, if A knows that B will do this (as he should, because all the information about re-roll options is now public), he will also see if his strategy needs to be adjusted; he finds that it doesn't, so both players are now content that they have the optimal strategy.

To do B's re-analysis, we need two more things: an expression for his expected loss, given that he has rolled $b$, knowing that A re-rolls on outcomes of $a$ or less: $$ \begin{align} E_{a,0}(X|B=b\leq a) &= b\cdot\frac{a}{30^2}\cdot(-b) + \sum_{k=b+1}^a \frac{a}{30^2}\cdot(k) + \sum_{k=a+1}^{30}\frac{a+30}{30^2}\cdot(k) \\ &= \frac{-30a^2-3ab^2-ab+900a+27900}{2\cdot 30^2}. \end{align} $$ This is valid for $b \leq a$ which includes the subset considered above (the $0$ subscript means cut-off for B is "not applicable"). We also need something to compare these to - an expression for the unconditional $E_{a,0}(X)$ (here the $0$ subscript for B means "no re-roll" because if he chooses to re-roll now, he will not have a re-roll option on the re-roll); this is not too hard to obtain in the same manner as the $E_{0,11}(X)$ was found in part 2. For $a=16$ I get $14.69\bar{6}$. Then solving $E_{16,0}(X|B=b) > E_{16,0}(X)$ produces $b < 12.8776$ so we adjust to re-roll on $12$ or less. Now go back and do the first step again, but with $b=12$ instead of $b=11$ and see that the answer is still less than $17$.

If we were expecting many more re-adjustments, a more direct approach would be more efficient. Without specifying any particular values for the cut-offs from the beginning, find a general expression for the expected gain $E_{a,b}(X)$. This is a rather unwieldy sum, especially if you want to consider all possible orderings of the numbers $\{a,b,20\}$. In general, $$ E_{a,b}(X) = \sum_{j=1}^{20} P_b(B=j) E_{a,0}(X|B=j) $$ where I've chosen to break it down into a sum over the possible outcomes for B, each with its own probability. But so far we only have an expression for $E_{a,0}(X|B=j)$ that is valid for $j \leq a$ - this is the expression we used in B's re-analysis, with $b$ replaced by $j$. We now also need one which will hold for $a < j \leq 20$. This is $$ \begin{align} E_{a,0}(X|B=j>a) &= a\cdot \frac{a}{30^2}\cdot(-j) + (j-a)\cdot\frac{a+30}{30^2}\cdot(-j) + \sum_{k=j+1}^{30}\frac{a+30}{30^2}\cdot(k) \\ &= \frac{-3aj^2+59aj+930a-90j^2-30j+27900}{2\cdot 30^2}. \end{align} $$

When you put it all together, for the case $b \leq a \leq 20$ (which is of particular interest to us as we have good reason to expect it to be the one containing the Nash equilibrium), the sums breaks down like this: $$ \begin{align} E_{a,b}(X) &= \frac{b}{20^2}\sum_{j=1}^b \frac{-30a^2-3aj^2-aj+900a+27900}{2\cdot 30^2} +\\ &+\frac{b+20}{20^2}\sum_{j=b+1}^a \frac{-30a^2-3aj^2-aj+900a+27900}{2\cdot 30^2} +\\ &+\frac{b+20}{20^2}\sum_{j=a+1}^{20} \frac{-3aj^2+59aj+930a-90j^2-30j+27900}{2\cdot 30^2} \\ \end{align} $$ Let Wolfram simplify this for you (or if you have the patience, do it yourself), and now you have a function. Player A wants to maximize it, given a particular value of $b$. Player B wants to minimize it, given a particular value of $a$. If there is a point that satisfies both, it is a Nash equilibrium. In calculus terms, we are looking for a saddle point (Wolfram can do that for you, too). It will not have integer coordinates, but you can manually check the four corners around it.

Finally, $(16, 12)$ is the Nash equilibrium in question and the expected gain then is $12.876$.

Related Question