[Math] What’s the optimal strategy for this dice game?

dice · dynamic programming · probability

I'm working on a dynamic programming problem. I want to find the optimal strategy and simulate it.

The game's description:

The player rolls two dice.

  • If the numbers shown by the two dice are different, then the player adds their sum ($2+3=5$, for example) to his cumulative reward.

  • If the numbers shown by the two dice are equal, then the player loses his entire reward.

I started modeling the problem. The state of the system is as follows:
I chose an arbitrary number (300, for example) as a bound on the maximum reward in an $N$-round game. The state value is $V_k(S,D)$, where $S$ is the cumulative sum and $D$ is the sum of the two numbers shown on the two dice.

If we suppose that the number of rounds is finite, what is the optimal strategy, in a simple form?

In the case of an infinite game, what is the expected reward?

Reference: http://people.brandeis.edu/~igusa/Math56aS08/Math56a_S08_notes041.pdf

Thank you in advance guys 🙂

Best Answer

When to stop depends on how many rounds you're allowed to play. If you're allowed to play for arbitrarily many rounds, you could decide not to stop until you have a million points. Then with probability $1$, sooner or later (most likely much later), you will leave the table with a million points.

So the game is only interesting when there's a time limit after which you must stop playing and leave with whatever you've got at that time if you haven't cashed out before that.

A strategy is a function $f(N,S)$ that gives our decision to either cash out or keep playing as the output of two inputs: $N$ is the number of rounds left, and $S$ is how many points you have when you make the decision. The fact that you're supposed to use dynamic programming is a hint that you're not supposed to find an explicit formula for $f(N,S)$, but rather to compute a table of values for $f(N,S)$.

It is easy enough to analyze what the optimal choice is for $N=1$. If you quit now, your expected winnings are $S$; if you play the last round, your expected winnings are $\frac16\cdot 0 + \frac56(S+7)$ (since a roll that doesn't bankrupt you wins you $7$ points on average). So you should play the last round iff $\frac{5S}6 + \frac{35}6 \ge S$, that is, iff $S\le 35$.
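As a quick sanity check on that $S\le 35$ threshold, here is a minimal Python sketch (the function name and the printed range are my own, purely illustrative) that brute-forces the 36 equally likely outcomes of the last roll:

```python
from fractions import Fraction

def value_of_rolling_last_round(s):
    """Expected final reward if you roll the last round from score s,
    averaging over all 36 equally likely outcomes of two dice."""
    total = Fraction(0)
    for a in range(1, 7):
        for b in range(1, 7):
            # Doubles wipe the score out; anything else adds the sum.
            total += Fraction(0 if a == b else s + a + b, 36)
    return total

# Rolling is (weakly) better than cashing out exactly for s <= 35.
print([s for s in range(30, 40) if value_of_rolling_last_round(s) >= s])
```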

To proceed to $N=2$, you first need an auxiliary table of the expected value of the game in each state $(1,S)$ when you follow $f(1,S)$, for the various values of $S$. For large enough $S$ you should always cash out, in which case the expected value is $S$ itself -- so you only need to tabulate the expected values for $S$ below the cash-out threshold.

Now you can compute $f(2,S)$ as follows: the expected value of walking out is $S$. To find the expected value of playing, consider the various outcomes of the dice roll: you could go bankrupt or win $3, 4, 5, \ldots, 10, 11$ points, each with a particular probability. Each of these cases takes you to a state $(1,S')$ whose expected value you have already computed; the appropriately weighted average of those expectations is the value of going on. Compare this with $S$ to figure out whether you should leave or go on.

Now do the same thing for $f(3,-)$, $f(4,-)$ and so forth, until you reach the actual length of your game.
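Here is a minimal Python sketch of that tabulation, assuming (as the description of the bankrupt case above suggests) that doubles send you back to a score of $0$ but let you keep playing; the names `solve` and `SUM_COUNTS`, the 10-round example, and the tie-break in favour of rolling when both options are equal are my own choices:

```python
from fractions import Fraction

# Counts of each non-double sum of two dice (36 outcomes in total;
# the 6 doubles reset the score to 0 instead of adding anything).
SUM_COUNTS = {3: 2, 4: 2, 5: 4, 6: 4, 7: 6, 8: 4, 9: 4, 10: 2, 11: 2}

def solve(num_rounds):
    """Tabulate V[n][s], the expected final reward with n rounds left and
    current score s, and play[n][s], whether rolling again is optimal."""
    max_s = 11 * num_rounds          # largest score reachable in the game
    V = [[Fraction(s) for s in range(max_s + 1)]]   # n = 0: you must cash out
    play = [[False] * (max_s + 1)]
    for n in range(1, num_rounds + 1):
        Vn, Pn = [], []
        for s in range(max_s + 1):
            # Value of rolling: 6/36 chance of dropping back to score 0,
            # otherwise move to score s + d with d a non-double sum.
            # (The clamp only matters in states that are unreachable anyway.)
            roll = Fraction(6, 36) * V[n - 1][0] + sum(
                Fraction(c, 36) * V[n - 1][min(s + d, max_s)]
                for d, c in SUM_COUNTS.items())
            Vn.append(max(Fraction(s), roll))
            Pn.append(roll >= Fraction(s))   # tie broken in favour of rolling
        V.append(Vn)
        play.append(Pn)
    return V, play

V, play = solve(10)
print(max(s for s in range(len(play[1])) if play[1][s]))   # 35, as derived above
print(float(V[10][0]))    # expected reward of a fresh 10-round game
```

Exact rational arithmetic via `Fraction` is overkill, but it makes the threshold at $35$ show up exactly rather than up to floating-point noise.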

(Note that the actual strategic decisions can be made in constant time on the fly, once you have tabulated the value of the game at each possible state.)
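Continuing that sketch, here is a short simulation that plays the tabulated strategy on the fly (the `simulate` helper and the 100,000-game run are illustrative, and reuse `solve` and `play` from above):

```python
import random

def simulate(num_rounds, play):
    """Play one game, looking the decision up in play[n][s] at each
    state; each lookup is a constant-time table access."""
    s = 0
    for n in range(num_rounds, 0, -1):
        if not play[n][s]:                     # strategy says cash out now
            return s
        a, b = random.randint(1, 6), random.randint(1, 6)
        s = 0 if a == b else s + a + b         # doubles reset the score
    return s

V, play = solve(10)                            # tables from the sketch above
games = [simulate(10, play) for _ in range(100_000)]
print(sum(games) / len(games), float(V[10][0]))   # empirical mean vs. DP value
```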
