[Math] Optimal stopping in red vs black card game deck of 52 cards

card-gamesdecision-theoryordinary differential equationsrecurrence-relationsrecursion

I have a optimal stopping problem that is solved by recursion. I was stumped by this question in an interview once. I am hoping someone can walk me through the reasoning so I can reproduce it on similar problems.

Imagine you are playing a card game you start with a well shuffled deck of $52$ cards, containing $26$ red cards and $26$ black cards, stacked face down. You have a sequence of turns, $52$ possible turns in total, each turn you either pull the top card and turn it over, or you quit. If you pull a red card you win $1$, and if you pull a black card you lose $1$. If you played all $52$ turns without stopping you are guaranteed to break even. What is the optimal stopping strategy?

The solution at the time involved recursion, or a recursion equation. The base cases are that if you know the deck has only red cards remaining (i.e. all $26$ black cards have been pulled) you keep playing until finished. If you know the deck has only black cards remaining, you definitely stop. Working back from this, you get the policy.

But I don't recall the notation for writing this policy/decision rule. I suspect it looks something like a difference or recursion equation, but I'm not sure.

Best Answer

Let $v(r,b)$ be the expected value of the game for the player, assuming optimal play, if the remaining deck has $r$ red cards and $b$ black cards.

Then $v(r,b)$ satisfies the recursion $$ v(r,b) = \begin{cases} 0&\;\text{if}\;r=0\\[4pt] r&\;\text{if}\;b=0\;\text{and}\;r>0\\[4pt] \max(0,f(r,b))&\;\text{if}\;r,b>0\\[4pt] \end{cases} $$ where $$ f(r,b) = \left({\small{\frac{r}{r+b}}}\right)(1+v(r-1,b)) + \left({\small{\frac{b}{r+b}}}\right)(-1+v(r,b-1)) $$ The stopping rule is simple: Stop when $v(r,b)=0$.

To explain the recursion . . .

If $r,b>0$, and the player elects to play a card, then:

  • The revealed card is red with probability ${\large{\frac{r}{r+b}}}$, and in that case, the player gets a score of $+1$, and the new value is $v(r-1,b)$.$\\[4pt]$
  • The revealed card is black with probability ${\large{\frac{b}{r+b}}}$, and in that case, the player gets a score of $-1$, and the new value is $v(r,b-1)$.

Thus, if $r,b>0$, electing to play a card yields the value $f(r,b)$.

But the player always has the option to quit, hence, if $r,b>0$, we get $v(r,b) = \max(0,f(r,b))$.

Implementing the recursion in Maple, the value of the game is $$v(26,26) = \frac{41984711742427}{15997372030584}\approx 2.624475549$$ and the optimal stopping strategy is as follows . . .

  • If $24 \le b \le 26$, play while $r \ge b - 5$.$\\[4pt]$
  • If $17 \le b \le 23$, play while $r \ge b - 4$.$\\[4pt]$
  • If $11 \le b \le 16$, play while $r \ge b - 3$.$\\[4pt]$
  • If $6 \le b \le 10$, play while $r \ge b - 2$.$\\[4pt]$
  • If $3 \le b \le 5$, play while $r \ge b - 1$.$\\[4pt]$
  • If $1 \le b \le 2$, play while $r \ge b$.$\\[4pt]$
  • If $b = 0$, play while $r > 0$.
Related Question