Expected value of a card game with 6 cards

card-gamesgame theoryoptimizationprobability

A deck contains 3 cards with +1 value and 3 cards with -1 value. The dealer shuffles the cards and deals them face up one by one. After each card is dealt, you have the option of stopping the game. Once the game is stopped, you get paid according to the total value of the cards that were dealt. E.g. if you stop the game after +1, +1, +1, you get +3. What is the expected value of the optimal strategy for this game.

My stab at it:

Since the net sum of the cards is zero, you will never lose any points on each round. If your trailing score is negative, you can keep drawing to get to at least flat.

Strategy: Draw until you get a cumulative score of +1 and stop.

Since there are a total of 20 permutations 6!/(3!*3!) of this game, only 5 sequences will give you a trailing score of 0 or below. So the expected value of this strategy: .75*1 + .25*0 = .75

Strategy 2: Draw two cards, if you have a cumulative score that is greater than 1, stop. Otherwise draw until you get a score of 1 and stop.

I was unable to figure out how to do the math for this strategy, but I did simulate it and apparently it has an expected value of .85. Can someone walk me through the math of the second strategy? If the second strategy is not the optimal one, what is a better strategy?

Best Answer

You can directly compute the optimal strategy and its expected value with a dynamic program (backward induction).

Consider the possible states of the game, which can be completely described by the (number of -1 cards remaining, number of +1 cards remaining), with 16 possibilities.

Arrange them in a square grid as follows (it might be better on paper if you make it a diamond with (3,3) on the left and (0,0) on the right)

$$\begin{array}{ccccccc} \stackrel{(0)}{(3,3)} & \rightarrow & \stackrel{(1)}{(3,2)} & \rightarrow & \stackrel{(2)}{(3,1)} & \rightarrow & \stackrel{(3)}{(3,0)}\\ \downarrow && \downarrow && \downarrow && \downarrow\\ \stackrel{(-1)}{(2,3)} & \rightarrow & \stackrel{(0)}{(2,2)} & \rightarrow & \stackrel{(1)}{(2,1)} & \rightarrow & \stackrel{(2)}{(2,0)}\\ \downarrow && \downarrow && \downarrow && \downarrow\\ \stackrel{(-2)}{(1,3)} & \rightarrow & \stackrel{(-1)}{(1,2)} & \rightarrow & \stackrel{(0)}{(1,1)} & \rightarrow & \stackrel{(1)}{(1,0)}\\ \downarrow && \downarrow && \downarrow && \downarrow\\ \stackrel{(-3)}{(0,3)} & \rightarrow & \stackrel{(-2)}{(0,2)} & \rightarrow & \stackrel{(-1)}{(0,1)} & \rightarrow & \stackrel{(0)}{(0,0)}\\ \end{array} $$ The entries are on top (score if you stop here), bottom (#-1, #+1) remaining in the deck.

The trick is to work backwards from the (0,0) corner and decide at each state whether you want to continue or not. Examples:

There is no decision at (0,0), it is worth 0.
At (0,1) the choice is between taking -1, or drawing a card which gets 0. Since drawing is better, we now know (0,1) is also worth 0.
At (1,0) we take 1 instead of drawing which gets 0.
At (1,1) a real decision comes up. Stopping is worth 0. Drawing gets you 1/2 chance to move to (1,0) [worth 1] and 1/2 chance to move to (0,1) [worth 0]. So drawing is worth 1/2 on average, and it is optimal to do so.

You can continue filling in all the states to find the optimal strategy. Note that unequal card counts matter: at say (1,2), drawing gives you 1/3 chance to move to (0,2) and 2/3 chance to move to (1,1).

The filled in square looks like: $$\begin{array}{ccccccc} \stackrel{17/20}{(3,3)} & \rightarrow & \stackrel{6/5}{(3,2)} & \rightarrow & \stackrel{\mathbf{2}}{(3,1)} & & \stackrel{\mathbf{3}}{(3,0)}\\ \downarrow && \downarrow && &&\\ \stackrel{1/2}{(2,3)} & \rightarrow & \stackrel{2/3}{(2,2)} & \rightarrow & \stackrel{\mathbf{1}}{(2,1)} & \rightarrow^{?} & \stackrel{\mathbf{2}}{(2,0)}\\ \downarrow && \downarrow && \downarrow^{?} &&\\ \stackrel{1/4}{(1,3)} & \rightarrow & \stackrel{1/3}{(1,2)} & \rightarrow & \stackrel{1/2}{(1,1)} & \rightarrow & \stackrel{\mathbf{1}}{(1,0)}\\ \downarrow && \downarrow && \downarrow &&\\ \stackrel{0}{(0,3)} & \rightarrow & \stackrel{0}{(0,2)} & \rightarrow & \stackrel{0}{(0,1)} & \rightarrow & \stackrel{\mathbf{0}}{(0,0)}\\ \end{array}$$

States where you stop have their value bolded. At (2,1) it doesn't matter if you draw or stop.

Since you have made value-maximizing choices at every step including the effects of later choices, Strategy 2 is proven optimal, with value exactly 17/20.

Related Solutions

[Math] Toss a fair die until the cumulative sum is a perfect square-Expected Value

The only time we might consider stopping is when $k$ is nearly a perfect square. Suppose $k \approx n^2$ (more specifically, take $n^2-6\leq k<n^2$).

If we stop, our score is always at least $n^2-6$. If we try to continue past $n^2$, and then stop when we nearly reach $(n+1)^2$, the probability that we actually make it past $n^2$ is at most $\frac{5}{6}$, so the expected score is at best $\frac{5}{6}\left((n+1)^2 - 1\right)=\frac{5}{6}\left(n^2+2n\right)$. So stopping now is better than going through $(n+1)^2$ whenever $n^2-6>\frac{5}{6}(n^2+2n)$, and possibly sooner. This inequality holds whenever $n \geq 13$.

So, if $n$ is at least $13$, stopping is better than going one more round, and going multiple extra rounds isn't ever going to help matters (since the inequality never stops being true, so going one extra round is better than going two extra, which is better than going three extra, and so on). That means we're always better off stopping before we attempt to exceed $13^2=169$; that is, when $k$ is at least $163$.

Since this is finite (and not too large!) it should be straightforward if tedious to figure out the optimal strategy and expected value for all $k<163$.

Some simple Sage code (for exact rational arithmetic) to find the optimal strategy follows:

#!/usr/bin/env sage -python

from sage.all import *

evs_dict = dict([(n, QQ(n)) for n in range(163, 169)])

for n in range(162, -1, -1):
    if sqrt(n) in ZZ and n != 0:
        evs_dict[n] = 0
    else:
        evs_dict[n] = max(QQ('1/6') * sum(evs_dict[n+k] for k in range(1, 7)), n)

for n in range(0, 169):
    ev = evs_dict[n]
    print n, ev if ev in ZZ else ev.n()

It seems that it is optimal to stop when $k \geq 75$ and near a perfect square, or when $k \in \{30, 31, 43, 44, 45, 58, 59, 60, 61, 62 \}$. Also, $f(0) \approx 7.17549931276711$.

The "perfect cube" version of this problem should work similarly.

The version where you lose if you hit a power of $2$ is more interesting. You can check that even in the worst case — where you land on $2^n-6$ — the chance of successfully getting past $2^n$ is larger than $\frac{1}{2}$, and going up to $2^{n+1}$ will roughly double your score if successful. So the expected value of continuing one more round actually exceeds the expected value of stopping — and yet, the expected value of continuing forever is clearly zero! That is, there is no optimal strategy, just a succession of better and better strategies.

Infinite games are weird sometimes...

[Math] Card Matching: expected value of correctly predicted cards with partial feedback

The indicator variable is, as you suspected, for the event that a given card is guessed correctly. I'll call it $X_i$ for convenience:

For $i=1,\ldots,52$, let

$$X_i = \begin{cases} 1 & \qquad\text{if card $i$ is guessed correctly} \\ 0 & \qquad\text{otherwise} \end{cases}$$

Then, using linearity of expectation,

$$E(X) = E\left(\sum_{i=1}^{52}{X_i}\right) = \sum_{i=1}^{52}{E\left(X_i\right)} = \sum_{i=1}^{52}{P\left(X_i=1\right)}.$$

We seek $P(X_i = 1)$ now. If we denote by Y (N) a correct (incorrect) guess for any particular card then the event "$X_i=1$" can be represented by the set of all possible strings of length $i$ of Ys and Ns ending in Y.

Take such a string and consider its substrings of maximally consecutive Ns and the following Y. For example, YNNYNNNNYYNY has $5$ such substrings (because it has $5$ Ys) of lengths $1,3,5,1,2$. We can calculate the probability of this outcome, given the rules of the game, to be:

$$\frac{1}{52}\;\cdot\;\frac{50}{51}\cdot\frac{49}{50}\cdot\frac{1}{49}\;\cdot\;\frac{49}{50}\cdot\frac{48}{49}\cdot\frac{47}{48}\cdot\frac{46}{47}\cdot\frac{1}{46}\;\cdot\;\frac{1}{49}\;\cdot\;\frac{47}{48}\cdot\frac{1}{47}\;\; = \;\;\frac{1}{52}\cdot\frac{1}{51}\cdot\frac{1}{50}\cdot\frac{1}{49}\cdot\frac{1}{48}$$

$$\\$$ Hopefully, it's not hard to convince yourself of the pattern here that the probability of any given Y-N string is $\dfrac{(52-k)!}{52!}$ where $k$ is the number of Ys in the string.

Now, of all Y-N strings of length $i$ ending in Y, there are $\binom{i-1}{k-1}$ strings with exactly $k$ Ys since we need to choose $k-1$ positions from $i-1$ possibilities for all Ys but the last. And since this $k$ can range from $1$ to $i$, we have,

\begin{eqnarray*} P(X_i = 1) &=& \sum_{k=1}^{i}{\binom{i-1}{k-1} \dfrac{(52-k)!}{52!}}. \end{eqnarray*}

So, \begin{eqnarray*} E(X) &=& \sum_{i=1}^{52}{\left[ \sum_{k=1}^{i}{\binom{i-1}{k-1} \dfrac{(52-k)!}{52!}} \right]} \\ && \\ &=& \sum_{k=1}^{52}{\left[ \dfrac{(52-k)!}{52!} \sum_{i=k}^{52}{\binom{i-1}{k-1}} \right]} \qquad\text{(swapping the order of summation)} \\ && \\ &=& \sum_{k=1}^{52}{\left[ \dfrac{(52-k)!}{52!} \binom{52}{k} \right]} \qquad\text{(using identity $\sum\limits_{r=p}^{q}{\binom{r}{p}} = \binom{q+1}{p+1}$)} \\ && \\ &=& \sum_{k=1}^{52}{\dfrac{1}{k!}} \\ && \\ \end{eqnarray*}

Note now the Taylor expansion: $\;\;e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$.

So, $\;e-1 \;=\; \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots\; = \;\sum\limits_{k=1}^{\infty}{\frac{1}{k!}}$.

We can see $E(X)$ is a partial sum of this series and is extremely close to $e-1$.

Best Answer

Related Solutions

[Math] Toss a fair die until the cumulative sum is a perfect square-Expected Value

[Math] Card Matching: expected value of correctly predicted cards with partial feedback

Related Question