Probability – Optimal Strategy for a Coin Flipping Game

expected valueprobability

Consider a fair coin, tossed 100 times to create a sequence of $H$s and $T$s.

A participant is allowed to ask 1 yes or no question (e.g. was the first coin flip heads?), then plays a game where he tries to guess all 100 coins. The participant is awarded $\$1$ for every coin guessed correctly, and loses $\$1$ for each incorrect guess. Find and prove an optimal strategy for the player.

I have a hunch that the optimal strategy may be to ask "Were there more heads than tails?" and then, depending on the answer, proceed to guess either all $H$s or all $T$s. With this strategy, the player is guaranteed nonnegative earnings, and I believe the expected value is $$\sum_{i=0}^{50}{\binom{100}{i}\left(\frac{1}{2}\right)^{99}(100-2i)} \approx \$7.96$$

I've confirmed the expected value with a Monte-Carlo simulation in Python, but I'm having trouble proving that this is optimal.

My best attempt to translate this into more rigorous mathematics is to consider the yes/no question as a partition. Let $X$ be the set of $2^{100}$ possible sequences and $x$ be the sequence rolled. A yes/no question will always partition the set into two. Suppose that set $A$ is the set of all sequences in which the answer to our question is "yes", then the expected value of our game would be $$E[G] = \frac{|A|}{2^{100}}E[G|x\in A]\space + \left(1-\frac{|A|}{2^{100}}\right)E[G|x \notin A],$$

where G is the expected value of the game, playing with some optimal strategy. I've also made the note that given any specific set $A$, $x \in A$ implies there is an optimal (but not necessarily unique) guess. For instance, if we know that there are more heads than tails, a sequence of 100 $H$s is an optimal guess.

Best Answer

Yes, your construction is optimal! Suppose there are $N$ flips (e.g. $N=100$).

I have two proofs:

  1. A simple logical proof that gives a sufficient condition for optimality.
  2. A more involved mathematical proof that gives necessary and sufficient conditions.

Both of them involve the same setup.


Setup

Denote a guessing strategy $G = (G_{yes}, G_{no})$ where $G_{yes}, G_{no} \in \{ H, T \}^{N}$ characterizes your guess when the answer is "yes" or "no" (irrespective of the question asked). Denote the question $Q$ as a subset of the power set of $\{ H, T \}^{N}$ and the expected payoff by $S(G,Q)$.

Fix any $G$, the optimal question $Q_{G}$ is whether $G_{yes}$ leads to a strictly higher payoff than $G_{no}$? Any other question can only lead to a lower expected payoff by sometimes resulting in the worse guess. (The optimal question is unique up to how ties between the payoff of $G_{yes}$ and $G_{no}$ are included in the question. Here, $Q_{G}$ is constructed so that all ties are answered "no.")

Denote the expected payoff when the optimal question is asked by $S^{*}(G)=S(G,Q_{G})$. Notice it is much easier to choose two sequences of flips ($G$) than to choose any subset of all such flips ($Q$)!

Lemma: $S^{*}(G)$ only depends on the number of flips ($N$) and the number of flips for which $G_{yes}$ and $G_{no}$ differ ($n$).

Proof: Suppose $G_{yes}$ and $G_{no}$ differ for exactly $n$ flips, then any other guess $\hat{G}_{yes}$ and $\hat{G}_{no}$ that differs for exactly $n$ flips can be generated from $G$ by relabelling the sides of each coin flip.


Sufficient Condition

Result: Increasing the number of flips for which $G_{yes}$ and $G_{no}$ differ weakly increases the expected payoff.

Proof: If $G_{yes}$ and $G_{no}$ are the same for flip $k$, then flip $k$ is independent of the answer to question $Q_{G}$ since it doesn't change the relative payoffs. Let $\hat{G} = (\hat{G}_{yes},G_{no})$ where $\hat{G}_{yes}$ is the same as $G_{yes}$ but for flip $k$, then $\hat{G}$ gives the same expected payoff as $G$ when question $Q_{G}$ is asked, therefore: $$S^{*}(G) = S(G,Q_{G}) = S(\hat{G},Q_{G}) \leq S(\hat{G},Q_{\hat{G}}) = S^{*}(\hat{G})$$

(Note that our specific construction of $Q_{G}$ was needed to say flip $k$ is independent of the answer to question $Q_{G}$.) Therefore, a sufficient condition is that the guesses differ on every flip. This may not be necessary because increasing the number of flips for which $G_{yes}$ and $G_{no}$ differ only weakly increases the expected payoff.

Sufficient Condition: $S^{*}(G)$ is maximized if $G_{yes}$ and $G_{no}$ are different for every flip.


Necessary and Sufficient Condition

Lemma: Let $n$ denote number of coin flips for which $G_{yes}$ and $G_{no}$ differ, then:

$$S^{*}(G) = N/2 + \mathbb{E}|X_{n}-n/2|$$

Where $X_{n} \sim \text{Binomial}(n,1/2)$.

Proof: The probability of correctly guessing any coin flip where $G_{yes}$ and $G_{no}$ agree is $1/2$. Consider the $n$ coin flips where $G_{yes}$ and $G_{no}$ are different, if $X_{n}$ of these flips agree with $G_{yes}$ then $n-X_{n}$ flips agree with $G_{no}$, thus:

$$ \begin{align} S^{*}(G) &= (N-n)/2 + \mathbb{E}\left[\max(X_{n}, n-X_{n})\right] \\ &=N/2 + \mathbb{E}|X_{n}-n/2| \end{align}$$


This expression strictly increases when $n$ increases by one from even to odd but is constant when $n$ increases by one from odd to even. (Source: "A Derivation of the Mean Absolute Distance in One-Dimensional Random Walk" by Hižak and Logożar, Tehnički glasnik 2011.) This relationship between $S^{*}$ and $n$ implies the following result:

***Result: If $N$ is odd, $S^{*}(G)$ is maximized if and only if $G_{yes}$ and $G_{no}$ differ for every flip. If $N$ is even, $S^{*}(G)$ is maximized if and only if $G_{yes}$ and $G_{no}$ are the same for at most one flip.


If $N=100$

Your $G_{yes}$ is all heads and $G_{no}$ all tails, so it satisfies this property, and your question is optimal: does $G_{yes}$ give strictly higher payoff than $G_{no}$? In other words, are there strictly more than $50$ heads?

Because $N$ is even, any other pair of guesses that are the same for at most one flip would also correspond to an optimal strategy.

Related Question