[Math] Why is this coin-flipping probability problem unsolved

binomial distributionprobabilityprobability theorysimulation

You play a game flipping a fair coin. You may stop after any trial, at which point you are paid in dollars
the percentage of heads flipped. So if on the first trial you flip a head, you should stop and earn \$100
because you have 100% heads. If you flip a tail then a head, you could either stop and earn \$50,
or continue on, hoping the ratio will exceed 1/2. This second strategy is superior.

A paper by Medina and Zeilberger (arXiv:0907.0032v2 [math.PR]) says that it is an unsolved
problem to determine if it is better to continue or stop after you have flipped 5 heads in 8 trials: accept \$62.50 or hope for more. It is easy to simulate this problem and it is clear
from even limited experimental data that it is better to continue (perhaps more than 70% chance you'll improve over \$62.50):

alt text

My question is basically: Why is this difficult to prove? Presumably it is not that difficult
to write out an expression for the expectation of exceeding 5/8 in terms of the cumulative binomial distribution.


(5 Dec 2013).
A paper on this topic was just published:

Olle Häggström, Johan Wästlund.
"Rigorous computer analysis of the Chow-Robbins game."
(pre-journal arXiv link).
The American Mathematical Monthly, Vol. 120, No. 10, December 2013.
(Jstor link).
From the Abstract:

"In particular, we confirm that with 5 heads and 3 tails, stopping is optimal."

Best Answer

I accept Qiaochu's answer "Have you tried actually doing that?" I did try now, and now I can appreciate the challenge. :-) The paper I cited refers to another by Chow and Robbins from 1965 that has a beautiful formulation, much clearer than the cummulative binomials with which I struggled. Let me explain it, because it is really cool.

For the natural strategy I mentioned in the comments (and echoed by Raynos), let $f(n,h,t)$ be the expected payoff if you start with $h$ heads and $t$ tails, and let the game continue no more than $n$ further trials. Then there is an easy recursive formulation for $f$: $$f(n,h,t) = \max \left( \frac{1}{2} f(n,h+1,t) + \frac{1}{2} f(n,h,t+1) , h/(h+t) \right) $$ because you have an equal chance of increasing to $h+1$ heads or to $t+1$ tails on the next flip if you continue, and you get the current ratio if you stop. Now, when $h+t=n$, you need to make a "boundary" assumption. Assuming the law of large numbers for large $n$ leads to the reasonable value $\max ( 1/2, h/n )$ in this case. So now all you need to do is fill out the table for all $h$ and $t$ values up to $n=h+t$. The real answer is the limit when $n \rightarrow \infty$, but using large $n$ approximates this limit.

After the Medina and Zeilberger paper was released, in fact just about three weeks ago, a very careful calculation using the above recursive formulation was made by Julian Wiseman and reported on this web page. The conclusion is to me remarkable: "Choosing to continue lowers the expected payoff [from 0.625] to 0.62361957757." This is still not a proof, but the "answer" is now known. So my "it is clear from even limited experimental data that" was completely wrong! I am delighted to learn from my mistake.

Related Question