Get a $1/3$ probability from a coin with the minimal expected number of flips

probability, probability theory

The classic solution to the problem of getting an event with probability $1/3$ from a fair coin is to flip the coin twice and do the following:
$$\begin{cases} \text{return } 1, & \text{Heads-Heads} \\ \text{flip again}, &\text{Tails-Tails} \\ \text{return } 0, & \text{otherwise}\end{cases}$$
The probability of returning $1$ is indeed $1/3$.
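
For concreteness, here is a small Python sketch of this scheme (the function name and the use of `random.random()` to simulate the fair coin are my own choices, not part of the original statement):

    import random

    def third_via_two_flips():
        """The scheme above: flip twice; Heads-Heads returns 1, Tails-Tails
        starts over, any mixed result returns 0. Also reports how many
        individual coin flips were used."""
        flips = 0
        while True:
            first_heads = random.random() < 0.5
            second_heads = random.random() < 0.5
            flips += 2
            if first_heads and second_heads:           # Heads-Heads
                return 1, flips
            if not first_heads and not second_heads:   # Tails-Tails: flip again
                continue
            return 0, flips                            # mixed result

    print(third_via_two_flips())   # e.g. (0, 2) or (1, 4)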

Let $\xi$ be the number of flips taken before a value is returned.
$$\mathbb P(\xi = 2a) = \left(\frac{1}{4}\right)^{a-1}\frac{3}{4},$$ because the first $(a-1)$ "double" flips have to be "Tails-Tails", each with probability $1/4$ and each contributing $2$ coin flips, and the $a$-th "double" flip has to be anything other than "Tails-Tails".

$$\mathbb E\xi = \sum_{a=1}^\infty 2a\,\mathbb P(\xi=2a)=\sum_{a=1}^\infty 2a\left(\frac{1}{4}\right)^{a-1}\frac{3}{4} = \frac{8}{3}$$
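
As a quick numerical sanity check (a truncated partial sum of the series above, written in Python with exact rationals):

    from fractions import Fraction

    # Partial sum of the series above, with exact rationals: the a-th term is
    # 2a flips weighted by the probability (1/4)^(a-1) * (3/4) that exactly a
    # "double" flips are needed. The tail beyond a = 60 is negligible.
    partial = sum(2 * a * Fraction(1, 4) ** (a - 1) * Fraction(3, 4)
                  for a in range(1, 61))
    print(float(partial))   # 2.666..., approaching 8/3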

The question is: is it possible to solve the problem with a smaller expected number of flips?

Edit: It is possible, but what is the exact minimal expected number of flips?

Best Answer

Yes, your expectation of $\frac83$ flips can be beaten. My personal favorite is to repeatedly flip one coin until you get tails, then return success if the total number of flips is even (i.e., if the number of heads is odd). The probability of success is $\frac14$ (HT) $+\frac1{16}$ (HHHT) $+\frac1{64}$ (HHHHHT) $+\ldots=\frac13$, and the mean number of flips is just $2$.

In fact, a variant of this procedure produces a Bernoulli variable with any probability using just two flips on average: if the binary expansion of your probability $p$ is $p=\frac{p_1}{2^1}+\frac{p_2}{2^2}+\ldots$, then flip one coin until you get tails and return $p_i$, where $i$ is the total number of flips.
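
Here is a sketch of both procedures (the helper names `third_via_parity` and `bernoulli`, and the simulated fair coin via Python's `random` module, are my own):

    from fractions import Fraction
    import random

    def third_via_parity():
        """Flip a fair coin until the first tails; succeed iff the total
        number of flips is even. Also report how many flips were used."""
        flips = 0
        while True:
            flips += 1
            if random.random() < 0.5:      # tails: stop flipping
                return flips % 2 == 0, flips

    def bernoulli(p):
        """The general variant: flip until the first tails, then return the
        binary digit of p at the position given by the total number of flips.
        The expected number of flips is 2 for every p in (0, 1)."""
        x = Fraction(p)
        while True:
            x *= 2                         # expose the next binary digit of p
            digit, x = (1, x - 1) if x >= 1 else (0, x)
            if random.random() < 0.5:      # tails on this flip: stop
                return digit == 1

    # Empirical check: success rate ~1/3 and mean flips ~2 for the parity
    # method, and the general method also succeeds about a third of the time.
    n = 100_000
    results = [third_via_parity() for _ in range(n)]
    print(sum(ok for ok, _ in results) / n, sum(f for _, f in results) / n)
    print(sum(bernoulli(Fraction(1, 3)) for _ in range(n)) / n)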

Finally, can we prove that 2 is the minimal expected number of flips? In fact, we can, and not just for $\frac13$: it holds for any probability that's not a dyadic rational (i.e., not of the form $\frac a{2^n}$ for integers $a$ and $n$).

First, let me define the rules. A 'coin-flip algorithm' is a binary tree whose leaf nodes are each marked either $1$ or $0$. We start at the root of the tree; if the current node is not a leaf, we flip a coin: on heads we move to its left child, and on tails to its right child. When we reach a leaf node, we return the value marked on it. For instance, here's the tree corresponding to your 'two flips, then reflip' algorithm (where I've used T and F for the leaf nodes, rather than 1 and 0, to avoid confusion with my ASCII nodes):

      o
  +---+---+
  |       |
  o       o
+-+-+   +-+----+
|   |   |      |
T   F   F      o
           +---+---+
           |       |
           o       o
         +-+-+   +-+-+
         |   |   |   |
         T   F   F   o
                     +
                     .
                     .
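
Here is one way to make such a tree executable (a sketch with a representation of my own choosing: a leaf is a plain boolean, while a flip node is a zero-argument function returning its (heads, tails) children, which lets the tails-tails branch point back at the whole tree):

    import random

    def reflip_tree():
        """Calling this node yields the (heads, tails) children of the root
        of the tree drawn above."""
        heads = lambda: (True, False)           # HH -> T, HT -> F
        tails = lambda: (False, reflip_tree)    # TH -> F, TT -> the same tree again
        return heads, tails

    def run(node):
        """Walk a tree, flipping a fair coin at every non-leaf node."""
        flips = 0
        while callable(node):                   # leaves are plain booleans
            heads_child, tails_child = node()
            node = heads_child if random.random() < 0.5 else tails_child
            flips += 1
        return node, flips

    # Averaging many runs reproduces the 1/3 probability and the 8/3 expected flips.
    runs = [run(reflip_tree) for _ in range(100_000)]
    print(sum(v for v, _ in runs) / len(runs), sum(f for _, f in runs) / len(runs))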

Now, we can compute the expected number of flips corresponding to a tree: each 'flip node' (as opposed to leaf node) at level $i$ contributes $2^{-i}$ to the expectation, as it represents one flip, and the probability that you get there is $2^{-i}$. (Note that the first node is considered to be at level 0 for this computation, since you always get there.) For your tree, for instance, we get a contribution of $1$ from the node at the root of the tree; a contribution of $\frac12$ from each of the two flip nodes at level 1; a contribution of $\frac1{2^2}$ from the node at level 2; a contribution of $2\cdot\frac1{2^3}$ from the two nodes at level 3; etc. We can evaluate the geometric sum(s) for this, or we can note that if $E$ is the total expectation for the number of flips, then $E = 1 + 2\cdot\frac12+\frac1{2^2}E$ (since the tree underneath the tails-tails branch is isomorphic to the overall tree). In other words, $E=2+\frac E4$, or $\frac34E=2$, so $E=\frac83$.
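
Spelling out the geometric sums mentioned above (the tree has one flip node at each even level and two at each odd level):
$$E = 1 + 2\cdot\frac12 + \sum_{k=1}^{\infty}\left(\frac{1}{2^{2k}} + \frac{2}{2^{2k+1}}\right) = 2 + \sum_{k=1}^{\infty}\frac{2}{4^{k}} = 2 + \frac{2}{3} = \frac{8}{3}.$$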

Similarly, we can compute the probability of getting a particular result: it's the sum of $2^{-i}$ over all leaf nodes that output that result, where again $i$ is the level of the node in question. For your tree, you can see that the T result at level two contributes $\dfrac1{2^2}$ to the probability; the T result at level four contributes $\dfrac1{2^4}$; etc., and so the total probability of getting a positive result is $\displaystyle\sum_{i=1}^\infty \frac1{2^{2i}}=\sum_{i=1}^\infty\frac1{4^i}=\frac13$.
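
This rule is easy to mechanize for any finite tree. The sketch below uses a representation of my own (a leaf is a boolean, a flip node is a nested pair), applied to the reflip tree cut off after two rounds:

    from fractions import Fraction

    def output_probability(tree, level=0):
        """Sum 2^(-level) over the leaves marked True, as described above."""
        if isinstance(tree, bool):                 # a leaf
            return Fraction(int(tree), 2 ** level)
        heads, tails = tree                        # a flip node
        return output_probability(heads, level + 1) + output_probability(tails, level + 1)

    # The reflip tree cut off after two rounds, with the missing tails-tails
    # branch replaced by a False leaf; the result slightly undershoots 1/3.
    truncated = ((True, False), (False, ((True, False), (False, False))))
    print(output_probability(truncated))           # prints 5/16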

Now, note that this implies that any finite coin-flip algorithm can only generate dyadic probabilities, since in that case the probability will be a sum of a finite number of terms of the form $\frac1{2^i}$. So if we want to get an exact $\frac13$ probability, for instance, we have to allow for the possibility that our algorithm might run an arbitrarily long time.

Since our algorithm can't be a finite one, the tree must be infinite; there must be at least one 'flip node' at every level. But a flip node at level $i$ is always more expensive than a leaf node at that level, since it contributes $2^{-i}$ to the expected number of flips, while a leaf node contributes nothing. So an optimal algorithm will have as few non-leaf nodes as possible at each level, and since there must be at least one at every level, the expected number of flips is at least $\sum_{i=0}^\infty 2^{-i}=2$. The algorithm I outlined above has exactly one non-leaf node at each level, so it attains this bound, which proves its optimality.
