Optimal strategy/expected value of red/blue ball game

There are 4 red balls and 3 blue balls in an urn. Picking red balls gives you 1 dollar and picking blue balls means you have to pay 1 dollar. You can stop picking balls at any point.

What is the expected value of this game, given that you play rationally?

This is an interesting question. I've realized that the lower bound on the EV is 1 dollar: even if you draw all the blue balls first, you can then draw all the remaining red balls to end with a net of 1 dollar. However, I don't know how to do the casework on the other possible outcomes of drawing balls, and it seems overly complicated.

Best Answer

This is a variant of the somewhat well-known red and black card problem. The general question is: if you start with $R$ red balls and $B$ blue balls and play according to the rules specified above, what is your expected return?

The problem is difficult because of the option: at any stage you must compare the value of quitting where you stand to the value of drawing one more ball. Such optional-stopping problems are traditionally handled by backward induction. The idea is that you know exactly what to do in the final states, so you start there and work back to the start.

Toward that end, we define the possible states of the game in terms of the remaining balls. Thus the state $(r,b)$ means there are $r$ red balls and $b$ blue balls left to draw from. Being in that state means that you have previously drawn $R-r$ red balls and $B-b$ blue ones, so the intrinsic value of that state (the value you get if you quit then and there) is given by $$I(r,b)=R-r-(B-b)=(R-B)+(b-r)$$

Of course the states also have option value. Let's denote that by $O(r,b)$. We then let $V(r,b)$ denote the true value of the state, and we have $$V(r,b)=\max \left(I(r,b),\,O(r,b)\right)$$

That definition reflects the "optimal play": At any point you do whatever it takes to maximize your expected value.

Now we have to compute $O(r,b)$. If you are in the state $(r,b)$, then you can move to either $(r-1,b)$ or $(r,b-1)$ (at least if both $r,b\ge 1$). The probability of moving to $(r-1,b)$ is $\frac r{r+b}$ and the probability of moving to $(r,b-1)$ is $\frac b{r+b}$. It follows that $$\boxed {O(r,b)=\frac r{r+b}\times V(r-1,b)+\frac b{r+b}\times V(r,b-1)}$$

That is the key recursion.

In your case, for example, we have $R=4,B=3$, so $V(0,0)=I(0,0)=1$ (with no balls left, the only option is to quit). We can then use the recursion to compute $V(1,0)$, say. We have $I(1,0)=0$ and $O(1,0)=1\times V(0,0)=1$. Thus in this case your best move is to keep on drawing, and we get $$V(1,0)=1$$

Similarly, $I(0,1)=2$ and $O(0,1)=1\times V(0,0)=1$, so this time your best move is to quit and take the $2$ you currently hold. Thus $$V(0,1)=2$$

We could then compute $V(2,0),V(1,1),V(0,2)$ and so on.
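To illustrate one more step of the induction, the state $(1,1)$ (one red and one blue remaining) combines the two values just computed:

$$V(1,1)=\max\left(I(1,1),\ \tfrac 12 V(0,1)+\tfrac 12 V(1,0)\right)=\max\left(1,\ \tfrac 12\cdot 2+\tfrac 12\cdot 1\right)=\frac 32$$

so at $(1,1)$ the option value exceeds the intrinsic value and you should draw again.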

The numbers involved in your problem are so small that you can do it with pencil and paper. It's a bit tedious, and somewhat error prone, but it is not difficult. Of course the method lends itself to automation and for larger numbers (like the red and black card problem) you need to do it on a machine.

I did your problem quickly and I got $$V(4,3)=\frac {58}{35}\approx 1.657$$

which seems sensible to me but, as I mentioned, the computation is a bit error prone so I suggest checking it carefully.
