[Math] Roll a dice infinite times BUT pay $1 to play – strategy

probability

I am stuck with a question related to the optimal strategy of rolling a dice. Its an extension of the problem "Roll a dice and take either the amount on the dice or re-roll (a maximum of twice)". Heres the link to the basic version and solution:
The expected payoff of a dice game

So the idea is you roll a 6 sided dice, and you can EITHER take the money shown on the dice, OR pay \$$1$ and play again. You can now re-roll as many times as you like. What is the optimal strategy and whats the fair value of the game?

I tried to use the same approach by working backwards as the basic case above, but after a few iterations, it seems to be never ending. This doesn't make sense intuitively, because after 6 rolls, you have spent \$$6$ to play and can only win a maximum of \$$6$ on your next roll, so it would never be optimal to roll more than 6 times!

Best Answer

This answer is longer than it needs to be because I wanted to make a general argument. This will work for any sided die. I could have generalized it slightly more in terms of how much you have to pay to roll again but I got lazy. Feel free to ask clarifying questions!

At every point in the game, your strategy should be irrespective of your previous rolls. To see why this is, think about it like this. Whatever you rolled previously has no effect on you right now. That is, no matter how much you have lost or won, if this current roll you are about to do is profitable, you should do it whether you have already won a million dollars or lost a trillion, because at this current moment in time, it is absolutely profitable. Your past earnings play no roll in the decision.

Our strategy will be to choose a cutoff number above which we will accept the winnings and stop. For this cutoff, we will develop a way to calculate the expected winnings and maximize them.

Let $n$ be the number of sides on the dice and $k$ be the cutoff. We will calculate the expected value of this strategy ($S_k$) as:

$$ E[S_k] = p(x \geq k)\cdot E[ x\vert x \geq k] + (1-p(x\geq k))\cdot \left(E[S_k]-1\right) $$

That is, we have agreed if our roll is greater than the cutoff we will stop and if it is less we will roll again. So if our roll is greater than $k$ (which will happen with $p(x \geq k)$), then we calculate the expected value of the values of $x \geq k$ and multiply this by the probability. If however, our roll is less than the cutoff, then we are back where we started except we have lost a dollar so the expected value is $E[S_k] - 1$.

Rearranging the equation gives us:

$$ \begin{align} E[S_k] &= p(x \geq k)\cdot E[ x\vert x \geq k] + (1-p(x\geq k))\cdot \left(E[S_k]-1\right) \\ E[S_k]\cdot p(x\geq k) &= p(x \geq k)\cdot E[ x\vert x \geq k] - (1-p(x\geq k)) \\ E[S_k] &= \frac{p(x \geq k)\cdot (1 + E[ x \vert x \geq k]) - 1}{p(x\geq k)} \\ E[S_k] &= 1 + E[ x \vert x \geq k] - \frac{1}{p(x\geq k)} \\ \end{align} $$

Let us calculate $p(x \geq k)$ and $E[x \vert \geq k]$. There are $n-k+1$ rolls greater than or equal to $k$ and since there are $n$ possible rolls, we get:

$$ p(x \geq k) = \frac{n + 1 - k}{n} $$

As for $E[x \vert x \geq k]$, the numbers $k, k+1, \ldots n$ are all greater than or equal to $k$ and occur with equal probability. So the expected value of these is just there average i.e.:

$$ E[x \vert x \geq k] = \frac{n + k}{2} $$

Combining these, we finally get:

$$ E[S_k] = 1 + \frac{n + k}{2} - \frac{n}{n + 1 - k} \\ $$

We want to maximize this function. We can do this easily by computing when the derivative is $0$:

$$ \frac{\partial}{\partial k} E[S_k] = \frac{1}{2} - \frac{n}{(n + 1 - k)^2} \\ $$

and so, we compute when this is $0$:

$$ \begin{align} (n + 1 - k)^2 - 2n &= 0\\ k^2 + k(-2 n - 2) + n^2 + 1 &= 0 \\ k &= \frac{2n + 2 \pm \sqrt{(2n+2)^2 - 4(n^2 + 1)}}{2} \\ k &= n + 1 \pm \sqrt{2n} \\ k &= n + 1 - \sqrt{2n} \end{align} $$

So, plugging in your values for $n = 6$, your final answer is $k = 3.53$ so as soon as you roll above $3$ in the game, you should stop :)

Addendum:

Just for completeness, the more general version, where you have to pay $p$ dollars to roll again is given by:

Optimum cutoff: $$ k = n + 1 - \sqrt{2np} $$

In case you wanted some evidence, here are the graphs of average money earned after stopping as soon as a roll of $x$ or higher is seen.

$n = 6$. Note how the optimal value is when $x = 3$ or $x = 4$ are the cutoffs:

enter image description here

$n = 50$. Note how max is at $51 - \sqrt{100} = 41$, exactly as the formula predicts:

enter image description here