Variance of classic 100 sided dice game

diceexpected valueprobabilityvariance

We start with the classic 100 sided dice game. You roll a fair 100 sided dice (with sides numbered 1 through 100), and get paid the number you land on, in dollars. If you are unhappy with this result, you can pay one dollar to re-roll, and you can re roll as many times as you like. Let $X$ be a random variable denoting how much money you make in one play of the game. What is $\text{Var}(X)$ under the optimal strategy (if you roll 87 or above, keep it, otherwise re-roll, regardless of what happened in the past)? Note the linked post proves that this is the optimal strategy which maximizes $\mathbb{E}[X]$.

We have $\text{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2$?. For the second term, we can compute $\mathbb{E}[X]=87+\frac{5}{14}$ using a recurrence, as shown here. How do we get the first term? Recurrence no longer works because of the squared term. We could compute it using something like
$$\mathbb{E}[X^2]=\sum_{i=-\infty}^{100}\mathbb{P}(X=i)\cdot i^2$$

but this seems messy (at least I don't see a good way to compute $\mathbb{P}(X=i)$ in general). I coded up a simulation and seems like the variance is around 60, but unsure how to compute an exact value in a clean way. Any thoughts?

Best Answer

Here's a solution based on the discussion in the comments for completeness

We have $\text{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2$. We know that $\mathbb{E}[X]^2=(87+\frac{5}{14})^2\approx 7631$ (see here for a proof of this).

To get $\mathbb{E}[X^2]$, the trick is to condition on the number of rolls. Note that we can write $X=Y_1+Y_2+\dots+Y_N$ where $Y_i$ is a random variable for the amount of money made in the $i$th roll and $N$ is a random variable for how many rolls until we stop, e.g. roll an 87 or above, according to our optimal strategy (see that link above for a proof of why this is the optimal strategy). Specifically, note that $Y_1,Y_2,\dots,Y_{N-1}$ are all deterministic given $N$ (they must all be $-1$, as they represent re-rolls where each re-roll costs one dollar). However, note $N\sim\text{Geo}(\frac{14}{100})$ and conditioned on $N=n$, $Y_N\sim \text{Unif}\{88-n,89-n,90-n,\dots,101-n\}$. Thus, we can compute (could probably be done by hand, but I just plugged into a calculator)

$$\mathbb{E}[X^2]=\sum_{n=1}^{\infty}\mathbb{E}[X^2|N=n]P_N(n)=\sum_{n=1}^{\infty}\sum_{k=88-n}^{101-n}\frac{k^2}{14}\cdot \left(\frac{14}{100}\right)\cdot\left(\frac{86}{100}\right)^{n-1}\approx 7691$$

Thus we have $$\text{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2\approx 7691-7631=60$$

This is consistent with a random sampling simulation.

Conclusion: With optimal play, we expect to make around $\$87.36$, with standard deviation around $\sqrt{60}\approx \$7.75$.