[Math] Very simple gambler’s ruin, Martingale and optional stopping theorem

stopping-times

I've been trying very hard to understand the "simple Martingale + stopping theorem" solution to Gambler's Ruin. Sorry if this is a repeat question, but I really don't understand some of the (seemingly trivial) solutions other people have posted.

Basically, I keep seeing the following happening:

"A gambler enters a casino with $a$ dollars in cash and starts playing a game where they win with probability $\frac{1}{2}$ and lose with probability $\frac{1}{2}$. The gambler plays the game repeatedly, betting 1 dollar in each round. They leave the game if they earn a net gain of $b$ dollars or they lose all of their $a$ dollars, whichever happens first. What is the probability that the gambler will win the game." (See, e.g., https://randomdeterminism.wordpress.com/2010/07/07/gamblers-ruin/)

Now the "super simple martingale solution" goes something like this:

  1. Realize that the gambler's fortune process $(X_n)_{n \ge 0}$ is a martingale. (Trivial. Understood)
  2. Realize that the game stops at an almost surely finite time $\tau$, so the optional stopping theorem can be applied; it tells me that $\mathbb{E}[X_\tau] = \mathbb{E}[X_0]$. (Also pretty clear. Understood)
  3. Let $\tau_L$ be the first time at which the gambler has lost all $a$ dollars, let $\tau_W$ be the first time at which the gambler has gained a net of $b$ dollars, and let $\tau = \min(\tau_L, \tau_W)$.
  4. Completely out of the blue, everyone goes on to claim that it's just OBVIOUS from steps 2 and 3 that $-aP(\tau=\tau_L) + bP(\tau=\tau_W) = 0$.
  5. Since $P(\tau=\tau_L) + P(\tau=\tau_W) = 1$ (not hard to see), the usual likelihood of winning, $\frac{a}{a+b}$, follows quickly.
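
For what it's worth, the claimed answer itself checks out numerically. Here is a quick Monte Carlo sketch (just my own illustration, with the assumed values $a = 3$, $b = 7$) whose estimate lands near $\frac{a}{a+b} = 0.3$, so it is only the derivation I am stuck on:

```python
import random

def estimate_win_probability(a, b, trials=100_000, seed=0):
    """Estimate P(the gambler nets +b before going down by a) in a fair $1-per-round game."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        profit = 0
        # Keep betting until the net profit hits +b (win) or -a (ruin).
        while -a < profit < b:
            profit += 1 if rng.random() < 0.5 else -1
        wins += (profit == b)
    return wins / trials

# Illustrative values: a = 3, b = 7, so a / (a + b) = 0.3.
print(estimate_win_probability(3, 7))  # typically prints something close to 0.3
```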

What I completely fail to see is how to move from steps 2 and 3 to step 4. Where is this formula $-aP(\tau=\tau_L) + bP(\tau=\tau_W) = 0$ coming from? How can this be derived from $E[X_\tau] = E[X_0]$?

As far as I can see, $E[X_0] = a$, right? That's the value we start with. How does this suddenly get expanded into the sum in step 4? The only thing that is OBVIOUS to me from the stopping theorem is that $E[X_\tau] = a$. Now, trying to expand this expected value, let's say

\begin{align}
E[X_\tau] &= E\left[a + \sum_{i=1}^\tau x_i\right] \\
&= a + \sum_{i=1}^\tau E[x_i] \\
&= a + \sum_{i=1}^\tau \bigl(-1 \cdot P(x_i=-1) + 1 \cdot P(x_i=1)\bigr) \\
&= a,
\end{align}

where $x_i = \pm 1$ denotes the outcome of the $i$-th one-dollar bet.

However, this sum could be insanely long. Where's the guarantee that this cancels out to $-aP(\tau=\tau_L) + bP(\tau=\tau_W)$? I'm clearly not seeing where things go from here.

There are variations on this where people say that the gambler wins once their bankroll reaches $N$ dollars, define $P_N(n)$ as the probability that the gambler wins given that they currently have $n$ dollars, and then go on to say it's OBVIOUS from the stopping theorem that $N \cdot P_N(n) + 0 \cdot (1-P_N(n)) = n$, which does not really help me either. Could anyone clarify this, please?

Best Answer

I think the answers you discuss are thinking of $X_n$ as representing the gambler's profit at time $n$, not his actual bankroll. So in this case, we'd have $X_0 = 0$, not $a$.
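
Concretely, under this convention you can write $X_n = \sum_{i=1}^{n} x_i$, where $x_i = \pm 1$ (each with probability $\tfrac{1}{2}$) is the outcome of the $i$-th one-dollar bet; in particular the empty sum gives $X_0 = 0$.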

The next bit is that there are two possibilities at time $\tau$: either the gambler has made his profit of $b$ dollars, in which case $X_\tau = b$, or else he has lost his stake, so his profit is $-a$ dollars and $X_\tau = -a$. These correspond to the events $\tau = \tau_W$ and $\tau = \tau_L$, respectively. Since $X_\tau$ takes only these two values, the computation of $E[X_\tau]$ is very simple: it is just, by the definition of expectation for a two-valued random variable, $$E[X_\tau] = -a P(X_\tau = -a) + b P(X_\tau = b) = - a P(\tau = \tau_L)+b P(\tau = \tau_W) .$$

The optional stopping theorem says this equals $E[X_0]$, which as we noted is $0$ in this setup, so we obtain the equation you're asking about: $$- a P(\tau = \tau_L)+b P(\tau = \tau_W) = 0 \tag{*}$$
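
For completeness, combining (*) with $P(\tau = \tau_L) + P(\tau = \tau_W) = 1$ (your step 5) recovers the familiar answer: $$-a\bigl(1 - P(\tau = \tau_W)\bigr) + b\,P(\tau = \tau_W) = 0 \quad\Longrightarrow\quad P(\tau = \tau_W) = \frac{a}{a+b}.$$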

If you want to be really formal, you can work in terms of individual outcomes $\omega$. Note that by definition, for any particular $\omega$, $\tau(\omega)$ is either $\tau_L(\omega)$ or $\tau_W(\omega)$. And by the definition of $\tau_L$, $X_{\tau_L(\omega)}(\omega) = -a$. So if $\tau(\omega) = \tau_L(\omega)$ then $X_{\tau(\omega)}(\omega) = -a$. The situation for $\tau_W$ is similar.


If you instead want $X_n$ to represent the gambler's bankroll, that's fine too; it just looks a little different. Now we do have $X_0 = a$ as you suggest. When the gambler loses his stake, we have $X_\tau = 0$; and when he wins, we have $X_\tau = a+b$. This leads, by the same logic as before, to $E[X_\tau] = (a+b) P(\tau = \tau_W) + 0 P(\tau = \tau_L)$, and optional stopping gives the equation $$(a+b) P(\tau = \tau_W) = a. \tag{**}$$ This looks different from (*), but they're actually the same: remember that $\tau$ must equal either $\tau_W$ or $\tau_L$, and therefore $P(\tau = \tau_L) = 1 - P(\tau = \tau_W)$. If you substitute this into (*) and rearrange, you'll obtain (**).
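
If it helps, here is a small numerical sanity check (my own sketch, again with the illustrative values $a = 3$, $b = 7$): simulating the stopped walk, the average terminal profit comes out close to $0$, the average terminal bankroll close to $a$, and the empirical win frequency close to $\frac{a}{a+b} = 0.3$, in line with (*) and (**).

```python
import random

def terminal_profit(a, b, rng):
    """Play one game; return the net profit when it stops (either +b or -a)."""
    profit = 0
    while -a < profit < b:
        profit += 1 if rng.random() < 0.5 else -1
    return profit

a, b, trials = 3, 7, 100_000
rng = random.Random(1)
profits = [terminal_profit(a, b, rng) for _ in range(trials)]

print(sum(profits) / trials)                  # mean terminal profit, close to 0: checks (*)
print(sum(a + p for p in profits) / trials)   # mean terminal bankroll, close to a: checks (**)
print(sum(p == b for p in profits) / trials)  # win frequency, close to a/(a+b) = 0.3
```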
