Wald's equation example controversy

markov-chains, probability-theory, stochastic-processes, stopping-times

I'm trying to get a grip on Wald's equation by applying it to the following example.

Suppose we have a simple sequence of independent fair coin flips, where heads wins us a dollar and tails loses us a dollar: $$\mathbb{P}(X_i=1)=\frac{1}{2}, \qquad \mathbb{P}(X_i=-1)=\frac{1}{2}$$

Suppose we plan to gamble, tossing the coin until we have won 3 dollars; that is our condition on the stopping time $N$:

$$S_N = \sum_{i=1}^{N}X_i = 3$$

According to Wald's equation, $$E(S_N) = E(X_i) \cdot E(N)$$
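As a sanity check, Wald's identity is easy to verify numerically in a setting where its hypotheses do hold, e.g. with the stopping time capped at 100 flips so that $E(N)$ is finite. (A minimal simulation sketch; the function name and the cap are my own choices, not part of the problem.)

```python
import random

def run_once(target=3, cap=100):
    """Flip a fair coin until the sum hits `target` or `cap` flips elapse.
    Returns (fortune when stopped, number of flips)."""
    s = 0
    for n in range(1, cap + 1):
        s += 1 if random.random() < 0.5 else -1
        if s == target:
            break
    return s, n

random.seed(1)
trials = 100_000
sum_s = sum_n = 0
for _ in range(trials):
    s, n = run_once()
    sum_s += s
    sum_n += n
print(f"E(S_N) ~ {sum_s / trials:+.3f}")  # near 0 = E(X_1) * E(N), as Wald predicts
print(f"E(N)   ~ {sum_n / trials:.1f}")
```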

As we know, the expectation of our fortune at the stopping time is $E(S_N) = 3$, and the expectation of a single fair coin flip is zero: $E(X_i) = 0$. So I thought the expectation of the stopping time, $E(N)$, should be infinite. But seemingly it isn't.
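One way to probe this by simulation (a minimal sketch; `capped_hitting_time` is a name I made up): if $E(N)$ were finite, the truncated mean $E[\min(N, c)]$ would level off as the cap $c$ grows; instead it keeps climbing.

```python
import random

def capped_hitting_time(target=3, cap=10_000):
    """Number of flips until the running sum first hits `target`, truncated at `cap`."""
    s = 0
    for n in range(1, cap + 1):
        s += 1 if random.random() < 0.5 else -1
        if s == target:
            return n
    return cap

random.seed(0)
for cap in (100, 1_000, 10_000):
    mean = sum(capped_hitting_time(cap=cap) for _ in range(1_000)) / 1_000
    print(f"cap={cap:>6}: sample mean of min(N, cap) ~ {mean:7.1f}")
```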

Our process is described by the following Markov chain:

$$
\begin{pmatrix}
\mathbb{P}(S_{i+1}=2) \\
\mathbb{P}(S_{i+1}=1) \\
\mathbb{P}(S_{i+1}=0) \\
\mathbb{P}(S_{i+1}=-1) \\
\mathbb{P}(S_{i+1}=-2) \\
\mathbb{P}(S_{i+1}=-3) \\
\dots
\end{pmatrix}
=
\begin{pmatrix}
0 & 0.5 & 0 & 0 & 0 & 0 & \dots\\
0.5 & 0 & 0.5 & 0 & 0 & 0 & \dots\\
0 & 0.5 & 0 & 0.5 & 0 & 0 & \dots\\
0 & 0 & 0.5 & 0 & 0.5 & 0 & \dots\\
0 & 0 & 0 & 0.5 & 0 & 0.5 & \dots\\
0 & 0 & 0 & 0 & 0.5 & 0 & \dots\\
0 & 0 & 0 & 0 & 0 & 0.5 & \dots\\
\dots & \dots & \dots & \dots & \dots & \dots & \dots
\end{pmatrix} \cdot
\begin{pmatrix}
\mathbb{P}(S_i=2) \\
\mathbb{P}(S_i=1) \\
\mathbb{P}(S_i=0) \\
\mathbb{P}(S_i=-1) \\
\mathbb{P}(S_i=-2) \\
\mathbb{P}(S_i=-3) \\
\dots
\end{pmatrix}
$$

Here the state vector is infinite (states $x \in \mathbb{Z}$ with $x \le 2$), but we can impose a supposedly highly improbable lower bound (say, $-20$ dollars), and the resulting $23 \times 23$ matrix should approximate our process well. We then calculate the eigenvalues and eigenvectors of that matrix to obtain the expected stopping time.
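For concreteness, the truncated computation might look like this (a sketch in Python with numpy; the state ordering and variable names are my own):

```python
import numpy as np

LOW, HIGH = -20, 2
states = list(range(HIGH, LOW - 1, -1))      # [2, 1, 0, ..., -20], 23 states
n = len(states)
P = np.zeros((n, n))
for j, s in enumerate(states):
    if s + 1 <= HIGH:
        P[states.index(s + 1), j] = 0.5      # step up; mass entering state 3 leaks out
    if s - 1 >= LOW:
        P[states.index(s - 1), j] = 0.5      # step down; truncated below LOW

eigvals, eigvecs = np.linalg.eigh(P)         # P is symmetric, so the spectrum is real
lam = eigvals[-1]                            # dominant eigenvalue, just below 1
print(f"lambda = {lam:.6f}")                 # ~0.9914 for the 23x23 truncation
```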

The probability that our fortune is, say, in state $-1$ dollar at step $i$ is approximated (for large $i$) by $$\mathbb{P}(S_i = -1) \approx C \cdot \lambda^i \cdot V(-1),$$ where $\lambda$ is the dominant eigenvalue, $C$ is its coefficient in the eigendecomposition of the initial distribution, and $V(-1)$ is the coordinate of the dominant eigenvector corresponding to $\mathbb{P}(S_i = -1)$.

The probability of stopping at time $i$ is $\mathbb{P}(S_{i}=3) = \mathbb{P}(S_{i-1}=2) \cdot 0.5$, so the expectation of the stopping time is

$$E(N) \approx (1 + 2 \cdot \lambda + 3 \cdot \lambda^2 + 4 \cdot \lambda^3 + \dots) \cdot C \cdot V(2) \cdot 0.5 = \frac{d(\lambda + \lambda^2 + \lambda^3 + \dots)}{d\lambda} \cdot C \cdot V(2) \cdot 0.5 = \frac{d(\frac{\lambda}{1-\lambda})}{d\lambda} \cdot C \cdot V(2) \cdot 0.5 = \frac{1}{(1-\lambda)^2} \cdot C \cdot V(2) \cdot 0.5 < \infty$$
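Evaluating this formula for deeper and deeper truncations already hints at the problem: the "finite" value grows roughly linearly with the truncation depth instead of converging. (A sketch under the same assumptions as the matrix code above; `truncated_EN` is my own name, and $C$ is read off by expanding the start-at-0 distribution in the orthonormal eigenvectors.)

```python
import numpy as np

def truncated_EN(low, high=2):
    """E(N) from the dominant-eigenvalue formula for a walk truncated below `low`."""
    states = list(range(high, low - 1, -1))
    n = len(states)
    P = np.zeros((n, n))
    for j, s in enumerate(states):
        if s + 1 <= high:
            P[states.index(s + 1), j] = 0.5
        if s - 1 >= low:
            P[states.index(s - 1), j] = 0.5
    eigvals, eigvecs = np.linalg.eigh(P)
    lam, v = eigvals[-1], eigvecs[:, -1]
    C = v[states.index(0)]                   # coefficient of the start-at-0 distribution
    return 0.5 * C * v[states.index(2)] / (1 - lam) ** 2

for low in (-10, -20, -40, -80):
    print(f"lower bound {low:>4}: E(N) ~ {truncated_EN(low):8.1f}")
```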

So $E(S_N) = E(X_i) \cdot E(N)$ would mean $3 = 0 \cdot E(N)$ with $E(N)$ finite, which is a contradiction. Can you see what's wrong?

Best Answer

Essentially, your problem is the assumption that a large lower bound is "highly improbable". In fact, no matter how low you set your lower bound, the random walk will almost surely (with probability $1$) hit either your lower bound or your upper bound, and the probability that it hits the lower bound first does not go to zero very fast.

Suppose the distance between the bounds is some large $n$, and you start at a fixed distance $d$ from the upper bound. The probability of hitting the lower bound first satisfies
$$
P_{\text{low}}(d)=\frac{1}{2}P_{\text{low}}(d+1)+\frac{1}{2}P_{\text{low}}(d-1),
$$
with boundary conditions $P_{\text{low}}(0)=0$ and $P_{\text{low}}(n)=1$. The solution is just $P_{\text{low}}(d)=d/n$; in particular, $P_{\text{low}}(n/2)=1/2$, which was already obvious by symmetry.

We can use this to prove that the expected stopping time is infinite. Imagine starting at $x=0$, and write $T_n$ for the expected time to first reach a point $n$ steps above the start (using $T$ to avoid a clash with the fortune $S$ above). The probability of reaching $x=n$ (the stopping position) before $x=-n$ is $1/2$, and reaching either of $x=\pm n$ takes at least $n$ steps, so
$$
T_{n} \ge \frac{n}{2} + \frac{n + T_{2n}}{2} = n + \frac{T_{2n}}{2}
$$
(if we hit $x=-n$ first, we still have to climb a distance $2n$). But this can be repeated: $T_{2n}/2 \ge n + T_{4n}/4$, then $T_{4n}/4 \ge n + T_{8n}/8$, and so on. Terminating after $k$ applications gives $T_{n}\ge k n$; since this holds for every $k$, we conclude that $T_{n}=\infty$.
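The ruin probability $P_{\text{low}}(d)=d/n$ is easy to check by simulation (a quick Monte Carlo sketch; the function name is mine):

```python
import random

def hits_lower_first(d, n, rng):
    """Walk from distance d below the upper bound; True if the lower bound (distance n) is hit first."""
    x = d
    while 0 < x < n:
        x += 1 if rng.random() < 0.5 else -1
    return x == n

rng = random.Random(0)
n = 20
for d in (1, 5, 10, 15):
    trials = 20_000
    p = sum(hits_lower_first(d, n, rng) for _ in range(trials)) / trials
    print(f"d={d:>2}: simulated {p:.3f}   exact d/n = {d/n:.3f}")
```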