Probability – Probability of a Random Walk Crossing a Straight Line

catalan-numbersco.combinatoricspr.probabilityrandom walks

Let $(S_n)_{n=1}^{\infty}$ be a standard random walk with $S_n = \sum_{i=1}^n X_i$ and $\mathbb{P}(X_i = \pm 1) = \frac{1}{2}$. Let $\alpha \in \mathbb{R}$ be some constant. I would like to know the value of

$$\mathcal{P}(\alpha) := \mathbb{P}\left(\exists \ n \in \mathbb{N}: S_n > \alpha n\right)$$

In other words, I am interested in the probability that a random walk $(S_n)_{n=1}^{\infty}$ crosses the straight line through the origin with slope $\alpha$.

Since the standard random walk is recurrent, it follows that $\mathcal{P}(\alpha) = 1$ for $\alpha \leq 0$, while obviously $\mathcal{P}(\alpha) = 0$ for $\alpha \geq 1$. Hence the non-trivial part and the part I am interested in is the region $\alpha \in (0,1)$. For this region we know that $\mathbb{P}(S_1 > \alpha) = \frac{1}{2}$, hence $\mathcal{P}(\alpha) \geq \frac{1}{2}$, but finding an exact value seems difficult.

Note: One way to explicitly calculate $\mathcal{P}(0)$ is by

$$\mathcal{P}(0) = \sum_{n=1}^{\infty} \frac{C_n}{2^{2n-1}} = 1$$

where $C_n$ is the $n$-th Catalan number, as was pointed out on e.g. http://oeis.org/A000108 by Geoffrey Critzer. For general $\alpha$ however such a summation does not seem to give a nice expression, since the coefficients are uglier and the exponents of $2$ depend on $\alpha$ (for $\alpha = 0$ one gets exponents $2n – 1$, but for irrational $\alpha$ this exponent becomes something ugly). But finding a closed form for these coefficients for general $\alpha$ might also help solve this problem.

Edit: As it may be too much to ask for a nice formula for $\mathcal{P}(\alpha)$, I would also be very happy if someone could provide (good) bounds on or approximations of the value of $\mathcal{P}(\alpha)$. For example: Does $\mathcal{P}(\alpha)$ decrease linearly in $\alpha$? Any insight is very much appreciated!

Best Answer

I wrote a Maple program in 2003 which computes $P(\alpha)$ (both the "upper" and "lower" values) for a given rational number $\alpha$, much in line with the method described by Johan. I think I was able to compute it for all $\alpha=a/b$ for integers $1\le a\lt b\le 300$ in a couple of hours, so the program could probably compute fairly good approximations to $P(1-2/\log_2(p))$ when $p$ is not too large. (Edit: When $p$ is large, $1/2$ will be a very good approximation. For example, if $p\gt1000$ then $0.499\lt P(1-2/\log_2(p))\lt1/2$.)

I didn't publish any of this, but much of it is contained in the article "The Maximum Average Gain in a Sequence of Bernoulli Games" by Wolfgang Stadje (American Mathematical Monthly, December 2008).

EDIT: Since I don't have access to Maple anymore, I translated the program to Matlab:

function p=p(s,t);
d=gcd(s+t,2*t);
s=(s+t)/d;
t=2*t/d;
pol=zeros(1,t+1);
pol([1 s+1 t+1])=[1 -2 1];
r=roots(pol);
[rsort,i]=sort(abs(r));
p=1-prod(abs(1-r(i(1:t-s))));

The program computes $P(s/t)$ for integers $0\lt s\lt t$, by computing $1$ minus the product of $1-r$ for those zeroes $r$ of $z^{2t/d}-2z^{(t-s)/d}+1$ that have absolute value less than $1$ (there are $(t-s)/d$ of them), where $d=\gcd(s+t,2t)$.

By the way, in my notes I found the lower bound $P(\alpha)\ge A$, where $\alpha=1-\frac{2\log A}{\log(2A-1)}$, with equality for $\alpha=(k-2)/k$. I think Johan was involved in obtaining this bound.

EDIT (May 14): So, here is a fairly detailed proof of the formula used in the program above.

Lemma 1. For integers $0\lt 2b\lt a$, the polynomial $g(z)=z^a-2z^b+1$ has no multiple zeros and exactly $b$ zeroes inside the unit circle.

Proof. If $g(z)=g'(z)=0$ then $z\ne 0$ and $$0=g'(z)=az^{a-1}-2bz^{b-1}=z^{b-1}(az^{a-b}-2b),$$ so $az^{a-b}-2b=0$. Hence, $0=g(z)-g'(z)*z/a=1-2(1-b/a)z^b$ so that $|z|^b=\frac1{2(1-b/a)}=\frac a{2(a-b)}$. Also, $az^{a-b}-2b=0$ so $|z|^{a-b}=2b/a$. This implies that $$\left(\frac a{2(a-b)}\right)^{a-b}=\left(\frac{2b}a\right)^b.$$ This can be rewritten as $2(1-b/a)^{1-b/a}(b/a)^{b/a}=1$. But it is easily checked that $2(1-y)^{1-y}y^y\gt1$ for $0\lt y\lt1/2$, a contradiction.

The second part can be proved by a straight forward application of Rouché's theorem.

Lemma 2. Suppose that $\sum_{i=1}^kA_ir_i^{-j}=1$ for $1\le j\le k$. Then $\sum_{i=1}^kA_i=1-\prod_{i=1}^k(1-r_i)$.

Proof. The equations can be seen as a system of linear equations in $A_1,\dots,A_k$, and the result can be obtained by using Cramer's rule and Vandermonde determinants. I skip the details. (Actually, I think I had a simpler proof of this lemma, but I could neither find it in my notes, nor figure it out right now.)

Now, let $L=\max_{n\gt0}{S_n/n}$, so that $P(\alpha)=P(L\gt\alpha)$. Also, let $Y_i=(X_i+1)/2$ and $T_n=\sum_{i=1}^nY_i$ (so that $T_n$ is a random walk with steps $0$ and $1$ instead of $\pm 1$). (The main reason for this is that my original result was for $T_n$, but I also think that the formulae get a little simpler in this case.)

Let me restate the result I am going to prove:

Theorem. For integers $0\lt s\lt t$, $P(s/t)=1-\prod_{i=1}^{t-s}(1-r_i)$, where $r_1,\dots,r_{t-s}$ are the zeroes of $z^{2t}-2z^{(t-s)}+1$ that have absolute value less than $1$.

Proof. Let $M=\max_{n\gt0}{T_n/n}$ and $Q(\alpha)=P(M\gt\alpha)$. I will prove the corresponding result for $Q(s/t)$, and then translate it to $P$, using the fact that $T_n=(S_n+n)/2$ which implies that $P(\alpha)=Q((\alpha+1)/2)$.

Suppose then that $1/2\lt s/t\lt1$ and consider $Q(s/t)$. Just as Johan did, we define another random walk $U_n=\sum_{i=1}^nZ_i$ with steps $Z_i=s$ or $Z_i=s-t$ so that $Q(s/t)$ equals the probability that $U_n$ will ever reach $-1$, define $f(j)$ as the probablity that $U_n$ will reach $-1$ when it is currently at $j$, and find that $f(j)=f(j+s)/2+f(j+s-t)/2$ for $j\ge0$. The characteristic equation of this recursion is $g(z)=z^t-2z^{t-s}+1=0$. By Lemma 1 and since $f(j)$ tends to $0$ as $j$ tends to infinity, we must have $f(j)=\sum_{i=1}^{t-s}A_ir_i^j$, where $r_1,\dots,r_{t-s}$ are the (necessarily simple) zeroes of $g(z)$ inside the unit circle. Since $f(j)=0$ for $s-t\le j\le -1$, Lemma 2 implies that $Q(s/t)=f(0)=\sum_{i=1}^{t-s}A_i=1-\prod_{i=1}^{t-s}(1-r_i)$.

Now, if $0\lt s/t\lt1$ then $P(s/t)=Q((s+t)/(2t))$, which, by what we have just proved, equals $1-\prod_{i=1}^{t-s}(1-r_i)$, where $r_1,\dots,r_{t-s}$ are the zeroes of $z^{2t}-2z^{t-s}+1$ inside the unit circle. This concludes the proof.

Related Question