[Math] The mean square distance of a random walk from the origin

pr.probabilityrandom walksreference-request

I'm wondering whether the following type of problem is a standard one that has been studied by probabilists. The particular case needed (as a lemma that would help with a Polymath project) isn't quite what I'm asking here, but I'll ask a simpler case, since I don't think the case we actually need is significantly different.

Take a standard $n$-step random walk in $\mathbb Z$ (so it starts at the origin and at each step goes right with probability 1/2 and left with probablity 1/2). Let $X_m$ be where the walk has reached after $m$ steps. I am interested in the quantity $\sum_m X_m^2$. The distribution of $X_m$ is roughly normal with standard deviation of order $\sqrt m$, from which it follows easily that the expected value of this quantity has order $n^2$. I would like to prove that the probability that $\sum_mX_m^2\leq c^2n^2$ is, for suitably small $c$, extremely small. For the application I have in mind, it would be enough if "extremely small" meant "with probability at most $n^{-10}$" or something like that, and I'm prepared for $c$ to depend on $n$, as long as it doesn't tend to zero too quickly.

In fact, I think it is necessary for $c$ to tend to zero, because I think that for any fixed $c$ the probability that the $n$-step random walk is always between $-c\sqrt n$ and $c\sqrt n$ is positive. (I wouldn't mind a reference for that actually.) But a heuristic argument that I won't give here suggests that that positive probability depends exponentially on $c$, so I would expect to be OK once $c\ll 1/\log n$.

Best Answer

Let us divide the (time) interval $[0,n]$ into $n/t$ subintervals of length $t$. Let us call the $k$th interval good, if, during that interval, the random walk spends time at least $t/5$ to the left of $x_k-\sqrt{t}$ and at least $t/5$ to the right of $x_k+\sqrt{t}$ (here, $x_k=X_{kn/t}$ is the position of the walk in the beginning of the $k$th time interval). Clearly, the events ($\{k\mbox{th interval is good}\}$, $k=1,\ldots,n/t$) are independent, and there is $\alpha>0$ such that the probability that an interval is good is at least $\alpha$, uniformly in $t$ (you can prove this formally by e.g. comparing the random walk to the Brownian motion).

Next, a good interval contributes an amount at least $\frac{t}{5}\times (\sqrt{t})^2 = \frac{t^2}{5}$ to the sum. Moreover, a Chernoff's bound for the Binomial distribution implies that, with probability at least $1-\exp(-c' \frac{n}{t})$, at least $\frac{\alpha}{2}\times\frac{n}{t}$ intervals will be good. So, at least with the above probability, the sum will be at least $\frac{\alpha}{2}\times\frac{n}{t}\times \frac{t^2}{5} = \frac{\alpha t}{10n}n^2$. Now, take e.g. $t=n/\log^2n$ (or $t=n/(C\log n)$ for large $C$) to get a statement you're aiming at.

Related Question