A curious variant of the classical 2D random walk: allowing duplication and vanishing

probabilityprobability theoryrandom walkrecreational-mathematicsstochastic-processes

Background. Recall the standard random walk on a 2D grid (i.e. $\mathbb{Z}^2$). A person starts at the origin. At every iteration, the person moves in one of the four directions (up, down, left, or right), each with a probability $1/4$. It is well-known that the person will at some point return to the origin with probability $1$.

We now consider the following variant. A person starts at the origin. At the first iteration, for each of the four points surrounding the person, a 'duplicate' of the original person is placed at this point with probability $1/4$; we remove the person at the origin. We do not require that there be exactly one duplicate. It can happen that there are more than one, or even that there are none altogether (in which case the simulation halts). At the following iterations, we repeat the process for each of the duplicates that we have: at each of the four points surrounding a given duplicate, a new duplicate is placed with probability $1/4$, and the original duplicate is removed.

Example. A process could go as follows:
$$\{(0,0)\} \Rightarrow \{(0,1)\} \Rightarrow \{(0,0),(0,2)\} \Rightarrow \{(0,1),(0,1)\} \Rightarrow \{(1,1)\} \Rightarrow \{\}.$$
Notice that it can happen that more than one duplicate is at the same spot, and that the process halts after the duplicate on $(1,1)$ no longer formed any new duplicates.

Question. We simulate the random walk described above. Throughout the simulation, we count the number of times that a duplicate is formed at the origin. What is the probability that at least one duplicate will return to the origin?

Follow-up question. What is the probability distribution for the number of returns to the origin, and can we derive from this the expected number of returns?

What I know.

  • I am vaguely aware that the simulation will halt (because there are no more duplicates) in a finite amount of time with probability $1$. (This is probably a well-known result for the set people working within this area, but I am not an element thereof.) So we need not worry about the number of returns being infinite.

  • The probability that the simulation halts at the first iteration is $(3/4)^4 \approx 0.32$, thus giving a trivial (and weak) upper bound for the probability of returning at least once, at about $0.68$.

  • I have run the simulation 1.000.000 times in Python. The numerical results are as follows. It states the number of simulations that yielded a given number of returns to the origin.
    \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|}
    \hline
    \text{Returns} & 0 & 1 & 2 & 3 & 4 &5&6&7&8&9&10& >10 \\ \hline
    \text{Runs} & 682919 & 143705 & 56318 & 29616 & 17856 & 11876 & 8512 & 6342 & 4898 & 3964 & 3312 & 30682 \\ \hline
    \end{array}
    Here's a plot of the first 80 values. The $x$-axis represents the number of returns; the $y$-value represents the number of runs with $x$ returns. The values at $x = 0,1$ have been cut off from the image so as to make the rest visible.
    enter image description here
    We find that the probability of returning at least once is approximately $0.317$.

  • Remarkably, there are also a bunch of extreme outliers. When I ran 100.000 runs without artificial termination, about one in a hundred simulations yielded more than a hundred returns. Two runs even yielded thousands of returns. (One yielded $8205$ returns. It lasted for $23230$ iterations, and at its peak there were $13769$ duplicates on the grid. The other gave even more: $13823$ returns after $70578$ iterations, with a peak number of $44232$ duplicates.) These outliers strongly influence the expected value. Estimates of the expected value are inconsistent. Though in the comments, user joriki gives a compelling argument in the comments that we should expect the value to be infinite.

Best Answer

I doubt that you'll find a closed form for the probability of at least one return to the origin, but we can compute it to high accuracy using the recurrence

$$ p_x=1-\prod_{n\in N(x)}\left(1-\frac{p_n}4\right)\;, $$

where $p_x$ is the probability that a duplicate at $x$ results in at least one duplicate at the origin and the product is over the four neighbours $n$ of $x$.

This Java code numerically solves this recurrence on square grids of increasing linear dimension $d$, with zero boundary conditions outside the grid considered, with the condition $p_O=1$ at the origin. The resulting approximate return probabilities $q_d$ are

\begin{array}{r|l} d&q_d\\\hline 2&0.2709071974706553\\ 4&0.31195166907949623\\ 8&0.317093534115794\\ 16&0.3175509956401311\\ 32&0.3175807088889516\\ 64&0.3175822392633748\\ 128&0.31758230687588884\\ 256&0.3175823095925555\\ 512&0.3175823096957242\\ 1024&0.31758230969951595 \end{array}

So the probability of at least one return is about $0.3175823097$, in agreement with your simulations.

Related Question