[Math] probability distribution of hitting nodes on a finite graph random walk

graph theorymarkov chainsrandom walks

Consider a finite, undirected, scale-free graph $\{G}$, with uniform edge weights. We define a truncated random walk on $\{G}$ as a random walk that continues for exactly $\{k}$ steps. For an arbitrary truncated random walk of $\{k}$ steps (i.e. beginning at an arbitrary node), what is the probability that a given node $\{n}$ in the graph is hit by the random walk?

Is this determined, as is my intuition, by $\{k}$ and the degree of $\{n}$? Has this been proved? Does the fact that this is a scale-free graph make the question of determining the probability that a node is hit by a random walk more or less difficult?

Edited to add: By scale-free, I mean that the degree distribution of the graph follows a power law. This characteristic is typical of "real-world" graphs. See, for example, scale-free network at Wolfram.

Best Answer

Section IX.3 of "Modern Graph Theory" by Béla Bollobás considers hitting and commute times on a simple undirected graph $G \triangleq (V, E)$ with $\lvert V \rvert = n$ and $\lvert E \rvert = m$. He works with "simple random walks" (SRWs), where the transition probability $P_{xy}$ is $\frac{1}{d(x)}$ if $xy \in E$ and 0 otherwise. He defines $\pi$ to be the stationary distribution $\pi_{x} \triangleq d(x)/2m$ and also fixes an arbitrary distribution $p$ over $X_0$, the starting node of the SRWs under consideration.

On page 311 he defines $S_k(i)$ as the number of times node $i$ is hit in the first $k$ steps of a given SRW. He shows

Theorem 15 We have $\lim_{k \to \infty} \mathrm{E}[S_k(x)/k] = d(x)/2m$, and $(S_k(x)/k)_{x \in V}$ tends to $\mathbf{\pi}$ in probability as $k \to \infty$.

So the answer above is more than an upper bound.

If we consider fixing the starting node $x_0$, then a natural question to ask is how quickly $(S_k(x)/k)_{x \in V}$ tends to the stationary distribution (how quickly you get "lost"). That notion is captured by the mixing time of the SRWs on $G$, discussed in section IX.4 of the same book. It's defined in terms of $\lVert p_t - \pi \rVert$, where $p_t$ is the distribution over $V$ of $x_t$, the node hit on step $t$. It turns out the mixing rate can be calculated in terms of the eigenvalues of the transition matrix $P$. If $\lambda_1 \geq \lambda_2 \geq \dots \lambda_n$ are the eigenvalues of $P$, then the mixing rate is $\max(\lambda_2, \lvert \lambda_n \rvert)$.

With respect to scale-free graphs, a reasonable approach would be to understand their spectra then work out what value of $k$ you want to guarantee you'll stop your walks before getting too lost. I don't know whether that's easy, hard, or done.

Related Question