Upper bound on the commute time for adjacent nodes

Tags: expected-value, markov-chains, markov-process, upper-lower-bounds

I have read about a lemma (6.5 in "Randomized Algorithms" by Motwani and Raghavan) that places an upper bound on the commute time between two adjacent nodes $u$ and $v$ in the ergodic Markov chain associated with a finite undirected graph $G=(V, E)$, $|E|=m$. I have looked at the provided proof, but I do not seem to be able to grasp it fully.

Specifically, the lemma proves that:

$h_{uv} + h_{vu} \le 2m\ \ \text{if}\ (u,v)\in E$ (i.e. when $u$ and $v$ are adjacent) where $h_{uv}$ is the hitting time between $u$ and $v$, i.e. the mean first passage time from $u$ to $v$.

The proof proceeds as follows:

  • define a new Markov chain whose states carry the information about the edge being traversed in a random walk and the direction of that traversal. Hence we use ordered pairs, with $(u,v) \neq (v,u)$, as our states, and we have $2m$ of them (each undirected edge can be seen as a pair of directed edges with opposite directions);
  • observe that the transition matrix $Q$ of this new chain has non-zero entries of the form $Q_{(u,v)(v,w)} = P_{vw} = 1/d(v)$ where $d(.)$ is the degree function in the undirected graph $G$;
  • observe that $Q$ is doubly stochastic and hence the unique stationary probability vector would have all the entries equal to $1/2m$;
  • by the Fundamental Theorem of Markov Chains, the expected return time to a state equals the reciprocal of its stationary probability, so $h_{(v,u)(v,u)} = \frac{1}{\pi_{(v,u)}} = 2m$;
  • now observe that the commute time between $u$ and $v$ is associated with the expected return time $h_{(v,u)(v,u)}$ (pay attention to the position of v and u) to derive the upper bound, as explained in the following extract (preserved to avoid introducing interpretative errors)

*(extract from the book's proof, included as an image in the original post)*
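The construction in the bullet points above can be sanity-checked numerically. Here is a minimal sketch (the triangle graph and all variable names are my own illustration, not from the book) that builds the directed-edge chain $Q$ for a small graph and verifies that it is doubly stochastic with uniform stationary distribution $1/2m$:

```python
import numpy as np

# Illustrative small graph: a triangle on vertices 0, 1, 2 (m = 3 edges).
edges = [(0, 1), (1, 2), (2, 0)]
adj = {0: [], 1: [], 2: []}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

# Directed-edge states: each undirected edge yields two states, so 2m in total.
states = [(a, b) for a, b in edges] + [(b, a) for a, b in edges]
idx = {s: i for i, s in enumerate(states)}
n = len(states)  # n = 2m

# Q[(u,v),(v,w)] = 1/d(v): having just traversed (u,v), the walk leaves v
# along a uniformly chosen incident edge.
Q = np.zeros((n, n))
for (u, v) in states:
    for w in adj[v]:
        Q[idx[(u, v)], idx[(v, w)]] = 1.0 / len(adj[v])

# Rows and columns both sum to 1, i.e. Q is doubly stochastic ...
assert np.allclose(Q.sum(axis=0), 1) and np.allclose(Q.sum(axis=1), 1)
# ... hence the uniform vector (1/2m, ..., 1/2m) is stationary.
pi = np.full(n, 1.0 / n)
assert np.allclose(pi @ Q, pi)
print("2m =", n)  # prints: 2m = 6
```

The column-sum check is exactly the double-stochasticity observation from the proof: each directed edge $(v,w)$ is entered from the $d(v)$ states $(u,v)$, each with probability $1/d(v)$.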

I understood why we can drop the conditioning thanks to the memoryless property: since we start from $u$, we can ignore the information about how we got there.

However, I did not get why we can claim that $2m$ is an upper bound on the commute time. To me, $h_{(v,u)(v,u)}$ represents the expected length of a closed walk from $u$ passing through $v$, but with the additional constraint that the final step must be the edge $(v,u)$ itself. That looks like a "lucky case" for the return leg $h_{vu}$, which in general could be much longer, impacting the expected number of steps needed. Consequently, following the proof's line of reasoning, my intuition would lead me to say this is a lower bound rather than an upper one.

I know that my intuition is wrong: for example, in a clique of $n$ nodes the expected commute time between adjacent nodes is $2(n-1)$, which is lower than twice the number of edges, $2m = n(n-1)$. However, I cannot see where I am falling into error. Could you help me understand what aspect of the proof I am misunderstanding? Any hint would be much appreciated.
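The clique figures can be checked empirically. The following is a Monte Carlo sketch (graph size, seed, and function names are my own choices for illustration) that estimates the commute time between two adjacent vertices of $K_5$, where $2(n-1) = 8$ while $2m = n(n-1) = 20$:

```python
import random

def commute_steps(n, u=0, v=1, rng=random):
    """Steps of one random walk on the clique K_n from u to v and back to u."""
    steps, cur, target = 0, u, v
    while True:
        # In K_n every other vertex is a neighbor, chosen uniformly.
        cur = rng.choice([x for x in range(n) if x != cur])
        steps += 1
        if cur == target:
            if target == u:   # completed the return leg: commute done
                return steps
            target = u        # reached v, now walk back to u

random.seed(0)
n, trials = 5, 20000
est = sum(commute_steps(n) for _ in range(trials)) / trials
print(est)  # close to 2(n-1) = 8, comfortably below 2m = 20
```

Each leg of the commute in $K_n$ is geometric with success probability $1/(n-1)$, which is why the exact commute time is $2(n-1)$.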

Best Answer

I think my issue was trying to reason about the number of all possible paths, which is not really relevant for the expected value. Indeed, we may have both longer and shorter paths relative to the more constrained walk, but we get no information on their distribution, and hence cannot draw any useful conclusion about the expected value. Instead, looking at the paths shorter than some threshold and thinking about the probabilities of their occurrence led me to an interpretation that seems to make sense.

Call $X$ the random variable counting the number of steps needed to perform the commute between $u$ and $v$, for some $u$ and $v$ with $(u,v)\in E$; call this "option 1" (so $E(X) = h_{uv} + h_{vu}$). Let $Y$ be the random variable counting the steps of a walk that goes from $u$ to $v$ (without visiting $v$ in between) and then takes the edge $(v,u)$ to get back to $u$ ("option 2"). Then $P(X \le k) \ge P(Y \le k)$ for every $k\in \mathbb{N}$, since "option 1" is less restrictive: any walk completing option 2 within $k$ steps has also completed option 1 within $k$ steps. Consequently $P(X > k) = 1 - P(X \le k) \le 1 - P(Y \le k) = P(Y > k)$, and by the tail-sum formula for nonnegative integer random variables $$E(X)=\sum_{k=0}^{+\infty}P(X>k) \le \sum_{k=0}^{+\infty}P(Y>k) = E(Y) = 2m,$$ which yields the expected result $E(X) \le 2m$.
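The stochastic-dominance argument above can be illustrated by simulating both options on a small graph. This is a sketch under my own assumptions (a 4-cycle, so $2m = 8$, with adjacent vertices $u=0$, $v=1$; commute time on $C_4$ is $6$):

```python
import random

# Assumed example graph: the 4-cycle C_4, adjacency as a dict.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
u, v = 0, 1

def sample_X(rng):
    """Option 1: steps from u to v and back to u by any path."""
    steps, cur, target = 0, u, v
    while True:
        cur = rng.choice(adj[cur])
        steps += 1
        if cur == target:
            if target == u:
                return steps
            target = u

def sample_Y(rng):
    """Option 2: steps until the walk from u first traverses the
    directed edge (v, u), i.e. returns to u directly from v."""
    steps, prev, cur = 0, None, u
    while True:
        prev, cur = cur, rng.choice(adj[cur])
        steps += 1
        if prev == v and cur == u:
            return steps

rng = random.Random(1)
trials = 20000
ex = sum(sample_X(rng) for _ in range(trials)) / trials
ey = sum(sample_Y(rng) for _ in range(trials)) / trials
print(ex, ey)  # ex <= ey, with ey close to 2m = 8
```

By memorylessness, starting the walk fresh at $u$ is equivalent to starting the edge chain in state $(v,u)$, so $E(Y)$ matches the expected return time $2m$ from the proof.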

Note: this is a self-validated answer; if you think something is missing or wrong, just let me know with a comment.
