Efficiently sample edges from a graph in relation to its spanning tree

algorithmsgraph theoryprobability

Consider a connected, unweighted, undirected graph $G$. Let $m$ be the number of edges and $n$ be the number of nodes.

Now consider the following random process. First sample a uniformly random spanning tree of $G$ and then pick an edge from this spanning tree uniformly at random. Our process returns the edge.

If I want to sample many edges from $G$ from the probability distribution implied by this process, is there a more efficient (in terms of computational complexity) method than sampling a new random spanning tree each time?

Best Answer

While the other answer is correct, it requires the computation of $|E| + 1$ many determinants. There is a faster route when $|E|$ is large. The first thing to note is Kirchoff's theorem which states that if $T$ is a uniform spanning tree then $$P(e \in T) = \mathscr{R}(e_- \leftrightarrow e_+)$$ where $e = \{e_-, e_+\}$ and $\mathscr{R}(a \leftrightarrow b)$ is the effective resistance between $a$ and $b$ when each edge is given resistance $1$. This implies that the probability an edge is sampled in your process is $$\mathscr{R}(e_- \leftrightarrow e_+)/(|V| - 1).$$

Thus we only need to compute the effective resistance.

If we let $L$ denote the graph Laplacian and $L^+$ to be its Moore-Penrose pseudoinverse, then

$$\mathscr{R}(a \leftrightarrow b) = (L^+)_{aa} + (L^+)_{bb} - 2 (L^+)_{ab}. $$

(See this master's thesis for some nice discussion and references.)

Thus, the only computational overhead for computing the marginals is computing a single psuedoinverse. Depending on how large $|E|$ is, this may be faster than computing $|E|$ many determinants.

EDIT: some discussion on complexity

The Pseudoinverse of an $n \times n$ matrix can be done in $O(n^3)$ time. So computing $L^+$ takes $O(|V|^3)$ time. We have to compute this for $|E|$ many edges, so the above computes all marginals in $O(|E| |V|^3)$ time. Conversely, a determinant can be done in, say, $O(n^{2.3})$ time. So the other answer has complexity $O(|E|^2 |V|^{2.3}).$ Since $G$ is connected, $|E| \geq |V|-1$ and so this algorithm is always faster (asymptotically, at least).

Best Answer

Related Solutions

[Math] Showing existence of a spanning tree in a graph with two kinds of edges using $k$ of one kind of edge

Related Question