[Math] Probability of an edge appearing in a spanning tree

graph theorypr.probability

Hi guys, let's say I have a connected, undirected graph with many nodes. I am interested in finding the probability that an edge appears in any spanning tree of the graph. I could apply some of the suggestions given in Number of spanning trees which contain a given edge but the problem is each spanning tree of the graph is not assigned an equal probability.

For the original graph structure, we assign a non-zero weight to every edge. For each spanning tree, we take the product $P_i$ of all the weights of the edges present in that tree. The probability of this spanning tree is then computed as $\frac{P_i}{Z}$ where $Z = \sum_i P_i$ is the normalization term. $Z$ has a closed form expression in terms of the determinant (cf. http://www.ri.cmu.edu/pub_files/pub2/meila_marina_2000_1/meila_marina_2000_1.pdf).

I have tried to enumerate all spanning trees of the graph and check if the edge appears in the spanning tree and sum all the probabilities of the relevant spanning trees. Unfortunately, MATLAB cannot deal with a large number of data along the process (10^32 spanning trees in my case). I was wondering if anyone has a more efficient way of finding the probability of an edge appearing in a spanning tree. Perhaps some closed-form solution?

Any insights?

Best Answer

Qiaochu gave the gist of the solution, but here is an observation and a new question. Suppose $e$ is the edge from $i$ to $j$ ($i\lt j$) and has weight $w$. Let $A$ be the Laplace/Kirchoff matrix with row $j$ and column $j$ struck out. Then $|A|$ is the (weighted) number of spanning trees, as we know. Deleting the edge $e$ just subtracts $w$ from the $i$-th diagonal entry of $A$; call that matrix $A_e$ and note that $|A_e|$ is the number of spanning trees that avoid $e$. So $P_e = |A_e|/|A|$ is the probability that a random spanning tree avoids $e$. However, $A_e$ differs from $A$ only in the $i$-th column, so using Cramer's rule backwards we find that $P_e$ is the $i$-th entry of $\boldsymbol x$, where $A\boldsymbol{x}=\boldsymbol{b}$ and $\boldsymbol{b}$ is the $i$-th column of $A$ with $w$ subtracted from the $i$-th entry.

In other words, we can forget determinants and find $P_e$ by solving one linear equation.

The extra question is whether additional probabilities (e.g., the probability of avoiding a given set of edges) can also be found with a single linear equation solution. Inclusion-exclusion gets exponential pretty quick...

Related Question