[Math] Correspondence between spanning trees and bases of the column space of the incidence matrix

graph theorylinear algebra

I am struggling with a problem found in the Graph Theory textbook by Bondy and Murty. The goal is to prove the following result:

Let $G$ be a connected (undirected) graph, and let $M$ be its incidence matrix.

Show that there is a one-to-one correspondence between the bases of the column space of $M$ over $GF(2)$ and the spanning trees of $G$.

This result does not seem right to me, because the two sets in questions don't even have the same cardinality. For instance, if $G$ is a tree on $n$ vertices, it has only one spanning tree, whereas its incidence matrix has dimension $n-1$ over $GF(2)$, and therefore has many bases.

What am I not understanding?


Notations:

  • $GF(2)$ is the field with 2 elements
  • if $G$ is a graph with $n$ vertices and $m$ edges, its incidence matrix is the matrix $M$ with $n$ rows and $m$ columns, with a 1 in cell $(v, e)$ if and only if edge $e$ is a link incident to vertex $v$.

Best Answer

I think the key here that the problem is not making clear is that we should be looking for a correspondence between spanning trees of $G$ and bases of the column space of $M$ consisting of columns of $M$.

Let $C$ be the set of columns of $M$ and $E=E(G)$ be the edges of $G$. There is a natural bijection $f:C\to E$ that sends the $k$-th column to the $k$-th edge (or rather sends the column corresponding to $e$ to $e$). This function $f$ induces a bijection $g:\mathcal{P}(C)\to S$ where $\mathcal{P}$ denotes powerset and $S=\{G'\subseteq G\mid V(G')=V(G)\}$ is the collection of all subgraphs of $G$ that have the same vertex set as $G$. Now I claim that $A\in\mathcal{P}(C)$ is linearly independent if and only if $g(A)$ is a forest. It's easier, though, to see that $A$ has a linear dependence if and only if $g(A)$ has a cycle. Given $A\in\mathcal{P}(C)$ with minimal linear dependence $c_1+c_2+\cdots+c_k=0$ (we're working over over $GF(2)$ so we can assume all our coefficients are 1) we see that $f(c_1)$, $f(c_2)$, $\ldots$, $f(c_k)$ form a cycle in $g(A)$. The opposite is also true. Given a graph $H\in S$ with a cycle. We can pull that cycle back to a linear dependence in $g^{-1}(H)$. Finally we just need to say that a maximal linearly independent set is a basis and a maximal forest is a spanning tree to see that $g$ gives a correspondence between bases of the column space of $M$ consisting of columns of $M$ and spanning trees of $G$.

It's good to note that this proof would still essentially work over other coefficient rings. Notably it works with integer coefficient given that you change bases to maximal independent sets and you associate an (arbitrary) orientation to each edge. For the homologically inclined reader I'll also note that the benefit to working with $GF(2)$ coefficients is just that it gives you a very natural correspondence between 1-chains of $G$ (thought of as a 1-dimensional complex) and subgraphs of $G$. And finally that the matrix $M$ is simply the boundary matrix from 1-chains to 0-chains.