[Math] Generalizations of the Birkhoff-von Neumann Theorem

co.combinatoricsconvex-polytopesconvexity

The famous Birkhoff-von Neumann theorem asserts that every doubly stochastic matrix can be written as a convex combination of permutation matrices.

The question is to point out different generalizations of this theorem, different "non-generalizations" namely cases where an expected generalization is false, and to briefly describe the context of these generalizations.

A related MO question: Sampling from the Birkhoff polytope

Best Answer

I am cheating a little to give this answer, because I am fairly sure that it is part of Gil's motivation in asking the question. The most natural generalization of the Birkhoff hypothesis to quantum probability is only true for qubits. (It might also be true for a qubit tensor a classical system; I did not check that case.)

A quantum measurable space is a von Neumann algebra. We are most interested in the finite-dimensional case, where classically "measurable space" is just a fancy name for the random variables on a finite set. A finite-dimensional von Neumann algebra is a direct sum of matrix algebras. In particular, $M_2$ is called a qubit and $M_d$ is called a qudit.

To make a long story short, the Birkhoff hypothesis can be stated for a direct sum of $a$ copies of $M_b$, or $aM_b$. In this setting, a doubly stochastic map $E$ is a linear map from $aM_b$ to itself that preserves trace, that preserves the identity element, and that is completely positive. In this setting, $E$ is completely positive if it takes positive semidefinite elements of $aM_b$ to positive semidefinite elements, and if $E \otimes I$ also has that property on the algebra $aM_b \otimes N$ for another von Neumann or $C^*$-algebra $N$. The natural analogue of permutation matrices are the *-algebra automorphisms of $aM_b$. These are permutations of the matrix blocks, composed with maps of the form $E(x) = uxu^*$, where $u$ is a unitary element of $aM_b$. The question as before is whether the doubly stochastic maps are the convex hull of the automorphisms.

This Birkhoff hypothesis is true for $M_2$, false for $M_d$ for $d \ge 3$, and I should check it for $nM_2$. It is true for $aM_1 = a\mathbb{C}$, because then it is the usual Birkhoff-von Neumann theorem.

I am left wondering about two infinite classical versions of Birkhoff's theorem, for the algebras $\ell^\infty(\mathbb{N})$ and $L^\infty([0,1])$. In the former case, one would ask whether any stochastic map that preserves counting measure (even though counting measure is not normalized) is an infinite convex sum of permutations of $\mathbb{N}$. In the latter case, whether any stochastic map that preserve Lebesgue measure is a convex integral of measure-preserving permutations of $[0,1]$. Addendum: At least the discrete infinite case is addressed, with generally positive results, in this review and in this older review. The older paper also raises the continuous question but with no results. However, with some more Googling I found this counterexample paper.


Since Gil asks for a reference, a recent one is Unital Quantum Channels - Convex Structure and Revivals of Birkhoff's Theorem, by Mendl and Wolf.


Here also is a more orthodox combinatorial generalization of the Birkhoff theorem, and also another case that I once encountered that is between a generalization and a non-generalization. Since Gil now offers a bounty, maybe it's better to merge this answer with the other one.

A doubly stochastic matrix can be interpreted as a flow through a directed graph, with unit capacities. (See Unimodular matrix in Wikipedia; I learned about this long ago from Jesus de Loera.) Any such graph has a polytope of flows, called a network flow polytope. Any network flow polytope has integer vertices, because it is a totally unimodular polytope.

A totally unimodular polytope is a polytope whose facets have integer equations, and with the property that any maximal, linearly independent collection of facets intersects in an integer point because their matrix has determinant $\pm 1$. In particular the vertices are such intersections, so the vertices are all integral. This is a vast generalization of Birkhoff's theorem that comes from generalizing one of the proofs of Birkhoff's theorem.

Example: An alternating-sign matrix is equivalent to a square ice orientation of a square grid. The square ice orientations can be defined by a network flow, so you obtain an alternating-sign-matrix polytope. The generalized Birkhoff theorem in this case says that every vertex of the polytope is an alternating-sign matrix, in fact that every integer point of the $n$-dilated polytope is a sum of $n$ alternating-sign matrices.


The other case that I encountered was the polytope of fractional perfect matchings of a non-bipartite set with $2n$ elements. By contrast, the Birkhoff polytope is the case of a bipartite set with $n$ elements of each type. By definition, it is the polytope of non-negative weights assigned to the edges of the complete graph on $2n$ vertices, such that the total weight at each vertex is 1. Strictly speaking, the Birkhoff theorem is false; not every vertex is a perfect matching. Instead, all of the vertices are combinations of matched pairs, and odd cycles with weight $\frac12$.

At first glance this looks like bad news for the application of computing a perfect matching or the optimum perfect matching of a graph. Indeed, if instead you take the convex hull of the perfect matchings, the result is a polytope with exponentially many facets. However, a good algorithm exists anyway; there is a version of the simplex algorithm that only ever uses polynomially many of the facets.