About $E_i$ in “An $n^{\frac{5}{2}}$ Algorithm for Maximum Matchings in Bipartite Graphs” by Hopcroft and Karp

algorithmsgraph theorymatching-theory

I am reading "An $n^{\frac{5}{2}}$ Algorithm for Maximum Matchings in Bipartite Graphs" by Hopcroft and Karp. And I cannot understand the following definition of $E_i$.

Because
$$\{(u, v) | (u, v) \in \bar E, v \in L_i, u \in L_i\} = \emptyset,$$

I think the follwoing definition is more natural:

$$E_i = \{(u, v) | (u, v) \in \bar E, v \in L_i, u \notin L_0 \cup L_1 \cup \cdots \cup L_{i-1}\}, i=0, 1, 2, \cdots.$$

Why did the authors define $E_i$ as follows?

The graph $G = (V, E)$ is bipartite if the set of vertices $V$ can be partitioned into two sets, $X$ and $Y$, such that each edge of $G$ joins a vertex in $X$ with a vertex in $Y$. An element of $X$ will be called a boy, and an element of $Y$, a girl.
Let $M$ be a matching in a bipartite graph $G$. We discuss the implementation of Step 1 of Algorithm A, in which a maximal vertex-disjoint set of shortest augmenting paths relative to $M$ is found. First we assign directions to the edges of $G$ in such a way that augmenting paths relative to $M$ become directed paths. This is done by directing each edge in $E – M$ so that it runs from a girl to a boy, and each edge in $M$ so that it runs from a boy to a girl. The resulting directed graph is $\bar G = (V, \bar E)$, where
$$\bar E = \{(y, x) | \{x, y\} \in E – M, x \in X, y \in Y\} \cup \{(x, y) | \{x, y\} \in M, x \in X, y \in Y\}.$$
Next we extract a subgraph $\hat G$ of $\bar G$, with the property that the directed paths of $\hat G$ running from a free girl to a free boy correspond one-to-one to the shortest augmenting paths in $G$ relative to $M$. This is done as follows.
Let $L_0$ be the set of free boys, and let
$$E_i = \{(u, v) | (u, v) \in \bar E, v \in L_i, u \notin L_0 \cup L_1 \cup \cdots \cup L_i\}, i=0, 1, 2, \cdots,$$
$$L_{i+1} = \{u | \text{for some } v, (u, v) \in E_i\}, i = 0, 1, 2, \cdots.$$

Best Answer

The definition Hopcroft and Karp give is more natural because it's a standard description of a breadth-first search. (Well, a breadth-first search in the reversed graph.) It is immediate from this definition that $L_{i+1}$ is disjoint from $L_0 \cup L_1 \cup \dots \cup L_i$, which is very useful to know.

It's true that the conditions \begin{align} u &\notin L_0 \cup L_1 \cup L_2 \cup \dots \cup L_i, \\ u &\notin L_0 \cup L_1 \cup L_2 \cup \dots \cup L_{i-1}, \\ u &\notin L_{i-1} \cup L_{i-3} \cup L_{i-5} \cup \dots \cup L_{i-1\bmod 2} \end{align} are all equivalent, because the graph is bipartite, so we could use any of them in the definition. The second (which you're proposing) and the third are longer, and it would take some small amount of work to explain why they're equivalent to the natural, first definition. (More importantly, they make the reader needlessly stop and think: "Why are we allowing $u \in L_i$? Is that an important case? Wait, no, $u \in L_i$ can't ever happen anyway.")

In exchange, we save on "amount of things we have to check for containing $u$". This is not really a consideration when describing the algorithm. It might theoretically be a consideration for implementing the algorithm, but in practice you'd probably label vertices with a flag "is this vertex in any of the $L_i$ defined so far?" and just check all the in-neighbors of $v$ to see whether they have that flag set or not.

Related Question