[Math] Finding connected components in a graph using BFS

algorithmsgraph theory

I'd like to know how do I change the known BFS algorithm in order to find all of the connected components of a given graph. The original algorithm stops whenever we've colored an entire component in black, but how do I change it in order for it to run through all of the components? And how do I distinguish between one component to the other?

Best Answer

Use an integer to keep track of the "colors" that identify each component, as @Joffan mentioned. Start BFS at a vertex $v$. When it finishes, all vertices that are reachable from $v$ are colored (i.e., labeled with a number). Loop through all vertices which are still unlabeled and call BFS on those unlabeled vertices to find other components.

Below is some pseudo-code which initializes all vertices with an unexplored label (an integer 0). It keeps a counter, $componentID$, which vertices are labeled with as they are explored. When a connected component is finished being explored (meaning that the standard BFS has finished), the counter increments. BFS is only called on vertices which belong to a component that has not been explored yet.

// input: graph G
// output: labeling of edges and partition of the vertices of G
LabelAllConnectedComponents(G):
    // initialize all vertices and edges as unexplored (label is 0)
    for all u ∈ G.vertices()
        setLabel(u, UNEXPLORED)
    for all e ∈ G.edges()
        setLabel(e, UNEXPLORED)

    // call BFS on every unlabeled vertex, which results in
    // calling BFS once for each connected component
    componentID = 1
    for all v ∈ G.vertices()
        if getLabel(v) == 0:
            BFS(G, v, componentID++)

// standard breadth-first-search algorithm that works on one component
BFS(G, s, componentID):
    L[0] = new empty sequence
    insert vertex s at the end of L[0]
    setLabel(s, componentID)
    i = 0
    while L[i] is not empty:
        L[i+1] = new empty sequence
        for all v ∈ L[i].elements()
            for all e ∈ G.incidentEdges(v)
                if getLabel(e) == UNEXPLORED
                    w ← opposite(v,e)
                if getLabel(w) == UNEXPLORED
                    setLabel(e, DISCOVERY)
                    setLabel(w, componentID)
                    L[i+1].insertLast(w)
                else
                    setLabel(e, CROSS)
        i = i+1

The total running time is $O(|V| + |E|)$ since each edge and vertex is labeled exactly twice - once to initialize and again when it's visited.