Solved – Finding natural groups / clusters in an undirected graph / over several undirected graphs

clusteringgraph theory

What kind of methods are there to find natural groups or clusters within an undirected graph structure? I am new to graph theory, but the project seems to have confronted me with questions that could use it.

1) For one graph.

2) Over a set of independent graphs that may have the same nodes.

I've heard of e.g. Tarjan's strongly connected components algorithm that works for directed graphs. Are there similar ones for undirected graphs? The number of groups within the graph is unknown, and should depend strictly on the data (though an arbitrary threshold can be given), and there is no problem if some nodes end up as monads that fit nowhere.

For Question 2, I've cobbled together a method that tests for each edge in a set whether the nodes exist in other sets, and reports the ratio of sets that, if they contain one of these nodes, contain exactly the same edge. (R script here.) Any idea if there's an established method with more or less the same technique, or which methods I should use instead?

Many thanks!

Best Answer

The problem you want to solve is called community detection. You will find many related questions on CrossValidate and StackOverflow, such as the following:

For a good introduction, I'd advise you to have a look at Fortunato's review: Community detection in graphs.

In your case, since you want to treat several networks containing the same nodes (I am supposing your question 2 is about clusters of nodes, by opposition to clusters of graphs), it amounts to performing community detection on a multiplex graph, i.e. a graph containing several different types of relationships. Note some authors also call them multislice or multilayer, but these words can sometimes mean something different, though.

Although this problem is less studied than community detection in plain networks, there're a few works focusing on this issue. You could start with these:

Related Question