Solved – Community detection and modularity

I am reading the book "Network science" of Barabasi and in particular the chapter on community detection.

If I understand correctly, modularity is a goodness factor of partition calculated by a certain algorithm: the greater the value of modularity and better is the structure of the communities found.

I tried several algorithms in R on the same network:

library(igraph)
library(GGally)
library(ggplot2)
library(ggdendro)

# load dataset
net <- read.graph("./ds.gml", format = c("gml"))

# find degree
deg <- igraph::degree(net, mode = "all")

###### GIRVAN & NEWMANN (divisive) ##################################################
girvNew <- cluster_edge_betweenness(net, modularity = TRUE)

girvNew_sizesComm <- sizes(girvNew) 
girvNew_numComm <- length(girvNew_sizesComm)
girvNew_modularity <- modularity(girvNew)

print(girvNew_numComm)
print(girvNew_sizesComm)
print(girvNew_modularity)

girvNew_den <- as.dendrogram(girvNew)

###### GREEDY ################################################################
fastgreedy <- fastgreedy.community(net)

fastgreedy_sizesComm <- sizes(fastgreedy)
fastgreedy_numComm <- length(fastgreedy_sizesComm)
fastgreedy_modularity <- modularity(fastgreedy)

print(fastgreedy_numComm)
print(fastgreedy_sizesComm)
print(fastgreedy_modularity)

fastgreedy_den <- as.dendrogram(fastgreedy)

###### WALKTRAP ################################################################
walktrap <- cluster_walktrap(net)

walktrap_sizesComm <- sizes(walktrap) 
walktrap_numComm <- length(walktrap_sizesComm) 
walktrap_modularity <- modularity(walktrap)

print(walktrap_numComm)
print(walktrap_sizesComm)
print(walktrap_modularity)

walktrap_den <- as.dendrogram(walktrap)

###### LOUVAIN ################################################################
louvain <- cluster_louvain(graph = net, weights = NULL)

louvain_sizesComm <- sizes(louvain)
louvain_numComm <- length(louvain_sizesComm)
louvain_modularity <- modularity(louvain)

print(louvain_numComm)
print(louvain_sizesComm)
print(louvain_modularity)

The result I get are:

Looking at the table I would say that the algorithm of detection communties more efficient is Louvain, then Girvan-Newmann, Wolktrap and Greedy.
Is this statement right?

The book says that the first algorithm (to Girvan-Newmann) calculates all partitions and doesn't choose the best one.
To do this you need to select the best cut of the dendrogram based on modularity.

Having said that, what means the value of modularity found (0.5380681)? But above all, how do I find the modularity for each step of algorithm?

What I would like to get is something similar to this chart:

That is, a chart that allows me to see the various modular values obtained and can then choose the best one.
This shows Modularity for each cut of the dendrogram.

Which algorithm is more efficient?

Perhaps I am misunderstanding what you mean by "efficiency", but I assume you mean computational efficiency. You can't know that from looking the table you provided- you need to check the CPU time, e.g., using system.time.

library(igraph) # Random graph net <- sample_gnm(100, 300, directed = F, loops = F) # CPU time for modularity algorithms system.time(cluster_edge_betweenness(net, modularity = TRUE)) # user system elapsed # 0.160 0.001 0.159 system.time(fastgreedy.community(net)) # user system elapsed # 0.002 0.000 0.004 system.time(cluster_walktrap(net)) # user system elapsed # 0.002 0.000 0.003 system.time(cluster_louvain(graph = net, weights = NULL)) # user system elapsed # 0.001 0.000 0.001

You can see that there are differences in the CPU time used by the four algorithms, and the Louvain method is the fastest.

What does the modularity value mean?

In general, you can think of modularity as the fraction of edges lying within modules minus the expected fraction of edges lying within modules if there is no modularity. But the single modularity value that is output from these algorithms is the value of Q for the optimal partition found by the algorithm. The higher this maximum modularity is, the better the partition is. Thus, based on the table you included, the Louvain method found the best partition.

To be clear, the igraph function you are using to find communities with the Girvan-Newman algorithm DOES return the optimal communities found using the algorithm- you don't need to select it yourself. Look at the ?help page for the cluster_edge_betweenness function you use, and see the description of the modularity argument:

Logical constant, whether to calculate the maximum modularity score, considering all possibly community structures along the edge-betweenness based edge removals.

So the modularity you get IS the maximum modularity, and the partition you can extract using the communities function will give you the maximum partition. You don't need to find it yourself.

Solved – Community detection and modularity

Best Answer

Which algorithm is more efficient?

What does the modularity value mean?

How can you extract the modularity score for each step of the Girvan-Newman algorithm?

Related Question

Best Answer

Which algorithm is more efficient?

What does the modularity value mean?

How can you extract the modularity score for each step of the Girvan-Newman algorithm?

Related Solutions

Solved – Newman’s modularity clustering for graphs

How to Perform Community Detection in Weighted Social Networks

Related Question