[Math] Similarity measure between 2 bi-partite graph.

bipartite-graphsgraph theorysimilarity

Hello there, i need to solve this problem:
I have 2 different bi-partite weighted graph, g1 and g2 and i would like to measure their similarity, g1 and g2 may have different number of vertex and edges and they are a result of a clustering algorithm over different data-sets.

Ideas,hints,thoughts are HIGHLY appreciated.
Best, Francesco.

Best Answer

Is it enough to have something which is defined or do you also want it to be relatively easy to compute?

We can say (as you do) that distance is $0$ when and only when the two graphs are identical in the sense that they have equal numbers of vertices and edges and corresponding edges have the same weight. But it can be a very hard problem to (always) decide if the distance is actually $0.$ It is actually hardest when the weights are all $0,1.$ If the graphs are small you can try every possible way of matching them up. You can be more clever about it than that but it is not easy for large graphs.

Suppose first that there is a given labeling of the vertices $x_1,x_2,\cdots ,x_m;y_1,y_2,\cdots,y_n$ with one part given then the other. Then can represent the graph by the $m \times n$ matrix $A$ whose $i,j$ entry $a_{ij}$ is the ( non-negative) weight of the edge $(x_i,y_j).$ For convenience let $a_{ij}=0$ when $i \gt m$ and/or $j \gt n.$

Then the distance between two labelled bipartite graphs with matrices $A$ and $B$ could be defined as $\sqrt{\sum(a_{ij}-b_{ij})^2}$ or perhaps $\sqrt{\frac{\sum(a_{ij}-b_{ij})^2}{\sum(a_{ij}+b_{ij})^2}}$ if we want maximum distance $1$ (here, exactly when an edge with positive weight in one graph has weight $0$ in the other.)

Now for unlabeled weighted bipartite graphs we could define the distance as the minimum of the distance over all possible orderings. That is a clean clear definition but entirely unpractical in the context of all finite weighted bipartite graphs.

Perhaps in a given setting there is more structure such as trees where each has an obvious dominant heavy spine which is a path with lighter $2$ and $3$ vertex paths hanging off it. Then we (perhaps) just have to try two orders for the path and consider insertions and deletions.

Related Question