[Math] How to measure the similarity between two graph networks

graph theoryNetwork

I have a set of undirected graph networks, 6 nodes each with weighted edges. I would like to compare each with a reference graph network which also has the same 6 nodes but with different weights. What is a good method to compare them and have a similarity score, say between zero to one?

UPDATE: To clarify my problem, I am going to explain what is the problem I am looking at. I have a series of protein structures which are identical in their amino acid sequence (same nodes). However they are slightly different in their spatial orientation. I want to see which amino acids interact together (if two amino acids are within a certain cutoff distance, then it would mean they are interacting — an edge between the two nodes). They may be so close so the edge weight would be 1 or less until it reaches zero (weights of edges). I want to see which protein structures (graph networks) are similar in the way their amino acids interact?

Best Answer

Here is one idea. Let $G_1$ and $G_2$ be different graphs on the same vertices, let $L_1$ and $L_2$ be their graph Laplacians, and let the superscript plus symbol denote the pseudoinverse (e.g., $A^+$ is the pseudoinverse of $A$). Then one can define something like:

$$\text{similarity}(G_1, G_2) := \exp \left(-\frac{||L_1^+ - L_2^+||^2}{||L_1^+||~||L_2^+||}\right).$$

The norm $||\cdot||$ can be any matrix norm. I would recommend, for example, the Frobenius norm. Another good choice could be the induced $2$-norm.

Anyways, the idea is that the graph Laplacian is related to diffusion (e.g., of randomly moving particles, or heat) in the graph, in that the $i$'th column of the psdueoinverse matrix $L^+$ is the steady-state result of putting a constant source of particles (or heat, etc) at node $i$ and letting them diffuse randomly through the graph, with the probability of transmission between two nodes related to the edge weight between those nodes. So, $||L_1^+ - L_2^+||$ measures the difference in the graphs in terms of how they physically differ in a diffusion process.

Then dividing by $||L_1^+||~||L_2^+||$ is done to make the similarity measure scale-invariant, and the exponential $\exp\left(-\dots\right)$ is taken to map the values between $0$ and $1$.

This is just my crazy idea so take it with a grain of salt. I don't know if anyone else has done this already; probably someone has.

Best Answer

Related Solutions

[Math] the definition of a network in graph theory