Graph Theory – Expected Graph Edit Distance Between Two Random Graphs

expected valuegraph theorygraph-isomorphismrandom-graphs

Consider Erdos-Renyi random graphs $G(n,p)$.
Let us independently sample two graphs $G_1$ and $G_2$ following $G(n,p)$.
What is the expected graph edit distance (GED) between $G_1$ and $G_2$?
Since the number of nodes is the same for $G_1$ and $G_2$, the expected GED is
$$
\mathbb{E}[\text{GED}(G_1, G_2)] = \min_{G' \simeq G_1} |E(G') \setminus E(G_2)| + |E(G_2) \setminus E(G')|,
$$

where $G' \simeq G_1$ means $G'$ and $G_1$ are isomorphic.

A related, and possibly equivalent, question would be,
what is the expected number of overlapping edges considering graph isomorphism, i.e., what is
$$
\mathbb{E}[\text{overlap}(G_1, G_2)] := \max_{G' \simeq G_1} |E(G') \cap E(G_2)|.
$$


We have some existing questions w.r.t. cut distance but without answer.

Best Answer

For any fixed choice of $G'$ isomorphic to $G_1$, the edit distance is $\text{Binomial}(\binom n2, \frac12)$, which is $\frac{n(n-1)}{4}$ on average.

By a Chernoff bound, for $0<\delta<1$, the probability that the edit distance is less than $(1-\delta)\frac{n(n-1)}{4}$ is at most $\exp\left(-\frac{\delta^2 n(n-1)}{8}\right)$. This is less than $n^{-n}$ for $\delta = \sqrt{\frac{8\log n}{n}}$, at which point we can simply use the union bound over all $n!$ choices of $G'$.

Thus, with high probability, the edit distance between $G_1$ and $G_2$ is at least $\frac{n^2}{4} - O(n^{3/2} \sqrt{\log n})$: there is no $G'$ isomorphic to $G_1$ that we can take which does significantly better than average.

Also, by taking $\delta = \frac{\log n}{n}$ in a Chernoff bound facing the other direction, we see that with high probability just $G_1$ by itself (without applying any automorphism) does not do significantly worse than average: not worse than $\frac{n^2}{4} + O(n \log n)$.

The error probability I hid under the clause "with high probability" is exponentially small in the first case, and $\exp(-O((\log n)^2))$ in the second case. But in all cases, the edit distance is going to be between $0$ and $\binom n2$. Therefore outliers do not contribute anything significant to the expected value: it also lies in the small interval around $\frac{n^2}{4}$ described above.

Related Question