Erdős–Rényi model with fixed $p$ and the asymptotic growth of the number of connected components

graph-theory, probability, python, random-graphs

I'm running experiments on the random graph $G(n, p)$, i.e. the Erdős–Rényi model. So far, there is clearly a phase transition: if $p > T$, where $T > \frac{1}{n}$ is the threshold, then the graph has only a single connected component.

Is there an analytical solution for small fixed $p$? Could we derive or predict the asymptotic growth, or at least determine lower/upper bounds? Is there a connection between $p$ and the number of connected components $C_p$?

Since the implementation uses a random generator with a different seed on each run, multiple attempts were made to see the overall tendency.

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt

nodes = np.arange(5, 200, 1)  # number of vertices n (first argument of gnp_random_graph)
probability = 0.1             # fixed p
for color in ['r', 'g', 'b', 'y', 'm']:  # five independent runs, one color each
    components = []
    for n in nodes:
        G = nx.gnp_random_graph(n, probability)
        components.append(nx.number_connected_components(G))
    plt.plot(nodes, components, lw=1, color=color)
plt.title(f'p = {probability}')
plt.show()
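To smooth out the run-to-run noise, one option (an illustrative sketch, not part of the original experiment; the trial count is an arbitrary choice) is to average the component count over several independent samples for each $n$:

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt

nodes = np.arange(5, 200, 1)
probability = 0.1
trials = 20  # assumption: 20 samples per n, purely illustrative
mean_components = [
    np.mean([nx.number_connected_components(nx.gnp_random_graph(int(n), probability))
             for _ in range(trials)])
    for n in nodes
]
plt.plot(nodes, mean_components, color='k', lw=2)
plt.title(f'p = {probability}, mean over {trials} runs')
plt.show()

The mean curve makes the overall tendency much easier to see than the individual noisy runs.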

When $p$ is small enough, one can observe linear (probably $O(n)$) growth. As $p$ increases, the process becomes more stochastic and I can no longer draw conclusions from it.

[Plots of the number of connected components versus $n$ for p = 0.002, p = 0.01, p = 0.02, and p = 0.1]

Thank you in advance!

Best Answer

The number of components is still determined by the relationship between $n$ and $p$. In $G_{n,p}$, where we treat $n$ as the primary parameter and $p$ as a function of $n$, there are several key cutoffs. For large $n$, we expect that

  • If $p < \frac1n$, then there are few cycles, and so the number of components is roughly $n - \binom n2p$. (It starts at $n$, and in the absence of cycles, each of an average of $\binom n2p$ edges reduces the number of components by $1$.) As we get closer and closer to $\frac1n$, this approximation gets worse, but the number of components should still be linear in $n$. (A numerical check of this estimate, and of the isolated-vertex count below, appears after this list.)

  • If $p > \frac1n$, there is a giant component. Estimates here are more annoying to get, but the idea is that for each $k \ll \log n$, the number of components of size $k$ is linear in $n$, and you can estimate the number of them by analyzing a branching process.

  • Once $p$ is sufficiently larger than $\frac1n$ (when $np \to \infty$ as $n \to \infty$), the most frequent small components are isolated vertices, and the average number of these is $n(1-p)^{n-1} \sim n e^{-np}$.

  • For $p$ around $\frac{\log n}{n}$, even the isolated vertices disappear or almost disappear.
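These estimates are easy to sanity-check numerically. A minimal sketch comparing the subcritical component count against $n - \binom n2 p$ and the isolated-vertex count against $n(1-p)^{n-1}$; the specific values of $n$, $p$, and the trial count below are illustrative choices, not from the answer:

import networkx as nx
import numpy as np

trials = 50  # arbitrary number of samples

# Subcritical regime (p < 1/n): components ≈ n - C(n, 2) * p.
n, p = 2000, 0.2 / 2000  # np = 0.2, well below the 1/n threshold
observed = np.mean([nx.number_connected_components(nx.gnp_random_graph(n, p))
                    for _ in range(trials)])
predicted = n - n * (n - 1) / 2 * p
print(f"components: observed {observed:.1f}, predicted {predicted:.1f}")

# Regime with np large: small components are mostly isolated vertices,
# and their expected number is n * (1 - p)^(n - 1) ~ n * exp(-np).
n, p = 2000, 5 / 2000  # np = 5
observed = np.mean([nx.number_of_isolates(nx.gnp_random_graph(n, p))
                    for _ in range(trials)])
predicted = n * (1 - p) ** (n - 1)
print(f"isolated vertices: observed {observed:.1f}, predicted {predicted:.1f}")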

If you fix $p$ and vary $n$, the cutoffs appear in the same places. The problem is that all of the usual approximations for random graphs are valid only for large $n$. For a large $p$ such as $p=0.1$, the cutoffs occur at small values of $n$ (e.g. $p = \frac1n \iff n = \frac1p = 10$), so the approximations there are not very good. But for large $p$, we expect the number of components to shrink down to $1$ very quickly in any case.

Once you're down at $p=0.01$ or $p=0.001$ or $p=0.00001$, the descriptions above are going to be correct, except that to see where each regime occurs, you should solve for $n$ in terms of $p$.
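To make "solve for $n$ in terms of $p$" concrete, here is a small sketch using scipy's brentq root finder (the bracket choice is an assumption that works whenever $p < 1/e$) that locates both cutoffs for a fixed $p$:

import numpy as np
from scipy.optimize import brentq

def cutoffs(p):
    # Giant component appears around p = 1/n, i.e. n = 1/p.
    n_giant = 1.0 / p
    # Connectivity around p = log(n)/n: take the larger root, where
    # log(n)/n is decreasing (the bracket is valid whenever p < 1/e).
    n_connect = brentq(lambda n: np.log(n) / n - p, np.e, 1e12)
    return n_giant, n_connect

for p in [0.1, 0.01, 0.001]:
    n_giant, n_connect = cutoffs(p)
    print(f"p = {p}: giant component near n ≈ {n_giant:.0f}, "
          f"connectivity near n ≈ {n_connect:.0f}")

For $p = 0.1$ both cutoffs land at $n$ in the tens, which is why the approximations are poor there, while for $p = 0.001$ they sit near $n = 1000$ and $n \approx 9000$, so the regimes are clearly separated.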
