Distribution of size of connected components in Erdos-Renyi random graphs in the subcritical case

poisson processrandom-graphs

In Alon and Spencer's The Probabilistic Method, in the subcritical case, we have theorems 11.4.2 and 11.6.1 stating that in $G(n,p=c/n)$ where $c<1$, for a constant $k$
$$\lim_{n\to\infty}\Pr[|C(v)|=k]=\Pr[T_c^{po}=k]=\frac{e^{-ck}\cdot (ck)^{k-1}}{k!}$$

I know that in the subcritical case the largest connected component is w.h.p. of size $O(\log n)$, so can we say something about $\Pr[|C(v)|=k]$ when $k=k(n)$ is super constant, e.g. $k=a\cdot \log n$ for some constant $a>0$?

Best Answer

Here is just what we can extract from the Poisson limit.

As $k \to \infty$, $\Pr[T_c^{po} = k] \to 0$. So, for any $\epsilon>0$, there is $K = K(\epsilon)$ such that $\Pr[T_c^{po} > K] < \epsilon$ and therefore $$\lim_{n \to \infty} \Pr[|C(v)| > K] \le \epsilon.$$

Let $\mathbf X = |\{v : |C(v)| > K\}|$: the number of vertices in components larger than $K$. Then $\mathbb E[\mathbf X] \le \epsilon n$ by the above and by linearity of expectation, so $$ \Pr[\mathbf X > \sqrt{\epsilon} n] \le \frac{\epsilon n}{\sqrt{\epsilon} n} = \sqrt{\epsilon} $$ which can be made arbitrarily small. This can be summarized as "with high probability, all but $o(n)$ of the vertices are in components of size $O(1)$" but that's a lot of implied limits in a very short sentence.

This analysis by itself leaves open a range of possibilities: there could be a single component of size $\Theta(\log n)$, or many, as long as there are $o(n)$ vertices in such components.


This is not really the most interesting statement about $|C(v)|$. Theorems 11.6.2 and 11.6.3 in the same section are more helpful, because they do not require $k$ to be constant.

In particular, the probability that a vertex is in a component of size $\ge K \log n$ is sandwiched between two branching process probabilities: $$ \Pr[T^{bin}_{n-K\log n,p} \ge K \log n] \le \Pr[|C(v)| \ge K \log n] \le \Pr[T^{bin}_{n-1,p} \ge K \log n]. $$ Alon and Spencer leave out any argument that $T^{bin}_{n,p}$ and $T^{po}_{np}$ (as they mention at the end of section 11.6) so I will also just assume this because I am even more lazy than they are. Then $(n-K\log n)p$ and $(n-1)p$ are $c + o(1)$, so $$ \Pr[|C(v)| \ge K \log n] = \Pr[T^{po}_{c+o(1)} \ge K \log n] = \sum_{k=K \log n}^\infty \frac{e^{-(c+o(1))k}((c+o(1))k)^{k-1}}{k!}. $$ This gives you what you need to be very very careful, but now I'm going to be very sloppy and say that for large $k$, this is approximately a geometric series, which yields $$ \Pr[|C(v)| \ge K \log n] = n^{K(1 - c + \log c + o(1))}. $$

Related Question