Standard deviation of number of triangles in Erdos-Renyi uniform random graph G(n,m)

combinatoricsgraph theoryprobabilityrandom-graphs

Erdos-Renyi $G_{n,m}$ random graph means that the graph is uniformly drawn from all graphs with $n$ nodes and $m$ edges. Let $X(e)=1$ if $e\in G_{n,m}$ and 0 otherwise. We write $$N(G_{n,m}):=\sum\limits_{\phi:V\mapsto[n]}\prod\limits_{e\in E}X(\phi(e))$$ being the number of isomorphic copies of triangles in $G_{n,m}$, where $V=\{v_1,v_2,v_3\}$ the node set of some triangle, and $E=\{(v_1,v_2),(v_2,v_3),(v_3,v_1)\}$ the edge set(without direction) of that triangle.

Let's now consider the dense case, i.e. $m=p{n\choose 2}=:pN$, with $p\in(0,1)$ being a constant. Its expectation can then be calculated to be $$\mathbb{E}N(G_{n,m})=\sum\limits_{\phi:V\mapsto[n]}\mathbb{E}\prod\limits_{e\in E}X(\phi(e))=\sum\limits_{\phi:V\mapsto[n]}\mathbb{P}[\forall e\in E, \phi(e)\in G_{n,m}]=\sum\limits_{\phi:V\mapsto[n]}\frac{{N-3\choose m-3}}{{N\choose m}}=\frac{(n)_3(m)_3}{(N)_3},$$

where $(a)_k:=a(a-1)\cdots(a-k+1)$ denotes the falling factorial. Now we want to calculate its standard deviation.

In this paper, on page 2 the authors mentioned that $std(N(G_{n,m}))=\Theta(n^{3/2})$, or $Var(N(G_{n,m}))=\Theta(n^3)$. However, with the following calculation, I reached a result of $\Theta(n^4)$, which matches the $G_{n,p}$ model instead of $G_{n,m}$ I am now considering. I am not sure where I made the mistake.

My approach is as follows.

\begin{align*}
Var(N(G_{n,m})) &= \mathbb{E}\left[\sum\limits_{\phi}(\prod\limits_{e}X(\phi_e)-\mathbb{E}\prod\limits_{e}X(\phi_e))\right]^2\\
&= \mathbb{E}\sum\limits_{\phi,\psi}\left(\prod\limits_{e}X(\phi_e)-\frac{(m)_3}{(N)_3}\right)\left(\prod\limits_{e}X(\psi_e)-\frac{(m)_3}{(N)_3}\right)\\
&= \sum_{k=0}^{3}\sum_{|\phi(E)\cap\psi(E)|=k}\mathbb{E}\left[\prod_{e}X(\phi_e)X(\psi_e)-\left(\frac{(m)_3}{(N)_3}\right)^2\right].
\end{align*}

Note that $k=2,3$ are the same case for triangle count, they both mean that $\phi(V)=\psi(V)$.

(1). For $k=0$: the two maps are disjoint, thus the number of pairs $(\phi,\psi)$ is $(n)_6$, while the contribution term in the variance is $$\mathbb{E}\left[\prod_{e}X(\phi_e)X(\psi_e)-\left(\frac{(m)_3}{(N)_3}\right)^2\right]=\frac{(m)_6}{(N)_6}-\left(\frac{(m)_3}{(N)_3}\right)^2=\Theta\left(\frac{1}{N}\right)=\Theta(n^{-2}),$$

after some work(I think I am correct here…but let's see). So the total contribution is $\Theta(n^4)$.

(2). For $k=1$: the two maps have one edge in common(and thus two nodes), thus number of maps pairs is $6(n)_4$, while the contribution term in the variance is $$\mathbb{E}\left[\prod_{e}X(\phi_e)X(\psi_e)-\left(\frac{(m)_3}{(N)_3}\right)^2\right]=\frac{(m)_5}{(N)_5}-\left(\frac{(m)_3}{(N)_3}\right)^2\approx p^5-p^6,$$

being a constant. So the total contribution is $\Theta(n^4)$.

(3). For $k=2,3$: as $\phi(V)=\psi(V)$, number of maps pairs is $6(n)_3$, and the contribution of each map pair is the contribution term in the variance is $$\mathbb{E}\left[\prod_{e}X(\phi_e)X(\psi_e)-\left(\frac{(m)_3}{(N)_3}\right)^2\right]=\frac{(m)_3}{(N)_3}-\left(\frac{(m)_3}{(N)_3}\right)^2\approx p^3-p^6,$$

being a constant. So the total contribution is $\Theta(n^3)$.

So in summary I think the variance should be $\Theta(n^4)$, which does not match the reference.

Any help is appreciated!!!

Best Answer

It turns out that the term in $n^4$ of (1) is negative and cancels out the term in $n^4$ of (2). With a bit of formal computation magic, we can check that $$ \frac{(m)_6}{(N)_6}=p^6-\frac{15}{m}p^7\left(\frac{1}{p}-1\right)+O\left(\frac{1}{m^2}\right)\\ \frac{(m)_5}{(N)_5}=p^5-p^6+O\left(\frac{1}{m}\right)\\ \left(\frac{(m)_3}{(N)_3}\right)^2=p^6-\frac{6p^7}{m}\left(\frac{1}{p}-1\right)+O\left(\frac{1}{m^2}\right) $$

Now there are $\binom{n}{3}\binom{n-3}{3}$ pairs of disjoint triples of vertices and $2\binom{4}{2}\binom{n}{4}$ pairs of triples of vertices sharing 2 vertices. Putting everything together with $m\sim pn^2/2$, (1) contributes $$ -\frac{n^4}{2}(p^5-p^6)+O(n^3), $$ and (2) contributes $$ \frac{n^4}{2}(p^5-p^6)+O(n^3). $$ This at least shows why the standard deviation of the number of triangles is $O(n^{3/2})$.

Related Question