[Math] Average degree of a scale-free network.

algorithmsgraph theoryNetworkrandom-graphs

Suppose to generate a scale-free undirected network using the preferential attachment algorithm where with start with an $m$-clique, and we attach each new node to $m$ pre-existing nodes. The probability of attachment is proportional to the current degree $k_i$ of a pre-existing node.
According to Barabasi, the exact degree distribution of the produced network is:

$$\mathbb{P}(k_i = k)=p_k = \frac{2m(m+1)}{k(k+1)(k+2)}.$$

In other word, if we randomly extract a node $i$ from the network, the probability that this node is connected to exactly $k$ nodes is equal to $p_k$.
Of course, if the graph has $N$ nodes, then $k$ ranges in the set $\{m, \ldots, N\}$.

Question 1

How can I prove that:

$$\sum_{k=m}^N p_k = 1?$$

Since $p_k$ depends on $N$, I would say that this is rather difficult to be exactly equal to $1$.

Anyway, I think that the distribution $k$ works for $N \to +\infty.$ Am I right?

Question 2

Suppose that $N \to +\infty$. Then $k \in \{m, \ldots, +\infty\}$.
What is the average degree

$$\overline{k} = \lim_{N \to +\infty}\sum_{k=m}^{N}kp_k?$$

I've exploited

$$\sum_{n=q}^p \frac{1}{n} \simeq \log\left(\frac{p+1}{q}\right),$$

and I get that:

$$\overline{k} = \lim_{N \to +\infty}2m(m+1)\log\left(\frac{(N+2)(m+2)}{(N+3)(m+1)}\right) = 0.$$

Anyway, it seems experimentally (I've implemented the preferential attachemtn algorithm) that:

$$\overline{k} = 2m.$$

which is clearly different from $0$
Where is my mistake?

Best Answer

If $p_k = \frac{2m(m+1)}{k(k+1)(k+2)}$ then the sum in your first question is a telescoping series: $$ \sum_{k=m}^N \frac{2m(m+1)}{k(k+1)(k+2)} = \sum_{k=m}^N \left(\frac{m(m+1)}{k(k+1)} - \frac{m(m+1)}{(k+1)(k+2)}\right) = 1 - \frac{m(m+1)}{(N+1)(N+2)}. $$ So this is a valid probability distribution in the limit as $N \to \infty$.

In the same way, we can evaluate the expected average degree in the limit as $N \to \infty$: $$ \sum_{k=m}^\infty kp_k = \sum_{k=m}^\infty \frac{2m(m+1)}{(k+1)(k+2)} = \sum_{k=m}^\infty \left(\frac{2m(m+1)}{k+1} - \frac{2m(m+1)}{k+2}\right) = \frac{2m(m+1)}{m+1} = 2m. $$ But we can calculate the average degree for any $N$ even more easily than that. At every step, the number of edges increases by $m$, and it starts at $\binom m2$. So when there are $N$ nodes, there are $\binom m2 + m(N-m)$ edges, or $mN - \binom{m+1}{2}$. So the average degree, which is twice the number of edges divided by $N$, is $2m - \frac{m}{N} - \frac{m^2}{N}$. The contribution from the second and third terms is negligible for large $N$, and vanishes as $N \to \infty$.

The mistake in your calculation is that a formula for estimating the sum $\sum \frac1n$ does not tell you how to estimate the sum $\sum \frac1{n(n+1)}$.

Related Question