[Math] Probability that exists at least an edge in the configuration model

graph theoryNetworkprobabilityprobability theoryrandom-graphs

In this period, I am studying some topics on random networks to understand the modularity optimization used in community detection. In particular, I am trying to understand a model called configuration model, which represents a random graph with a given degree sequence.

You can find some descriptions on the topic here:

http://tuvalu.santafe.edu/~aaronc/courses/5352/fall2013/csci5352_2013_L11.pdf
http://homepage.cs.uiowa.edu/~sriram/196/spring12/lectureNotes/Lecture11.pdf

Now, my problem is that I don't understand why the probability $p_{ij}$ that exists at least an edge between two vertices $i$ and $j$ is evaluated as $p_{ij} = \frac{k_i k_j}{2m}$. I suppose that they are evaluating the probability as the edges were independent. Am i right? Are really the edges independent or is it supposed that they are in a large graph? I don't understand that very well because I have noticed that in some graphs the given probability is not a value between 0 and 1.

It's confusing the fact that I have read on some articles that they call it as the "expected number of edges between two vertex" and on other ones that they call it as the "probability that exists at least an edge between two vertex", too. Is it the same thing?

Could you help me understand the given probability?

Thank you.

Best Answer

There is no independence here. The model assumes that the $2m$ "edge stubs" from the vertices are matched at random: you pick a pair of unconnected stubs, connect them, and repeat until all are connected. If vertex $i$ has $k_i$ stubs and a different vertex $j$ has $k_j$ stubs, there are $k_i k_j$ pairs of stubs, one for $i$ and one for $j$. The probability that a particular pair $(stub_1, stub_2)$ are connected is $1/(2m-1)$, since any of the $2m-1$ stubs other than $stub_1$ has equal probability of ending up connected to $stub_1$. By linearity of expected value, the expected number of connections between vertices $i$ and $j$ is then $k_i k_j/(2m-1)$. If $m$ is very large, you might approximate this as $k_i k_j/(2m)$.

The probabilty that there is at least one connection is less than the expected number of connections, if it is possible to have more than one connection. Under appropriate conditions, the probability of having more than one connection will be small, in which case you can use $k_i k_j//(2m-1)$ as an approximation to the probability of at least one connection.

Related Question