Suppose we use a hash function $H$ to hash $N$ distinct balls into $M$ distinct bins. Assuming simple uniform hashing, what is the expected number of collisions?
Note that a collision is defined by adding a ball to an already occupied bin. If the already occupied bin has $k$ balls in it, then the number of collisions upon adding a new ball is $k.$
By using expectation, I tried as :
=> 1 × Probability of collision in first insertion +
2 × Probability of collision in second insertion + ………. +
n × Probability of collision in nth insertion
=> $(1 ∗ 0) + (2 ∗ 1/m) + (3 ∗ 2/m) + (4 ∗ 3/m) + … + (n ∗ n−1/m)$
Actually, The answer is $(n^2 – n)/2m$
But, I am not getting the answer. Where am I wrong here ?
Best Answer
The expected number of new collisions caused at the time of inserting the $k$-th ball is $\frac{k-1}{M}$ since it has a $\frac1M$ collision probability with each ball already placed.
Thus the expected number of collisions is
$\frac0M+\frac1M+\frac2M+\cdots+\frac{N-1}{M}=\frac{N(N-1)}{2}\cdot\frac1M$