The probability that the first collision occurs at the Kth insertions

hash functionprobabilityprobability theory

Question

Consider a hash table with $n$ buckets, where external (overflow) Chaining is used to resolve collisions. The hash function is such that the probability that a key value is hashed to a particular bucket is $\frac{1}{n}$. The hash table is initially empty and $K$ distinct values are inserted in the table.
What is the probability that the first collision occurs at the $K^{th}$ insertions?

Approach

For the first collision occurs at the $K^{th}$ insertions,
Insertion from $1$ to $k-1$ should not collide and then at $k^{th}$ itteration it should collide.

Insertion from $1$ to $k-1$ should not collide
$$=\frac{n}{n} \times \frac{n-1}{n} \times \dots \times \frac{n-(k-2)}{n}$$

But i am not getting how to derive probaility for $k^{th}$ itteration collisions
please help me out!

Best Answer

You're almost there. The probability for the $k$-th insertion to produce a collision, given that the first $k-1$ insertions haven't, is $\frac{k-1}n$, as $k-1$ buckets are occupied. Thus the overall probability for the first collision to occur at the $k$-th insertion is

$$ \frac{n!(k-1)}{(n-(k-1))!n^k}\;. $$