[Math] Probability of a slot having exactly $K$ elements

combinatoricshash functionstatistics

From this question asked in an interview:

Consider a hash table with $M$ slots. Suppose hash value is uniformly
distributed between $1$ to $M$. Suppose we put $N$ keys
into this $M$-slotted hash table, what is the probability that there
will be a slot with $K$ elements? $K$ could vary from $0$ to $N$.

I'm not very good in statistics, that's why I'm asking for your help. What is the probability and how to compute it?

I could think of the following. Let $a_i$ be the number of elements in slot $i$. Then $P(a_i = K) = P(a_j = K) = P_K, \forall i, j$. And the answer is given by $P^* = 1 – (1 – P_K)^M$. But how to comupte $P_K$?

PS. The only question I could answer is "what is the expected value of the number of elements in a single slot?". And the answer is $N/M$. Am I right?

Best Answer

Ok, I think I get the answer thank to the link posted by leonbloy.

Consider the first slot. We have $N$ trials and every trial may put a value in first slot or not. Let $X_i$ be equal to $1$ if $i$-th trial puts a value into first slot or $0$ otherwise. Thus we have $P(X_i = 0) = \frac{M-1}{M}$ and $P(X_i = 1) = \frac{1}{M}$.

Now $P_K$ is the probability, that exactly $K$ of all $X_i$ are equal to $1$. Since $i$ varies from $1$ to $N$, there are $C_N^K$ possible situations. And in every situation we have $K$ ones and $N-K$ zeros for $X_i$. So the probability of each situation is $P(X_i = 0)^{N-K} P(X_i = 1)^K$.

Hence we have the following formula for $P_K$:

$$ P_K = C^K_N \left(\frac{M-1}{M}\right)^{N-K} \left(\frac{1}{M}\right)^{K} = C^K_N \frac{(M-1)^{N-K}}{M^N} $$

And the answer is given by:

$$ P^* = 1 - (1 - P_K)^M $$

Could anyone bother checking this solution?

Related Question