Expected number of elements before 1 out of 2 hash tables are full?

coupon-collectorexpected valueharmonic-numbersprobability

From a previous asked question it became clear to me that the expected number of elements to fill up one hash table (assuming ideal randomness) is just the Harmonic number: $mH_m$ where $m$ is the size of the table (collisions could just be solved with chaining or whatever. Don't care.)

However, a more tricky question is:

What is the expected number of elements we need before one of the hash tables are full? I.e. I want to know how fast it takes before one of the tables are full. We can assume that choosing either table $a$ or $b$ has probability $P[a|b]= \frac{1}{2}$ (see figure below).

Best Answer

There might be an elegant explanation of the result, but here is the bruteforce calculation.

We fix $m$ and define $E(a, b)$ as the expected number of elements to fill one table, if there are still $a$ and $b$ empty entries in the two tables.

Thus $E(0, *) = E(*, 0) = 0$ and we are looking for $E(m, m)$.

For $a, b > 0$, by considering the possibilities of the next element, we have: $$E(a, b) = 1 + \frac a {2m}E(a - 1, b) + \frac b {2m} E(a, b - 1) + (1 - \frac{a + b}{2m})E(a, b)$$ which simplifies to: $$(a + b) E(a, b) = 2m + aE(a - 1, b) + bE(a, b - 1).$$

If we define $F(a, b) = \frac 1 {2m} E(a, b)$, then we have the recurrence relation $$(a + b) F(a, b) = 1 + aF(a - 1, b) + bF(a, b - 1).$$

It is not totally obvious, but one can show by induction that $F(a, b) = H_a + H_b - H_{a + b}$ for all $a, b \geq 0$, where $H_n = \frac 1 1 + \dots + \frac 1 n$ denotes the harmonic number.

Proof: We simply need to verify that $$(a + b)(H_a + H_b - H_{a + b}) = 1 + a(H_{a - 1} + H_b - H_{a + b - 1}) + b(H_a + H_{b - 1} - H_{a + b - 1})$$ which is a routine calculation.

Therefore we have $F(m, m) = 2H_m - H_{2m}$ and hence $$E(m, m) = 2m F(m, m) = 4mH_m - 2mH_{2m}.$$

Best Answer

Related Solutions

[Math] Expected maximum number of collisions for universal hash function

[Math] Expected number of colliding pairs in hashing (example)

Related Question