[Math] probability of collision from hashing

combinatoricshash functionprobability

So i have a hash table that can hold 100 elements. It currently stores 30 elements. What is the probability that the next 2 inserts will result in at least one collision?

What i did was figure out the sample space to be 100*100=10000, representing all the possible number of different insertions for the 2 insertions (for example: first insertion being in 5th index and second insertion being in 74th index).

Then I tried to get elements of the sample space, which is in the form of (index of first insertion, index of second insertion) that have collisions. First, I counted 30 collisions from the first insertion and multiplied that by 100 since there are 100 possibilities for the second insertion for each of the first insertion. This would get the count of collisions that are guaranteed from first insertion. Then I multiplied this by 2 since there are also 30 collisions from the second insertion and I had to multiply that by 100 since there are 100 possibilities for the first insertion for each of the second insertion. This would get the count of collisions that are guaranteed from second insertion. Since this is overcounting the number of collisions that are the same(For example: if there was a collision at 5th index, first and second insertion being at the 5th index), I subtracted 30 since thats the count of occupied indexes.

Then I added 70 in the case of the first insert not causing collision but the second one causing collision by being inserted in the same place as the first.(For example both insertions being in 7th index when originally index 7 wasnt occupied before the first insertion)

I got .604 as my final answer: (2*30*100-30+70)/10000

Is this the correct answer and approach?

Best Answer

$$P(\text{Colllision}) = 1 - P(\text{no collision}) = 1 - {70\over100}\cdot{69\over 100}.$$