Birthday problem – expected number of shared birthdays

birthdayprobability

Given $m$ people and an $n$ possible "days of the year", what is the expected number of days which 2 or more people share as a birthday (if the distribution of birthdays is iid uniform over the possible days) ? (Alternative formulation: We use $n$ distinct colors to paint $m$ objects. Each object gets colored with a uniformly iid randomly chosen color. What is the number of colors that color at least two objects?)

Written more precisely, let $B_i$ be independent uniform random on $S = \{ 1, …, n \}$ for $i = 1, …, m$. What is the expected cardinality of the set of $d \in S$ where there exist $i,j$ not equal with $B_i = B_j = d$?

This sounds similar to birthday problem – expected number of collisions which asks for the number of collisions, but that question was about the number of people who share birthdays, not the number of days shared. The answer to that question is in the range $[0,m]$ while the answer to my question is in the range $[0,n]$.

Ultimately this is actually a question about hash collisions and prefix collisions in a prefix tree, but I deemed the birthday formulation the most familiar and most likely to be helpful to others searching for it in the future.

Best Answer

I gave the answer in a previous question without calculation. Here is the justification

  • The probability a day has no people is $\left(1-\frac1n\right)^m$ so the expected number of days with no people is $n\left(1-\frac1n\right)^m$
  • The probability a day has one person is ${m \choose 1}\frac1n\left(1-\frac1n\right)^{m-1}$ so the expected number of days with one person is $m\left(1-\frac1n\right)^{m-1}$
  • So the expected number of days with two or more people is $$n - n \left(1-\frac1n\right)^m - m \left(1-\frac1n\right)^{m-1}$$
Related Question