[Math] Calculating the Birthday Problem The Other Way Around

birthdayprobability

As most of you probably know, the birthday problem is about finding the probability that any two people share the same birthday, in a room with N people. Since 23 people in the room is the number more frequently used, let's use it here as well.

On Wikipedia and on most probability books the approach to solve this is to first calculate P', the probability that no two people will share the same birthday, and then do 1 – P'. So you basically do (364/365)(364/365)…(343/365). This multiplication will give you 0.4927, 1-P' is 0.5073. So far so good.

What I want to know is how you do the calculation the other way around. That is, how you calculate P straight away, and not by finding 1-P'.

The first idea that came to me is this one:

  1. Put the first person in the room. The probability of this person sharing a birthday is 0.
  2. Put the second person in the room. The probability of this person sharing the birthday with the first would be 1/365.
  3. Put the third person in the room. The probability of this person sharing the birthday with anyone would be 2/365.

So on and so forth, until the 23rd person, whose probability of sharing a birthday would be 22/365.

However this reasoning is flawed, as if you sum those probabilities the result would be 244/365, which is 0.6684 and not 0.5073.

What's wrong with the above reasoning, and what's the correct approach?

Update: As Thomas Andrews points out below, the problem is probably related to some cases being counted twice (i.e., 1 and 2 share and birthday and 3 and 4 share too). In this case we need to shave the individual probabilities a bit, which makes sense since the result should be lower that what we have right now. How to do it though?

Update 2: I think I found the answer. See below.

Best Answer

Well, you can certainly compute $p(n+1)/p(n)$. But it's not a trivial expression. Let $q(n)=1-p(n)$. Then we know that $$q(n+1)=q(n)\left(1-\frac{n}{365}\right)$$ So $$\begin{align}p(n+1) = 1-q(n+1) &= 1-q(n)\left(1-\frac{n}{365}\right)\\& = 1-(1-p(n))\left(1-\frac{n}{365}\right)\\=p(n) +\frac{(1-p(n))n}{365}\end{align}$$

So $\dfrac{p(n+1)}{p(n)}$ is going to be a messy expression of $p(n)$.

Similarly $p(n+1)-p(n)=\frac{(1-p(n))n}{365}$.

Essentially, this means that we get a matching birthday in $n+1$ people if we get a matching birthday in the first $n$, or if we don't in the first $n$ and the $n+1$st matches one of the $n$ previous ones.

For the example of $n=3$:

$$\begin{align}p(1)&=0\\p(2)&=p(1)+\frac{1(1-p(1))}{365} = \frac{1}{365}\\p(3)&=p(2)+\frac{2(1-p(2))}{365}=\frac{1}{365} + \frac{2\cdot 364}{365^2}\approx 0.0082042\end{align}$$