This is a counting problem: there are $b^n$ possible assignments of $b$ birthdays to $n$ people. Of those, let $q(k; n, b)$ be the number of assignments for which no birthday is shared by more than $k$ people but at least one birthday actually is shared by $k$ people. The probability we seek can be found by summing the $q(k;n,b)$ for appropriate values of $k$ and multiplying the result by $b^{-n}$.
These counts can be found exactly for values of $n$ less than several hundred. However, they will not follow any straightforward formula: we have to consider the patterns of ways in which birthdays can be assigned. I will illustrate this in lieu of providing a general demonstration. Let $n = 4$ (this is the smallest interesting situation). The possibilities are:
- Each person has a unique birthday; the code is {4}.
- Exactly two people share a birthday; the code is {2,1}.
- Two people have one birthday and the other two have another; the code is {0,2}.
- Three people share a birthday; the code is {1,0,1}.
- Four people share a birthday; the code is {0,0,0,1}.
Generally, the code $\{a[1], a[2], \ldots\}$ is a tuple of counts whose $k^\text{th}$ element stipulates how many distinct birthdates are shared by exactly $k$ people. Thus, in particular,
$$1 a[1] + 2a[2] + ... + k a[k] + \ldots = n.$$
Note, even in this simple case, that there are two ways in which the maximum of two people per birthday is attained: one with the code $\{0,2\}$ and another with the code $\{2,1\}$.
We can directly count the number of possible birthday assignments corresponding to any given code. This number is the product of three terms. One is a multinomial coefficient; it counts the number of ways of partitioning $n$ people into $a[1]$ groups of $1$, $a[2]$ groups of $2$, and so on. Because the sequence of groups does not matter, we have to divide this multinomial coefficient by $a[1]!a[2]!\cdots$; its reciprocal is the second term. Finally, line up the groups and assign them each a birthday: there are $b$ candidates for the first group, $b-1$ for the second, and so on. These values have to be multiplied together, forming the third term. It is equal to the "factorial product" $b^{(a[1]+a[2]+\cdots)}$ where $b^{(m)}$ means $b(b-1)\cdots(b-m+1)$.
There is an obvious and fairly simple recursion relating the count for a pattern $\{a[1], \ldots, a[k]\}$ to the count for the pattern $\{a[1], \ldots, a[k-1]\}$. This enables rapid calculation of the counts for modest values of $n$. Specifically, $a[k]$ represents $a[k]$ birthdates shared by exactly $k$ people each. After these $a[k]$ groups of $k$ people have been drawn from the $n$ people, which can be done in $x$ distinct ways (say), it remains to count the number of ways of achieving the pattern $\{a[1], \ldots, a[k-1]\}$ among the remaining people. Multiplying this by $x$ gives the recursion.
I doubt there is a closed form formula for $q(k; n, b)$, which is obtained by summing the counts for all partitions of $n$ whose maximum term equals $k$. Let me offer some examples:
With $b=5$ (five possible birthdays) and $n=4$ (four people), we obtain
$$\eqalign{
q(1) &= q(1;4,5) &= 120 \\
q(2) &= 360 + 60 &= 420 \\
q(3) &&= 80 \\
q(4) &&= 5.\\
}$$
Whence, for example, the chance that three or more people out of four share the same "birthday" (out of $5$ possible dates) equals $(80 + 5)/625 = 0.136$.
As another example, take $b = 365$ and $n = 23$. Here are the values of $q( k;23,365)$ for the smallest $k$ (to six sig figs only):
$$\eqalign{
k=1: &0.49270 \\
k=2: &0.494592 \\
k=3: &0.0125308 \\
k=4: &0.000172844 \\
k=5: &1.80449E-6 \\
k=6: &1.48722E-8 \\
k=7: &9.92255E-11 \\
k=8: &5.45195E-13.
}$$
Using this technique, we can readily compute that there is about a 50% chance of (at least) a three-way birthday collision among 87 people, a 50% chance of a four-way collision among 187, and a 50% chance of a five-way collision among 310 people. That last calculation starts taking a few seconds (in Mathematica, anyway) because the number of partitions to consider starts getting large. For substantially larger $n$ we need an approximation.
One approximation is obtained by means of the Poisson distribution with expectation $n/b$, because we can view a birthday assignment as arising from $b$ almost (but not quite) independent Poisson variables each with expectation $n/b$: the variable for any given possible birthday describes how many of the $n$ people have that birthday. The distribution of the maximum is therefore approximately $F(k)^b$ where $F$ is the Poisson CDF. This is not a rigorous argument, so let's do a little testing. The approximation for $n = 23$, $b = 365$ gives
$$\eqalign{
k=1: &0.498783 \\
k=2: &0.496803\\
k=3: &0.014187\\
k=4: &0.000225115.
}$$
By comparing with the preceding you can see that the relative probabilities can be poor when they are small, but the absolute probabilities are reasonably well approximated to about 0.5%. Testing with a wide range of $n$ and $b$ suggests the approximation is usually about this good.
To wrap up, let's consider the original question: take $n = 10,000$ (the number of observations) and $b = 1\,000\,000$ (the number of possible "structures," approximately). The approximate distribution for the maximum number of "shared birthdays" is
$$\eqalign{
k=1: &0 \\
k=2: &0.8475+\\
k=3: &0.1520+\\
k=4: &0.0004+\\
k\gt 4: &\lt 1E-6.
}$$
(This is a fast calculation.) Clearly, observing one structure 10 times out of 10,000 would be highly significant. Because $n$ and $b$ are both large, I expect the approximation to work quite well here.
Incidentally, as Shane intimated, simulations can provide useful checks. A Mathematica simulation is created with a function like
simulate[n_, b_] := Max[Last[Transpose[Tally[RandomInteger[{0, b - 1}, n]]]]];
which is then iterated and summarized, as in this example which runs 10,000 iterations of the $n = 10000$, $b = 1\,000\,000$ case:
Tally[Table[simulate[10000, 1000000], {n, 1, 10000}]] // TableForm
Its output is
2 8503
3 1493
4 4
These frequencies closely agree with those predicted by the Poisson approximation.
Best Answer
You can see your argument is not correct by applying it to the standard birthday problem, where we know the probability is 50% at 23 people. Your argument would give $\frac{{23\choose 2}{365\choose 1}}{365^{23}}$, which is very small. The usual argument is to say that if we are going to avoid a coincidence we have $365-(k-1)$ choices for the $k$th person's birthday, so the probability of no coincidence in $K$ people is $\prod_{k=1}^K \frac{365-k+1}{365}$
Unfortunately, there is no such simple argument for more than two coincident birthdays. There is only one way (up to symmetry) for $k$ people to have no two-way coincidence, but there are many, many ways to have no four-way coincidence, so the computation as you add people is not straightforward. That's why R provides
pbirthday()
and why it is still only an approximation. I'd certainly hope this wasn't a class assignment.The reason your argument is not correct is that it undercounts the number of ways you can get 4 matching months. For example, it's not just that you can choose any month of the 12 as the matching one. You can also relabel the other 11 months arbitrarily (giving you a factor of 11! ). And your denominator of $12^{18}$ implies that the ordering of the people matters, so there are more than $18\choose 4$ orderings that have 4 matches.