This is a counting problem: there are $b^n$ possible assignments of $b$ birthdays to $n$ people. Of those, let $q(k; n, b)$ be the number of assignments for which no birthday is shared by more than $k$ people but at least one birthday actually is shared by $k$ people. The probability we seek can be found by summing the $q(k;n,b)$ for appropriate values of $k$ and multiplying the result by $b^{-n}$.
These counts can be found exactly for values of $n$ less than several hundred. However, they will not follow any straightforward formula: we have to consider the patterns of ways in which birthdays can be assigned. I will illustrate this in lieu of providing a general demonstration. Let $n = 4$ (this is the smallest interesting situation). The possibilities are:
- Each person has a unique birthday; the code is {4}.
- Exactly two people share a birthday; the code is {2,1}.
- Two people have one birthday and the other two have another; the code is {0,2}.
- Three people share a birthday; the code is {1,0,1}.
- Four people share a birthday; the code is {0,0,0,1}.
Generally, the code $\{a[1], a[2], \ldots\}$ is a tuple of counts whose $k^\text{th}$ element stipulates how many distinct birthdates are shared by exactly $k$ people. Thus, in particular,
$$1 a[1] + 2a[2] + ... + k a[k] + \ldots = n.$$
Note, even in this simple case, that there are two ways in which the maximum of two people per birthday is attained: one with the code $\{0,2\}$ and another with the code $\{2,1\}$.
We can directly count the number of possible birthday assignments corresponding to any given code. This number is the product of three terms. One is a multinomial coefficient; it counts the number of ways of partitioning $n$ people into $a[1]$ groups of $1$, $a[2]$ groups of $2$, and so on. Because the sequence of groups does not matter, we have to divide this multinomial coefficient by $a[1]!a[2]!\cdots$; its reciprocal is the second term. Finally, line up the groups and assign them each a birthday: there are $b$ candidates for the first group, $b-1$ for the second, and so on. These values have to be multiplied together, forming the third term. It is equal to the "factorial product" $b^{(a[1]+a[2]+\cdots)}$ where $b^{(m)}$ means $b(b-1)\cdots(b-m+1)$.
There is an obvious and fairly simple recursion relating the count for a pattern $\{a[1], \ldots, a[k]\}$ to the count for the pattern $\{a[1], \ldots, a[k-1]\}$. This enables rapid calculation of the counts for modest values of $n$. Specifically, $a[k]$ represents $a[k]$ birthdates shared by exactly $k$ people each. After these $a[k]$ groups of $k$ people have been drawn from the $n$ people, which can be done in $x$ distinct ways (say), it remains to count the number of ways of achieving the pattern $\{a[1], \ldots, a[k-1]\}$ among the remaining people. Multiplying this by $x$ gives the recursion.
I doubt there is a closed form formula for $q(k; n, b)$, which is obtained by summing the counts for all partitions of $n$ whose maximum term equals $k$. Let me offer some examples:
With $b=5$ (five possible birthdays) and $n=4$ (four people), we obtain
$$\eqalign{
q(1) &= q(1;4,5) &= 120 \\
q(2) &= 360 + 60 &= 420 \\
q(3) &&= 80 \\
q(4) &&= 5.\\
}$$
Whence, for example, the chance that three or more people out of four share the same "birthday" (out of $5$ possible dates) equals $(80 + 5)/625 = 0.136$.
As another example, take $b = 365$ and $n = 23$. Here are the values of $q( k;23,365)$ for the smallest $k$ (to six sig figs only):
$$\eqalign{
k=1: &0.49270 \\
k=2: &0.494592 \\
k=3: &0.0125308 \\
k=4: &0.000172844 \\
k=5: &1.80449E-6 \\
k=6: &1.48722E-8 \\
k=7: &9.92255E-11 \\
k=8: &5.45195E-13.
}$$
Using this technique, we can readily compute that there is about a 50% chance of (at least) a three-way birthday collision among 87 people, a 50% chance of a four-way collision among 187, and a 50% chance of a five-way collision among 310 people. That last calculation starts taking a few seconds (in Mathematica, anyway) because the number of partitions to consider starts getting large. For substantially larger $n$ we need an approximation.
One approximation is obtained by means of the Poisson distribution with expectation $n/b$, because we can view a birthday assignment as arising from $b$ almost (but not quite) independent Poisson variables each with expectation $n/b$: the variable for any given possible birthday describes how many of the $n$ people have that birthday. The distribution of the maximum is therefore approximately $F(k)^b$ where $F$ is the Poisson CDF. This is not a rigorous argument, so let's do a little testing. The approximation for $n = 23$, $b = 365$ gives
$$\eqalign{
k=1: &0.498783 \\
k=2: &0.496803\\
k=3: &0.014187\\
k=4: &0.000225115.
}$$
By comparing with the preceding you can see that the relative probabilities can be poor when they are small, but the absolute probabilities are reasonably well approximated to about 0.5%. Testing with a wide range of $n$ and $b$ suggests the approximation is usually about this good.
To wrap up, let's consider the original question: take $n = 10,000$ (the number of observations) and $b = 1\,000\,000$ (the number of possible "structures," approximately). The approximate distribution for the maximum number of "shared birthdays" is
$$\eqalign{
k=1: &0 \\
k=2: &0.8475+\\
k=3: &0.1520+\\
k=4: &0.0004+\\
k\gt 4: &\lt 1E-6.
}$$
(This is a fast calculation.) Clearly, observing one structure 10 times out of 10,000 would be highly significant. Because $n$ and $b$ are both large, I expect the approximation to work quite well here.
Incidentally, as Shane intimated, simulations can provide useful checks. A Mathematica simulation is created with a function like
simulate[n_, b_] := Max[Last[Transpose[Tally[RandomInteger[{0, b - 1}, n]]]]];
which is then iterated and summarized, as in this example which runs 10,000 iterations of the $n = 10000$, $b = 1\,000\,000$ case:
Tally[Table[simulate[10000, 1000000], {n, 1, 10000}]] // TableForm
Its output is
2 8503
3 1493
4 4
These frequencies closely agree with those predicted by the Poisson approximation.
Best Answer
This mistake was put in evidence in written conversations among Fermat, Pascal, and eminent French mathematicians in 1654 when the former two were considering the "problem of points." A simple example is this:
The false argument begins by examining the set of possible outcomes, which we can enumerate:
Because Player A has two chances of winning and B has only one chance, the odds in favor of B are (according to this argument) 1:2; that is, B's chances are 1/3. Among those defending this argument were Gilles Personne de Roberval, a founding member of the French Academy of Sciences.
The mistake is plain to us today, because we have been educated by people who learned from this discussion. Fermat argued (correctly, but not very convincingly) that case (1) really has to be considered two cases, as if the game had been played out through both flips no matter what. Invoking a hypothetical sequence of flips that wasn't actually played out makes many people uneasy. Nowadays we might find it more convincing just to work out the probabilities of the individual cases: the chance of (1) is 1/2 and the chances of (2) and (3) are each 1/4, whence the chance that A wins equals 1/2 + 1/4 = 3/4 and the chance that B wins is 1/4. These calculations rely on axioms of probability, which were finally settled early in the 20th century, but were essentially established by the fall of 1654 by Pascal and Fermat and popularized throughout Europe three years later by Christian Huyghens in his brief treatise on probability (the first ever published), De ratiociniis in ludo aleae (calculating in games of chance).
The present question can be modeled as 100 coin flips, with heads representing death and tails representing survival. The argument for "1 in 100" (which really should be 1/101, by the way) has exactly the same flaw.