I am looking for a formula to efficiently compute the probability of choosing $I$ balls with distinct colors from an urn with $N$ balls and $K$ colors, where $K$ might be larger than $I$, without replacement.
As a concrete example: Let's assume we have 10 blue, 139 white, 44 green, and 1 red balls (so $N = 194$ and $K = 4$). Let's assume that we want to draw 2 balls with distinct colors (so $I=2$). For this example, it is tractable to enumerate all possibilities. For example, I computed the probability of first choosing a blue ball and then either a white, green, or red one by computing $\frac{10}{194} * \frac{139+44+1}{193}=0.049$. By computing also the probabilities for the other combinations and summing them all up, I obtained an overall probability of about $43\%$.
In the actual case I want to compute this for, I have a significantly larger number of $K$ and $N$. Also, $I$ is about $K/2$, making it infeasible to naively enumerate the different probabilities. Is there a more efficient way to compute this?
Best Answer
Just use generating functions , lets say that one of the color has $f$ balls in a color , so the generating function of this color is $$\binom{f}{1}x$$
Lets assume that we have $K$ distinct color of balls and they have $f_1 ,f_2,f_K$ balls.Moreover , we want to choose $I$ balls with distinct colors , so find the coefficient of $x^I$ in the expansion of $$\bigg[1+\binom{f_1}{1}x\bigg]\bigg[1+\binom{f_2}{1}x\bigg]..\bigg[1+\binom{f_K}{1}x\bigg]$$.
Moreover, we know that the denominator is $\binom{N}{I}$ ,because all balls are seem distinct when computing probability , so the answer is $$\frac{[x^I]\bigg[\bigg(1+\binom{f_1}{1}x\bigg)\bigg(1+\binom{f_2}{1}x\bigg)..\bigg(1+\binom{f_K}{1}x\bigg)\bigg]}{\binom{N}{I}}$$
$\mathbf{EDITION:}$ For $I=2$
Generating function for $10$ blue ball : $1+\binom{10}{1}x$
Generating function for $139$ white ball : $1+\binom{139}{1}x$
Generating function for $44$ green ball : $1+\binom{44}{1}x$
Generating function for $1$ red ball : $1+\binom{1}{1}x$
where $1$ represent $x^0$ ,i.e, we did not make selection , so by expansion
In link , we see that the coefficient of $x^2$ is $8139$ , so the answer is $$\frac{8139}{C(194,2) =18,721}= 0,434752417$$