Balls in Bins – Probability of Exactly $m$ Cells Containing Exactly $k$ Balls

balls-in-binscombinatoricsprobabilityprobability distributions

$r$ balls are randomly distributed into $n$ cells (the balls are indistinguishable).

What is the probability that there is exactly $m$ cells that contains exactly $k$ balls (each one)? That is, the rest of the cells must contain $j \neq k$ balls, each one.

My attempt was:

$$\frac
{ {n \choose m} {{r-mk+n-m-1} \choose {r-mk}} }
{ {{r+n-1} \choose {n-1}} } $$

Explain: we distribute $r$ indistinguishable balls into $n$ cells, so that's $|\Omega|={{r+n-1} \choose {n-1}}$ in the denominator.

In the numerator, I choose $m$ cells and puts inside each one of them $k$ balls (that make $mk$ balls in total. we have ${n \choose m}$ possibilities of doing that). Then, the rest of the $r-mk$ balls are distributed to the $n-m$ remaining cells (that's the ${{r-mk+n-m-1} \choose {r-mk}}$).

I believe my mistake was that i didn't made sure that the rest of the $n-m$ cells does not containing $k$ balls as well, so the numerator is "too big".
But i have no idea how to fix this.

Best Answer

As already mentioned in another answer, dealing with a probability problem one must treat all the balls as distinct (for example we can imagine the balls be numbered from $1$ to $r$) so that the number of all possible events is $n^r$.

Let
$$ N(r,n,k,m) $$ be the number of the events with exactly $m$ cells containing exactly $k$ balls.

The solution for $m=0$ can be easily found with help of the inclusion-exclusion principle and reads: $$ N(r,n,k,0)=\sum_{j=0}^{\min(n,\frac rk)}(-1)^j\binom nj\frac{r!}{(k!)^j(r-kj)!}(n-j)^{r-kj}, $$ where the sum starts with the unrestricted number of possible events $n^r$. Generally the structure of the term is the following: the factor $\binom nj$ stays for the number of ways to choose $j$ cells filled with $k$ balls each, the multinomial coefficient $\frac{r!}{(k!)^j(r-kj)!}=\binom r{\underbrace{k,k,\dots,k}_j,r-kj}$ counts the number of ways to fill these $j$ cells with $k$ balls each, and finally $(n-j)^{r-kj}$ counts the number of ways to distribute the remaining $r-kj$ balls between the remaining $n-j$ cells.

Recognizing this the answer to the general case is simply: $$ \begin{aligned} N(r,n,k,m)&=\binom nm\frac{r!}{(k!)^m(r-km)!}N(r-km,n-m,k,0)\\ &=\binom nm\frac{r!}{(k!)^m(r-km)!}\sum_{j=m}^{\min(n,\frac rk)}(-1)^{j-m}\binom {n-m}{j-m}\frac{(r-km)!}{(k!)^{j-m}(r-kj)!}(n-j)^{r-kj}\\ &=\frac{n!r!}{m!}\sum_{j=m}^{\min(n,\frac rk)}\frac{(-1)^{j-m}(n-j)^{r-kj}}{(n-j)!(j-m)!(k!)^{j}(r-kj)!},\\ \end{aligned} $$ and the probability in question reads: $$ \frac{N(r,n,k,m)}{n^r}. $$