Probability that $k$ out of $m$ bins of limited capacity are full after throwing $n$ balls

balls-in-binscombinatoricsinclusion-exclusionprobability distributions

Consider $m$ distinguishable bins of limited capacity $c$ each. After sequentially assigning $n$ indistinguishable balls uniformly (over all bins that are NOT yet full), what is the probability that $k$ out of the $m$ bins are full, i.e. contain exactly $c$ balls?

EDIT: I am considering the mechanism in which balls are launched into the bins sequentially, rather than laid out simultaneously.

From the answer to The probability of distributing K balls over N boxes of size M with at least Q boxes empty. and with the help of https://www.mathpages.com/home/kmath337/kmath337.htm I understand that the number of ways to allocate $n$ indistinguishable balls to $m$ distinguishable bins of capacity $c$ is given by
$$N(n,m,c)= \sum_{v=0}^{m}\left(-1\right)^{v}
{m \choose v}
{ m +n -v\left(c+1\right)-1\choose
n -v\left(c+1\right)}$$

The number of ways to do so with exactly $k$ bins being full is
$$N_k(n,m,c)={m \choose k } N(n-k\cdot c, m-k, c-1)$$

Contrary to what the answer cited above suggests, these ways do not appear to be equally likely, however, so using
$$P(k)=\frac{N_k(n,m,c)}{N(n,m,c)}$$
appears to be incorrect. To see this, consider the special case $n=3$, $m=3$, $c=2$. The probability that none of the bins is full should be $2/9$, while the probability that exactly one bin is full should be $7/9$.

full bins with limited capacity after throwing balls addresses this question for the case of a distribution that is uniform over ALL bins, whereas I am interested in the case in which it is uniform over bins that are still available.

Best Answer

First let's try and make clear that

distributing undistinguishable balls into distinguishable bins does not fully specify which stochastic mechanism we are actually considering, and that is frequently cause of misunderstanding and erroneous conclusions.

Second, please allow me to change the symbols as to keep congruence with other related posts that I am going to cite.
So let's speak of $s$ undistinguishable balls, put into $m$ distinguishable bins, each with the same max capacity $r$.

a) Balls laid into the bins

This is what is considered in the Mathpage article that you cite.

In this case we are looking for $$N_{\,b} (s,r,m) = \text{No}\text{. of solutions to}\;\left\{ \begin{gathered} 0 \leqslant \text{integer }x_{\,j} \leqslant r \hfill \\ x_{\,1} + x_{\,2} + \cdots + x_{\,m} = s \hfill \\ \end{gathered} \right.$$ which is given by the closed sum $$ N_b (s,r,m)\quad \left| {\;0 \leqslant \text{integers }s,m,r} \right.\quad = \sum\limits_{\left( {0\, \leqslant } \right)\,\,k\,\,\left( { \leqslant \,\frac{s}{r+1}\, \leqslant \,m} \right)} {\left( { - 1} \right)^k \binom{m}{k} \binom { s + m - 1 - k\left( {r + 1} \right) } { s - k\left( {r + 1} \right)}\ } $$ as thoroughly explained in this related post.
In particular note the way of expressing the second binomial, which allows to waive from the bounds on the sum.

Also note that the "mechanism" of laying the balls in the bins, when the capacity is unlimited leads to a total
number of ways which is $$ N_b (s,s,m) = \left( \matrix{ s + m - 1 \cr s \cr} \right) $$ i.e. the number of weak compositions of $s$ into $m$ parts, which also is the "Stars&bars" mechanism, and by that we can say that we "are launching the bins (the separators, the bars) into the balls".

Then your question turns out into computing :
- the number of ways to choose $q$ out of $m$ bins to fill up;
- the number of ways to distribute the remaining $s-qr$ balls into $m-q$ bins, with capacity $r-1$
i.e. $$ \bbox[lightyellow] { N_f (s,r,m,q) = \left( \matrix{ m \cr q \cr} \right)N_b (s - qr,r - 1,m - q) }$$

b) Balls thrown into the bins

Instead, by "launching the balls into the bins" normally it is understood that for each ball we have $m$ choices where to launch it and thus a total of $$m^s$$ equiprobable events, when the capacity is not limited.
That is quite different from the above, and corresponds to the "mechanism" in which the balls are labelled with the launching sequence, and they land and stack one over the other inside each bin. So each bin is either empty or contains a subset of $\{1,2, \cdots, s \}$.

Now, $m^s$ is the number of $s$-tuples $(b_1, b_2, \ldots, b_s)$, with $b_k$ representing the landing bin of the $k$-th ball.
But this representation is not helpful for counting the number of balls into the same bin, and we have better to refer to the following splitting of $m^s$ $$ \eqalign{ & m^{\,s} = \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,m} \right)} {\left\{ \matrix{ s \cr k \cr} \right\}m^{\,\underline {\,k\,} } } = \cr & = \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,m} \right)} {\underbrace {\;\left( \matrix{ m \cr k \cr} \right)\;}_{\matrix{ {{\rm choice}\,k\,} \cr {{\rm non - empty}\,{\rm bins}} \cr } }\underbrace {\;\left\{ \matrix{ s \cr k \cr} \right\}\;}_{\matrix{ {{\rm partition }\left\{ {{\rm 1}{\rm ,} \cdots {\rm ,s}} \right\}} \cr {{\rm into}\,k\,{\rm sub - sets}} \cr } }\underbrace {\,k!\;}_{\matrix{ {{\rm permute}\,{\rm the}} \cr {{\rm }k{\rm subsets(bins)}} \cr } }} \cr} $$ which hinges upon the the Stirling N. of the 2nd kind.

Introducing the limitation on the capacity of the bins, i.e. on the size of the sub-sets, we need to call into play the Restrained Stirling N. 2nd kind, indicated by $\left\{ \matrix{ s \cr k \cr} \right\}_{\,r}$.

Necessarily proceeding very concisely and schematically,
denote denote as

$ L_{\,b\,} (s,r,m) $
the No. of lists of $m$ sub-sets $ \left[ {\left\{ {S_{\,1} } \right\},\left\{ {S_{\,2} } \right\}, \cdots ,\left\{ {S_{\,m} } \right\}} \right]$
partitioning $\left\{ {1,\,2,\, \cdots ,\,s} \right\}$;
the sub-sets have size $\le r$, and might be also empty, and their order in the list counts.

so that it is

$$ L_{\,b\,} (s,r,m) = \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,m} \right)} {\left\{ \matrix{ s \cr k \cr} \right\}_{\,r} m^{\,\underline {\,k\,} } } \;\;:\quad L_{\,b\,} (s,s,m) = m^{\,s} $$

Then, denoting with $c_1, c_2,\ldots, c_m$ the size of the $m$ subsets, these will represent a weak composition of $s$ into $m$ parts not greater than $r$, and the number of ways to compose the $m$ subsets will be $$ \eqalign{ & L_{\,b\,} (s,r,m) = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} \cr} \right)\left( \matrix{ s - c_{\,1} \cr \,c_{\,2} \cr} \right) \cdots \left( \matrix{ s - c_{\,1} - \,c_{\,2} - \cdots - c_{\,m - 1} \cr \,c_{\,m} \cr} \right)} = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} \cr} $$

Finally we can split $L_b$ according to the exact number ($j$ in the addends below) of the bins saturated at the max capacity $r$ $$ \eqalign{ & L_{\,b\,} (s,r,m) = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} + \cr & + \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m - 1} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m - 1} = s - r} \cr } } \right.} {\left( \matrix{ m \cr 1 \cr} \right)\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m - 1} ,r \cr} \right)} + \cr & + \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m - 2} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m - 2} = s - 2r} \cr } } \right.} {\left( \matrix{ m \cr 2 \cr} \right)\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m - 2} ,r,r \cr} \right)} + \cr & \quad \quad \quad \quad \quad \quad \quad \quad \vdots \cr & + \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m - 2} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m - \left\lfloor {s/r} \right\rfloor } = s - \left\lfloor {s/r} \right\rfloor r} \cr } } \right.} {\left( \matrix{ m \cr \left\lfloor {s/r} \right\rfloor \cr} \right)\left( \matrix{ s \cr c_{\,1} ,\,\, \cdots ,c_{\,m - \left\lfloor {s/r} \right\rfloor } ,\underbrace {r, \cdots ,r}_{\left\lfloor {s/r} \right\rfloor } \cr} \right)} = \cr & = \sum\limits_{0\, \le \,j\, \le \,\left\lfloor {s/r} \right\rfloor } {\left( \matrix{ m \cr j \cr} \right){{s!} \over {\left( {s - j\,r} \right)!\left( {r!} \right)^{\,j} }}L_{\,b\,} (s - j\,r,\;r - 1,\;m - j)} \cr} $$ or adding the initial conditions, so that it can be used also as a recurrence $$ \bbox[lightyellow] { \eqalign{ & L_{\,b\,} (s,r,m) = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} = \cr & = \left[ {0 = r = s} \right] + \sum\limits_{\left( {0\, \le } \right)\,j\, \le \,\left\lfloor {s/r} \right\rfloor } {\left( \matrix{ m \cr j \cr} \right){{s!} \over {\left( {s - j\,r} \right)!\left( {r!} \right)^{\,j} }}L_{\,b\,} (s - j\,r,\;r - 1,\;m - j)} = \cr & = \left[ {0 = r = s} \right] + \sum\limits_{\left( {0\, \le } \right)\,j\,\left( { \le \,\left\lfloor {s/r} \right\rfloor \, \le \,m} \right)} {\left( \matrix{ m \cr j \cr} \right)\left( \matrix{ s \cr j\,r \cr} \right){{\left( {j\,r} \right)!} \over {\left( {r!} \right)^{\,j} }}L_{\,b\,} (s - j\,r,\;r - 1,\;m - j)} \cr} }$$ where the square brackets at the beginning are Iverson bracket.

Best Answer

Related Solutions

Probability – Balls in Bins, Probability That Exactly Two Bins Are Empty

Variance and mean of balls in bins limited capacity

Related Question