Variance and mean of balls in bins limited capacity

combinationscombinatoricsmeansvariance

Let there be $m$ indistinguishable balls, $k$ bins, $C$ capacity.
Let $X_j$ denote the total balls in bin $j$.
I've seen ways to calculate the total number of combinations, but I'm not sure how to go about calculating the mean and variance of $X_j$.
It is understood that the ball is always thrown into one of the empty bins with uniform probability. edit: nonfull bins not empty bins sorry.

Best Answer

This seems computationally very difficult to me. You might do better with simulation.

It $C\ge m$ then the constraint has no effect, and we are just looking for the mean and variance of the number of balls in bin $j.$ This has a binomial distribution, and the answer is well known.

When ${m\over k}\leq C< M,$ the situation is much more complicated, because the result depends on the sequence in which the balls were thrown into the bins. Before any bin fills up, a ball has probability $1/k$ of landing in bin $j.$ After some other ball fills up, it has probability $1/(k-1)$ of landing in bin $j$. The situation is even more complicated if it's possible for more than one ball to fill up.

Consider the case $k=3, j=1.$ Let $P(x_1,x_2,x_3)$ be the probablity that at some point there are $x_i$ balls in bin $i$ for $i=1,2,3.$ To compute the expectation, we have to sum up terms of the form $aP(a,b,c)$ where $a+b+c=m.$ If $\max\{a,b,c\}<C,$ we have simply $$P(a,b,c)={m\choose a,b,c}3^{-n}\tag{1}$$ But suppose $c=C.$ Then $$P(a,b,c)=P(a,b,C)=\frac12P(a-1,b,C)+\frac12P(a,b-1,C)+\frac13P(a,b,C-1)\tag{2}$$ taking account of all bins the last ball might have been thrown into.

The last term on the right-hand side of $(2)$ can be computed directly from $(1),$ but the first two have to be computed recursively from $(2)$. It won't take many bins, or many balls, before this computation becomes unwieldy, if the capacity constraint is effective.

There are so many possible sequences of throws that memoization is unlikely to be effective, so far as I can see. I've been tying to think of ways to simplify the calculations, but so far, I don't have a glimmer.