[Math] Bins and balls problem: expected number of balls placed given number of empty and non-empty bins

balls-in-binscombinatoricsprobabilityprobability distributions

Given are $m$ bins with equal probability of choosing one of them. Unknown number of balls $n$ is placed into the bins, and, at the end of placement, we observe number of empty bins $m_e$ and non-empty bins $m_{n}$.

Given $m$, $m_e$, $m_n$, what is the most likely number of balls $n$, which have been placed into bins?

(UPD) possible additional information: number of bins with exactly one ball can be also known.

Best Answer

Presumably $m=m_e+m_n$.

The likelihood of seeing $m_n$ out of $m$ occupied is $\dfrac{S_2(n,m_n)\, m!}{m^n \;(m-m_n)!}$ where $S_2(x,y)$ is a Stirling number of the second kind. I do not know an easy way of finding the $n$ which maximises this in general but it is easy enough to calculate for given reasonably small $m$.

One possible alternative estimator is $\dfrac{\log(m)-\log(m-m_n)}{\log(m)-\log(m-1)}$, which follows from saying $E[m_n \mid m, n] = m\left(1 -\left(\frac{m-1}{m}\right)^n\right)$. This is neither a maximum likelihood estimator nor an unbiased estimator for $n$, and it is usually not an integer, but it comes fairly close to the maximum likelihood estimator; it is slightly biased upwards when $1 \lt n \lt m$. The table below compares the values when $m=10$ for different $m_n$

 m      m_n   n_MLE    n_alt.est 
10       0     0        0   
10       1     1        1   
10       2     2        2.12  
10       3     3        3.39
10       4     4 or 5   4.85
10       5     6        6.58
10       6     8        8.70
10       7    11       11.43
10       8    15       15.25 
10       9    22       21.85
10      10    infinite  n/a
Related Question