[Math] Is the expected value $E[\log(n_{j})] = \log(n)/m$ correct?

probability

Given $n = n_0 + n_1 + \cdots + n_{m-1}$, the average value of $n_{j}$ is $E[n_{j}] = n/m$ (here $E$ denotes expectation in the sense of probability). Note that $n$ is not a random variable, but each $n_{j}$ is.

Does that mean $E[\log(n_{j})] = \log(n)/m$? If not, can we find a function $f$ such that $E[\log(n_{j})] = f(n,m)$?

PS: This is in the context of the analysis of hashing with chaining under the simple uniform hashing assumption, i.e. any given element is equally likely to hash into any of the $m$ slots, independently of where any other element has hashed. Also, $\alpha = n/m$ is called the load factor.

Best Answer

You seem to be thinking of a multinomial scenario: we throw $n$ balls into $m$ equiprobable bins, and $n_j$ is the number of balls in bin $j$ (multinomial distribution). In that case, each $n_j$ has a binomial distribution, with $E(n_j) = np$ and $\sigma^2_{n_j} = np(1-p)$, where $p = 1/m$.
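
For intuition, here is a minimal simulation sketch (assuming Python with NumPy; the values of $n$, $m$ and the number of trials are arbitrary illustrative choices) that draws multinomial counts and checks that the marginal count of a single bin matches the binomial mean $np$ and variance $np(1-p)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 1_000, 20            # n balls thrown into m equiprobable bins
p = 1.0 / m

# 100,000 independent trials of the balls-into-bins experiment
counts = rng.multinomial(n, [p] * m, size=100_000)
n_0 = counts[:, 0]          # marginal count of bin 0 in each trial

print(n_0.mean(), n * p)            # empirical mean vs. n*p = 50
print(n_0.var(), n * p * (1 - p))   # empirical variance vs. n*p*(1-p) = 47.5
```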

You wish to compute $E[\log(n_j)]$. But that is ill defined (actually, it diverges), because $P(n_j=0)>0$; we could cheat and compute $E[\log(n_j+1)]$ instead, or set $\log(0)\,p(0)=0$. It does not seem to have a simple closed-form expression. Using the general Taylor expansion

$$E[g(X)] = g(\mu) + \frac{1}{2!}\, g''(\mu)\, m_2 + \frac{1}{3!}\, g^{(3)}(\mu)\, m_3 + \cdots $$

where $\mu = E[X]$ and $m_k = E[(X-\mu)^k]$, we get, in our case, with $X \sim \mathrm{Binom}(n,p)$ and $g(X)= \log(X)$, so that $\mu = np$, $g''(\mu) = -1/(np)^2$ and $m_2 = np(1-p)$:

$$E[\log(X)] \approx \log(n p) - \frac{1-p}{2 n p} + \cdots = \log(\alpha) - \frac{1-p}{2 \alpha} + \cdots $$

with $\alpha=n/m = n p$, but this does not seem very useful as an asymptotic expansion with fixed $\alpha$. Of course, you have the bound (already given by Jensen's inequality) $E[\log(n_j)] \le \log E(n_j) = \log(\alpha)$.
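
As a quick numerical sanity check, here is a sketch assuming Python with NumPy/SciPy that uses the convention $\log(0)\,p(0)=0$ from above and compares the resulting exact sum against the two-term Taylor approximation and the Jensen bound; the values of $n$ and $m$ below are arbitrary illustrative choices:

```python
import numpy as np
from scipy.stats import binom

n, m = 10_000, 100          # n balls, m bins
p = 1.0 / m                 # probability a given ball lands in bin j
alpha = n / m               # load factor

k = np.arange(1, n + 1)     # skip k = 0, i.e. set log(0)*P(0) = 0
exact = np.sum(np.log(k) * binom.pmf(k, n, p))

taylor = np.log(alpha) - (1 - p) / (2 * alpha)   # two-term expansion
jensen = np.log(alpha)                           # upper bound

print(f"exact (with log(0)P(0)=0): {exact:.6f}")
print(f"Taylor approximation:      {taylor:.6f}")
print(f"Jensen bound log(alpha):   {jensen:.6f}")
```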
