[Math] Proving that Shannon entropy is maximal for the uniform distribution using convexity

entropy, information theory, probability, statistical mechanics, uniform distribution

I need to show that $-\sum_i p_i \log p_i$ is maximal iff $p_i = p_j$ for all $i \neq j$, using the convexity inequality:

$$\phi \left(\frac{\sum{a_i}}{N}\right)\leq \frac{\sum{\phi (a_i)}}{N}$$

I tried expanding with the cross-entropy and the KL divergence,

$$-\sum_i{\frac{1}{N} \log{\frac{1}{N}}}=-\sum_i{\frac{1}{N} \log{p_i}}-\sum_i{\frac{1}{N} \log{\frac{1}{N p_i}}},$$

but I only got trivial answers like $\log N \leq \log N$.
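
As a quick numerical sanity check of that decomposition, $H(u) = H(u, p) - D_{\mathrm{KL}}(u \,\|\, p)$ with $u$ the uniform distribution, here is a minimal sketch assuming NumPy; the vector `p` is just an arbitrary example distribution:

```python
import numpy as np

# Arbitrary example distribution (any probability vector works here).
p = np.array([0.5, 0.25, 0.125, 0.125])
N = len(p)
u = np.full(N, 1.0 / N)  # uniform distribution

entropy_u = -np.sum(u * np.log(u))       # H(u) = log N
cross_entropy = -np.sum(u * np.log(p))   # H(u, p)
kl = np.sum(u * np.log(u / p))           # D_KL(u || p) >= 0

# Decomposition: H(u) = H(u, p) - D_KL(u || p)
print(entropy_u, cross_entropy - kl, np.log(N))
```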

Best Answer

Let $\phi(x) = x\log x$, with the convention $\phi(0) = 0$. This function is strictly convex on the interval $[0, 1]$.

The entropy $S$ is defined as:

$$S(p) = -\sum \phi(p_i),$$

where $p=[p_1, p_2, \ldots, p_N]$.

Then, using Jensen's inequality, you get:

$$ \phi \left(\frac{\sum{p_i}}{N}\right)\leq \frac{\sum{\phi (p_i)}}{N} \Rightarrow \phi \left(\frac{\sum{p_i}}{N}\right)\leq -\frac{S(p)}{N} \Rightarrow S(p) \leq -N \phi \left(\frac{\sum{p_i}}{N}\right).$$

Notice that, whatever $p$ is, by definition $\sum p_i = 1$, and hence:

$$S(p) \leq -N\phi\left(\frac{1}{N}\right) = -N\left(\frac{1}{N}\log\frac{1}{N}\right) = \log N.$$

This means that the entropy is at most equal to $\log N$.
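
As an illustration of this bound, here is a minimal numerical sketch, assuming NumPy; the distributions are drawn at random and nothing is specific to them:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5

def entropy(p):
    """S(p) = -sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# Random distributions on N outcomes never exceed log N ...
for _ in range(1000):
    assert entropy(rng.dirichlet(np.ones(N))) <= np.log(N) + 1e-12

# ... and the uniform distribution attains the bound exactly.
print(entropy(np.full(N, 1.0 / N)), np.log(N))
```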

Now, notice that $S(p) = \log N$ for $p = \left[\frac{1}{N}, \ldots, \frac{1}{N}\right]$, so the bound is attained:

$$p_i = \frac{1}{N} \;\;\forall i \implies S(p) ~\text{is maximal}.$$

We conclude the proof by observing that the maximizer $p_i = \frac{1}{N} ~\forall i$ is unique, since $\phi$ is strictly convex (equivalently, $S(p)$ is strictly concave).
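
To spell out the uniqueness step, here is the equality case of Jensen's inequality for a strictly convex $\phi$ (a standard fact, stated here for completeness):

$$\phi\!\left(\frac{\sum p_i}{N}\right) = \frac{\sum \phi(p_i)}{N} \iff p_1 = p_2 = \cdots = p_N,$$

so $S(p) = \log N$ can only hold when every $p_i$ equals $\frac{1}{N}$.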
