[Math] How to maximize entropy

entropy, probability, proof-verification

In the probability book I am reading, I am asked to prove that the entropy of $X$ is maximized when $X$ is uniformly distributed.

At first I came up empty and decided to check online. Most of the proofs I found use the AM-GM inequality, which the book does not cover, so I wondered whether I could come up with a proof that relies only on material from the book.

I tried using the fact that
$$\begin{align} \ln x \le x-1 <x&\Rightarrow -x\ln x > -x^2 \\ &\Rightarrow -\sum_i p_i\ln p_i >- \sum_i p_i^2 \\ &\Rightarrow H(X) >- \sum_i p_i^2 \end{align}$$
To maximize $H(X)$, we should maximize $F(\mathbf{p})=-\sum_i p_i^2$ subject to the constraint $g(\mathbf{p})=\sum_i p_i=1$. Using the method of Lagrange multipliers, we get that $p_i=p_j \quad \forall\, i\ne j$.
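For reference, the computation I have in mind is the following sketch (ignoring the boundary of the simplex, i.e. assuming every $p_i>0$):
$$\begin{align} \frac{\partial}{\partial p_i}\Big(F(\mathbf{p})-\lambda\, g(\mathbf{p})\Big) = -2p_i-\lambda = 0 &\Rightarrow p_i = -\frac{\lambda}{2} \quad \forall i \\ &\Rightarrow p_i = p_j \quad \forall\, i\ne j, \end{align}$$
and together with $\sum_i p_i = 1$ this gives $p_i = \frac{1}{n}$.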

I would like to know whether this argument is correct, or whether there is an easier way to prove the result, given that the book assumes knowledge of differential equations, multivariable calculus, and linear algebra.

I was also wondering whether I could apply the method of Lagrange multipliers to the entropy itself.
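For concreteness, here is the computation I would attempt (again only a sketch, restricted to the interior where every $p_i>0$):
$$\begin{align} \frac{\partial}{\partial p_i}\left(-\sum_j p_j\ln p_j-\lambda\Big(\sum_j p_j-1\Big)\right) = -\ln p_i-1-\lambda = 0 &\Rightarrow p_i = e^{-(1+\lambda)} \quad \forall i \\ &\Rightarrow p_i = \frac{1}{n}, \end{align}$$
since the constraint $\sum_i p_i = 1$ pins down the common value.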

Best Answer

One approach:

  • Say your random variable $X$ takes values in $\{1,\cdots,n\}$, each with positive probability.
  • Then $H(X) = \mathbb{E}\left[\log\frac{1}{p(X)}\right] \le \log \mathbb{E} \left[ \frac{1}{p(X)} \right] = \log n$, by Jensen's inequality applied to the concave function $f(x)=\log x$ (the last equality is spelled out after this list).
  • Equality holds exactly when $p(X)$ is constant, i.e. when $X$ is uniformly distributed.
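To spell out the last equality in the second bullet (using the assumption that $p(x) > 0$ for every $x \in \{1,\cdots,n\}$):
$$\mathbb{E}\left[\frac{1}{p(X)}\right] = \sum_{x=1}^{n} p(x)\cdot\frac{1}{p(x)} = n,$$
a sum with one term per point of the support.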