[Math] Geometric distribution achieves maximum entropy for given mean

entropy, information theory, random variables

Let $X$ be a random variable with geometric distribution, i.e. $P(X=k)=p(1-p)^k$ for $k=0,1,2,\dots$. If I calculated it correctly, $X$ has mean $E(X)=\frac{1-p}{p}$ and entropy $H(X)=-\log p-\frac{1-p}{p}\log(1-p)$ (edited because I want to start at $0$). Now I'd like to show that there is no non-negative integer valued random variable $Y$ with $E(Y)=E(X)$ and $H(Y)>H(X)$.
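For a quick numerical sanity check of these two closed forms, one can truncate the infinite sums; a minimal sketch (the parameter $p=0.3$ and the cutoff `N` below are arbitrary choices, and all logarithms are natural):

```python
import math

p = 0.3    # arbitrary example parameter
N = 500    # truncation point for the infinite sums (assumption: the tail beyond N is negligible)

pk = [p * (1 - p) ** k for k in range(N)]   # P(X = k) = p(1-p)^k, k = 0, 1, 2, ...

mean_numeric = sum(k * w for k, w in enumerate(pk))
entropy_numeric = -sum(w * math.log(w) for w in pk if w > 0)

print(mean_numeric, (1 - p) / p)                                      # both ~ 2.3333
print(entropy_numeric, -math.log(p) - (1 - p) / p * math.log(1 - p))  # both ~ 2.0363
```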

There is something on Wikipedia, but I can't quite follow it. Is there an elementary way to show that a maximum entropy distribution must be of the form $Cr^k$? How does the maximality of the geometric distribution follow?

/edit: Just realised it's not geometric but almost: $k$ instead of $k-1$ in the exponent.

Best Answer

Take your geometric random variable distributed as $p_k=\mathbb P (X=k)=p(1-p)^{k}$ for $k\in\{0,1,2,\dots\}$. Note that your entropy is: $$ H(X)=-\log p-\frac{1-p}{p}\log(1-p). $$

And consider another random variable $Y$ distributed as $q_k=\mathbb P (Y=k)$ such that: $$ \frac{1-p}{p}=\mathbb E(X)=\mathbb E(Y)=\sum_{i=0}^\infty i q_i. $$ We use the following lemma, which is nothing but the non-negativity of the KL divergence:

Lemma For the random variables $X$ and $Y$ distributed as $p_k\neq 0$ and $q_k$ on $k\in\{0,1,2,\dots\}$, we have $$ \mathbb D(Y|X)=\sum_{i=0}^\infty q_i\log \frac{q_i}{p_i}\geq 0. $$
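The lemma itself requires no condition on the mean of $Y$. As a small numerical illustration (not a proof), take the geometric weights above and, say, a uniform $Y$ on $\{0,\dots,9\}$; the choice of $Y$ here is arbitrary:

```python
import math

p = 0.3
pk = [p * (1 - p) ** k for k in range(10)]   # geometric weights p(1-p)^k on the support of q
qk = [0.1] * 10                              # uniform Y on {0, ..., 9}; any distribution works here

# D(Y|X) = sum_i q_i log(q_i / p_i); only indices with q_i > 0 contribute, so this sum is exact
kl = sum(q * math.log(q / pi) for q, pi in zip(qk, pk))
print(kl)   # positive (~0.51), consistent with the lemma
```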

The proof of the lemma is really simple, so it is omitted; it follows, for instance, from $\log t\le t-1$, which gives $\sum_i q_i\log\frac{p_i}{q_i}\le\sum_i (p_i-q_i)\le 0$. Now use the lemma to see that:

$$
\begin{aligned}
H(Y)=-\sum_i q_i\log q_i&\leq -\sum_i q_i\log p_i
=-\sum_i q_i\bigl(\log p+i\log(1-p)\bigr)\\
&=-\log p-\log(1-p)\,\mathbb E(Y)
=-\log p-\frac{1-p}{p}\log(1-p)=H(X).
\end{aligned}
$$
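As a numerical sanity check of this chain (a sketch, not part of the proof), one can take any competitor $Y$ with the same mean, for example a Poisson with $\lambda=\frac{1-p}{p}$ (an arbitrary choice, as are $p=0.3$ and the cutoff `N`), and verify both that $H(Y)\le H(X)$ and that the gap $H(X)-H(Y)$ equals $\mathbb D(Y|X)$, which is exactly what the computation above says when $\mathbb E(Y)=\mathbb E(X)$:

```python
import math

p = 0.3
lam = (1 - p) / p   # common mean E(X) = E(Y)
N = 400             # truncation point (assumption: the tail mass beyond N is negligible)

# X geometric starting at 0; Y Poisson with the same mean, as one possible competitor
pk = [p * (1 - p) ** k for k in range(N)]
qk = [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(N)]

def entropy(weights):
    # Shannon entropy in nats, skipping zero-probability terms
    return -sum(w * math.log(w) for w in weights if w > 0)

H_X = entropy(pk)
H_Y = entropy(qk)
D_YX = sum(q * math.log(q / pi) for q, pi in zip(qk, pk) if q > 0)

print(H_X, H_Y)          # H(Y) < H(X)
print(H_X - H_Y, D_YX)   # these agree up to truncation error: the entropy gap is D(Y|X) when the means match
```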
