Is this proof that relative entropy is never negative correct?

entropy, probability, solution-verification, statistical-inference, statistics

I wish to prove that relative entropy (Kullback–Leibler divergence) is always non-negative,
I.e. that $$I^{KL}(F;G)=E_F\left[\log\frac{f(X)}{g(X)}\right]\geq0$$
where $F,G$ are two different probability distributions with density (or mass) functions $f,g$.

(There is a much shorter proof for the case where $F,G$ are continuous, but I'm curious to see whether the following version is correct, since it would capture both the continuous and the discrete case.)

The proof will make use of:

1. Jensen's inequality: $$E(h(X))\geq h(E(X))$$ for a convex function $h(x)$ (see the numerical check after this list).

2. The fact that entropy $E_F[\log f(X)]$ is always positive.
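
As a quick numerical sanity check of Jensen's inequality for the convex function $h(x)=-\log x$ (an illustration only, not part of the proof; the Uniform(0.1, 1) sample is an arbitrary choice):

```python
# Sanity check of Jensen's inequality E[h(X)] >= h(E[X]) for the convex
# function h(x) = -log(x). Illustration only; the Uniform(0.1, 1) sample
# is an arbitrary choice.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 1.0, size=100_000)

lhs = np.mean(-np.log(x))    # Monte Carlo estimate of E[-log X]
rhs = -np.log(np.mean(x))    # -log(E[X])
print(lhs, rhs, lhs >= rhs)  # expect lhs >= rhs
```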

Proof:

$I^{KL}(F;G)=E_F\left[\log\frac{f(X)}{g(X)}\right]$
$=E_F[\log f(X)]-E_F[\log g(X)]$

$\log(x)$ is concave, therefore $h(x)=-\log(x)$ is convex, as required.

$-E_F[\log g(X)]=E_F[-\log g(X)]$ (by linearity of expectation)

$E_F[-\log g(X)]\geq -\log E_F[g(X)]$ by Jensen's inequality.

Now:
$g$ is a probability density (or mass) function for the random variable $X$, thus $0\leq g(x)\leq 1$ for all possible values $x$ of $X$.
$\implies 0\leq g(X)\leq 1$

$\implies 0\leq E_F[g(X)] \leq 1$

$\implies \log E_F[g(X)] \leq 0$

$\implies -\log E_F[g(X)] \geq 0$

$\implies E_F[-\log g(X)]\geq -\log E_F[g(X)] \geq 0$

Therefore $-E_F[\log g(X)]\geq 0$, and thus $I^{KL}(F;G) \geq 0$.

Best Answer

There are several mistakes in what you have done. It is by no means true that density functions $g$ satisfy the inequalities $0 \leq g(x) \leq 1$. For example, $2I_{(0,\frac{1}{2})}$ is a density function. Secondly, I don't see how you can finish the proof: you have ended up with a possibly negative number, $E_F[\log f(X)]$, plus a non-negative number, and there is no reason why such a sum should be non-negative.
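
To make the first point concrete, here is a rough numerical sketch (the grid resolution is an arbitrary choice) showing that $2I_{(0,\frac{1}{2})}$ integrates to 1 while taking the value $2>1$:

```python
# The function f(x) = 2 * 1_{(0, 1/2)}(x) integrates to 1 yet takes the
# value 2 > 1, so the claim 0 <= g(x) <= 1 fails for general densities.
# Rough Riemann-sum check; the grid resolution is an arbitrary choice.
import numpy as np

x = np.linspace(0.0, 1.0, 1_000_001)
f = np.where((x > 0.0) & (x < 0.5), 2.0, 0.0)

dx = x[1] - x[0]
print((f * dx).sum())  # ~1.0: f is a genuine density
print(f.max())         # 2.0: a density can exceed 1
```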

Here is a proof:
$$E_F\left[\log \frac{f(X)}{g(X)}\right]=E_F\left[-\log \frac{g(X)}{f(X)}\right]\geq -\log E_F\left[\frac{g(X)}{f(X)}\right]$$
by Jensen's inequality applied to the convex function $-\log$. Now
$$E_F\left[\frac{g(X)}{f(X)}\right]=\int \frac{g(x)}{f(x)}\,f(x)\,dx=\int g(x)\,dx=1$$
(with the integral replaced by a sum in the discrete case), and $\log 1=0$.
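
As a numerical sanity check of this argument in the discrete case, here is a minimal sketch; the two example mass functions and the helper `kl_divergence` are arbitrary illustrative choices, not part of the proof:

```python
# Discrete-case sanity check of the corrected argument. The mass
# functions f and g and the helper kl_divergence are arbitrary
# illustrative choices, not part of the proof.
import numpy as np

f = np.array([0.1, 0.4, 0.3, 0.2])      # pmf of F
g = np.array([0.25, 0.25, 0.25, 0.25])  # pmf of G

def kl_divergence(f, g):
    """I^KL(F;G) = sum_x f(x) * log(f(x) / g(x))."""
    return float(np.sum(f * np.log(f / g)))

print(kl_divergence(f, g))         # non-negative, as the proof shows
print(float(np.sum(f * (g / f))))  # E_F[g(X)/f(X)] = sum_x g(x) = 1
```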