[Math] An Inequality of KL Divergence

Tags: it.information-theory, pr.probability, st.statistics

Given two probability distributions $P$ and $Q$ defined over a finite set $\mathcal{X}$, one can define the KL divergence between $P$ and $Q$ as
$$D(P||Q):=\sum_{x\in \mathcal{X}}P(x)\log\frac{P(x)}{Q(x)}.$$
It is well known that the KL divergence is not a distance: it is not symmetric and does not satisfy the triangle inequality.
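
In code, the definition is straightforward; here is a minimal Python sketch (my own helper, not part of the question, for distributions given as arrays over a common finite support):

```python
import numpy as np

def kl_divergence(P, Q):
    """D(P||Q) = sum_x P(x) log(P(x)/Q(x)) over a common finite support."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    mask = P > 0  # terms with P(x) = 0 contribute 0 by the usual convention
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))
```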

Let $\lambda\in (0, \frac{1}{2})$ and let $R_{\lambda}:=\lambda P+(1-\lambda)Q$.
Two distributions $P$ and $Q$ defined over $\mathcal{X}=\{\pm k,\pm(k-1), \dots, \pm 2, \pm 1\}$ are said to be symmetric to each other if $P(x)=Q(-x)$ for all $x\in \mathcal{X}$.

I want to show that the following inequality always holds for symmetric distributions $P$ and $Q$:
$$\frac{D(P||R_{1-\lambda})}{D(P||R_{\lambda})}< \frac{\log(1-\lambda)}{\log(\lambda)}.$$

It is easy to see that $\lambda\mapsto D(P||R_{\lambda})$ is convex and tends to zero as $\lambda\to 1$, hence it is strictly decreasing on $[0, 1]$; similarly, $\lambda\mapsto D(P||R_{1-\lambda})$ is convex and strictly increasing. Therefore $\lambda\mapsto \frac{D(P||R_{1-\lambda})}{D(P||R_{\lambda})}$ is strictly increasing.
The following is a plot of the two sides of the above inequality when
$$Q=[0.06, 0.03, 0.05, 0.04, 0.02, 0.01, 0.04, 0.03, 0.02, 0.04, 0.04, 0.09, 0.03, 0.25, 0.15, 0.1]$$ and
$$P=[0.1, 0.15, 0.25, 0.03, 0.09, 0.04, 0.04, 0.02, 0.03, 0.04, 0.01, 0.02, 0.04, 0.05, 0.03, 0.06].$$
Red is the plot of the left-hand side and blue represents the right-hand side.
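For completeness, here is a minimal numpy sketch (my own, not the original computation, assuming the support is indexed as $-k,\dots,-1,1,\dots,k$ so that reversing an array corresponds to $x\mapsto -x$) that reproduces the two curves:

```python
import numpy as np

def kl(P, Q):
    # D(P||Q); all entries are strictly positive here
    return float(np.sum(P * np.log(P / Q)))

Q = np.array([0.06, 0.03, 0.05, 0.04, 0.02, 0.01, 0.04, 0.03,
              0.02, 0.04, 0.04, 0.09, 0.03, 0.25, 0.15, 0.1])
P = Q[::-1].copy()   # with this indexing, P(x) = Q(-x)

ok = True
for lam in np.linspace(0.01, 0.49, 200):
    lhs = kl(P, (1 - lam) * P + lam * Q) / kl(P, lam * P + (1 - lam) * Q)
    rhs = np.log(1 - lam) / np.log(lam)
    ok = ok and lhs < rhs

print(ok)   # True: the red curve stays below the blue curve on this grid
```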

Thanks to Iosif Pinelis, we now know that the above inequality does not hold for arbitrary $P$ and $Q$.

I would be very thankful for any reference related to this problem.

Best Answer

The inequality $$\frac{D(P||R_{1-\lambda})}{D(P||R_{\lambda})} < \frac{\log(1-\lambda)}{\log\lambda} \quad\text{for all $\lambda\in(0,1/2)$}$$ is not true in general. E.g., take $P=\frac1{100}\,(57,42,1)$, $Q=\frac1{100}\,(41, 12, 47)$, and $\lambda=\frac1{10}$. Then the left-hand side of this inequality and its right-hand side are $0.0537\dots$ and $0.0457\dots$, respectively.
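
For what it is worth, the counterexample is easy to check numerically; a small sketch (the decimals quoted above are from the answer, the code is mine):

```python
import numpy as np

def kl(P, Q):
    # D(P||Q) for strictly positive finite distributions
    return float(np.sum(P * np.log(P / Q)))

P = np.array([57, 42, 1]) / 100
Q = np.array([41, 12, 47]) / 100
lam = 1 / 10

lhs = kl(P, (1 - lam) * P + lam * Q) / kl(P, lam * P + (1 - lam) * Q)
rhs = np.log(1 - lam) / np.log(lam)
print(lhs, rhs)  # about 0.0537... and 0.0457..., so the inequality fails here
```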

However, it seems likely that this inequality holds if it is additionally assumed that $P(x)=Q(-x)$ for all $x$.


Addendum

The previous answer was given without assuming the "symmetry" condition.
Given now this condition, write

$$D(P||R_t)=\sum_{|x|=1}^k P(x)\ln\frac{P(x)}{t P(x)+(1-t)P(-x)}$$ $$=\sum_{x=1}^k \Big(P(x)\ln\frac{P(x)}{t P(x)+(1-t)P(-x)} +P(-x)\ln\frac{P(-x)}{t P(-x)+(1-t)P(x)}\Big) $$ $$=\sum_{x=1}^k P(x)F(t,r(x))\,\ln\frac1t $$ for $t\in(0,1)$, where $r(x):=P(-x)/P(x)$ and $$F(t,u):=\frac1{\ln(1/t)}\Big(\ln\frac1{t +(1-t)u} +u\ln\frac1{t +(1-t)/u}\Big). $$ Here we are assuming that $P(x)>0$ for all $x$; otherwise, extend by continuity.
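As a sanity check on this reduction, one can compare the two expressions numerically for a randomly generated symmetric pair (a sketch of mine, again assuming the support is indexed as $-k,\dots,-1,1,\dots,k$):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5
# index order: x = -k, ..., -1, 1, ..., k, so reversing the array realizes x -> -x
P = rng.random(2 * k)
P /= P.sum()
Q = P[::-1]                                   # Q(x) = P(-x)

def F(t, u):
    return (np.log(1.0 / (t + (1 - t) * u))
            + u * np.log(1.0 / (t + (1 - t) / u))) / np.log(1.0 / t)

t = 0.3
R_t = t * P + (1 - t) * Q
direct = np.sum(P * np.log(P / R_t))          # D(P||R_t) computed directly

pos = np.arange(k, 2 * k)                     # entries with x = 1, ..., k
r = P[2 * k - 1 - pos] / P[pos]               # r(x) = P(-x)/P(x)
reduced = np.sum(P[pos] * F(t, r)) * np.log(1.0 / t)

print(np.isclose(direct, reduced))            # True
```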

So, the problem is reduced to showing that $$(*)\qquad \Delta(t,u):=F(t,u)-F(1-t,u)>0$$ for $t\in(0,1/2)$ and $u\in(0,1)\cup(1,\infty)$ (clearly, $\Delta(t,1)=0$). Judging by plots, inequality $(*)$ appears quite likely, and it should not be very hard to prove it by calculus, as the problem now involves only two real variables, $t$ and $u$.

Here the graphs of $\Delta(t,u)$ and $\text{sgn}\,\Delta(t,u)$ are shown for $t\in(0,1/2)$ and $0<u<5$. [Plots omitted.]
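
Since the plots are not reproduced here, a crude grid check of $(*)$ over the same range (a numerical sketch, not a proof) can stand in for them:

```python
import numpy as np

def F(t, u):
    return (np.log(1.0 / (t + (1 - t) * u))
            + u * np.log(1.0 / (t + (1 - t) / u))) / np.log(1.0 / t)

def Delta(t, u):
    return F(t, u) - F(1 - t, u)

t = np.linspace(0.01, 0.49, 200)[:, None]
u = np.concatenate([np.linspace(0.05, 0.99, 100),
                    np.linspace(1.01, 5.0, 100)])[None, :]
print(np.all(Delta(t, u) > 0))   # True on this grid, consistent with (*)
```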

Addendum 2

It is indeed not very hard to show that $(*)$ is true. Note that $\Delta(t,1/u)=\Delta(t,u)/u$ for $u>0$. So, without loss of generality $u>1$. Since $\ln(1-t)\ln t>0$ for $t\in(0,1/2)$, it suffices to show that $$h(u):=\Delta(t,u)\ln(1-t)\ln t $$ is positive for $t\in(0,1/2)$ and $u>1$. Let $$h_2(u):=h''(u)\frac{(1 -t + t u)^2 (t + (1- t) u)^2 u}{1 + u} $$ $$= t^2 [(1- t)^2 -(1 - 4 t + 2 t^2) u + (1- t)^2 u^2] \ln t -(1- t)^2 (t^2 (u-1)^2 + u) \ln(1 - t) . $$ So, $h_2$ is a quadratic polynomial in $u$, with the coefficient of $u^2$ equal to
$-(1- t)^2 t^2 \ln(-1+1/t)<0$ for $t\in(0,1/2)$. So, $h_2$ is concave and $h_2(\infty-)=-\infty$.

Next, $h_2(1)=g(t):=t^2 \ln t-(1- t)^2 \ln(1-t)$, $g(0+)=g(1/2)=0$ and $g''(t)=-2\ln(-1+1/t)<0$, so that $g$ is concave and hence $h_2(1)>0$ for $t\in(0,1/2)$. Because $h_2$ is concave and $h_2(\infty-)=-\infty$, it follows that $h_2(u)$, and hence $h''(u)$, switches sign exactly once, from $+$ to $-$, as $u$ increases from $1$ to $\infty$. So, for some real $c=c(t)>1$, the function $h$ is convex on $[1,c]$ and concave on $[c,\infty)$. Moreover, $h(1)=h'(1)=0$ and $h(\infty-)=\infty$. So, $h>0$ on $(1,\infty)$. That is, indeed $\Delta(t,u)>0$ for $t\in(0,1/2)$ and $u>1$. $\blacksquare$
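
For the record, the displayed formula for $h_2$ can be checked symbolically, e.g. with sympy (a sketch of mine, not part of the answer; $F$ is written in the equivalent form $\big(\ln(t+(1-t)u)+u\ln(t+(1-t)/u)\big)/\ln t$, using $\ln(1/x)=-\ln x$):

```python
import sympy as sp

t, u = sp.symbols('t u', positive=True)

def F(s, v):
    # F(s, v) rewritten without 1/x inside the logs
    return (sp.log(s + (1 - s) * v) + v * sp.log(s + (1 - s) / v)) / sp.log(s)

Delta = F(t, u) - F(1 - t, u)
h = Delta * sp.log(1 - t) * sp.log(t)

# h_2(u) = h''(u) * (1-t+tu)^2 * (t+(1-t)u)^2 * u / (1+u)
h2 = sp.diff(h, u, 2) * (1 - t + t * u)**2 * (t + (1 - t) * u)**2 * u / (1 + u)

h2_claimed = (t**2 * ((1 - t)**2 - (1 - 4*t + 2*t**2) * u + (1 - t)**2 * u**2) * sp.log(t)
              - (1 - t)**2 * (t**2 * (u - 1)**2 + u) * sp.log(1 - t))

print(sp.simplify(h2 - h2_claimed))   # expected: 0
```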
