An inequality related to the Renyi divergence

entropyinformation theoryprobability distributions

Can you prove the following?

Conjecture. Let $\lambda > 1$. Let $p_i$, $q_i$, $\mu_i$, $\nu_i$ be probability densities over $\mathbb R$ for $i = 1, …, n$, such that for all $i = 1, …, n$, (all integrations are over $\mathbb R$)

$$\int {p_i(x)^\lambda \over q_i(x)^{\lambda – 1}} dx \le \int {\mu_i(x)^\lambda \over \nu_i(x)^{\lambda – 1}} dx$$

Then

$$\int {(\sum_i p_i(x))^\lambda \over (\sum_i q_i(x))^{\lambda – 1}} dx \le \int {(\sum_i \mu_i(x))^\lambda \over (\sum_i \nu_i(x))^{\lambda – 1}} dx $$

[End of Conjecture]

The Conjecture is equivalent to its special case when $n = 2$ (by an induction argument).

Where does this conjecture come from? Well, let $p$ and $q$ be two probability densities, then the Renyi divergence $D_\lambda(p || q)$ is defined by

$$D_\lambda(p || q) = {1 \over \lambda – 1} \log \int {p(x)^\lambda \over q(x)^{\lambda – 1}} dx.$$

Like the KL divergence ($\lambda = 1$), it measures the difference between the two densities. And the Conjecture basically says: If each $p_i$ and $q_i$ are closer than each $\mu_i$ and $\nu_i$, then so are their averages:

$$D_\lambda(p_i || q_i) \le D_\lambda(\mu_i || \nu_i)$$

$$\Rightarrow$$

$$D_\lambda(n^{-1} \sum_i p_i || n^{-1} \sum_i q_i) \le D_\lambda(n^{-1} \sum_i \mu_i || n^{-1} \sum_i \nu_i).$$

Alternatively, can you prove the Conjecture specialised to Gaussian distributions:

$$p_i \sim N(\alpha_i, \sigma^2);\qquad q_i \sim N(\beta_i, \sigma^2);\qquad \mu_i \sim N(\delta_i, \sigma^2);\qquad \nu_i \sim N(\gamma_i, \sigma^2)$$

where $|\alpha_i – \beta_i| \le |\delta_i – \gamma_i|$.

Note that

$$D_\lambda(N(\alpha, \sigma^2) || N(\beta, \sigma^2)) = {\lambda (\alpha – \beta)^2 \over 2 \sigma^2}.$$

Best Answer

The conjecture is false. Here is a simple counter-example with Gaussians and $n=2$: \begin{align}p_1 &\sim N(-1,1), \quad q_1 \sim N(1,1), \quad u_1 \sim N(-2,1), \quad v_1 \sim N(2,1)\\ p_2& \sim N(-1,1), \quad q_2 \sim N(1,1), \quad u_2 \sim N(2,1), \quad v_2 \sim N(-2,1). \end{align}

Then, $$D_\lambda(p_1\Vert q_1) = D_\lambda(p_2\Vert q_2) = 2\lambda <8 \lambda = D_\lambda(u_1\Vert v_1) = D_\lambda(u_2\Vert v_2) $$

However, $p_1 + p_2 \ne q_1 + q_2$ while $u_1 + u_2 = v_1 + v_2$, so $$D_\lambda((u_1+u_2)/2\Vert (v_1 + v_2)/2) = 0 < D_\lambda((p_1+p_2)/2\Vert (q_1 + q_2)/2).$$

Related Question