[Math] Positivity of Renyi Mutual Information

information theory

The differential Renyi entropy of a probability density $p(x)$ is given by $H_q(P(X))=\frac{1}{1-q}\log\int p^q(x)dx$. In the limit $q\to 1$, it reduces to the usual Shannon entropy. We can write down the mutual information between two variables $X$ and $Y$ simply as $I(X;Y)=H_q(P(X))+H_q(P(Y))-H_q(P(X,Y))$. Is this always a non-negative quantity? As before, in the case $q=1$ it is very easy to show, but what about in general?
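
(Not part of the question itself, but here is a minimal numerical sketch of this quantity in the discrete analogue, with sums in place of integrals; the helper `renyi_entropy` and the toy joint distribution are my own choices.)

```python
import numpy as np

def renyi_entropy(p, q):
    """Discrete Renyi entropy H_q(p) = log(sum_i p_i^q) / (1 - q), natural log."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                          # drop zero-probability cells
    if np.isclose(q, 1.0):                # q -> 1 limit: Shannon entropy
        return -np.sum(p * np.log(p))
    return np.log(np.sum(p ** q)) / (1.0 - q)

# toy joint distribution of (X, Y) on a 2x2 alphabet
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)   # marginals

for q in [0.5, 1.0, 2.0]:
    candidate = (renyi_entropy(p_x, q) + renyi_entropy(p_y, q)
                 - renyi_entropy(p_xy.ravel(), q))
    print(f"q = {q}: H_q(X) + H_q(Y) - H_q(X,Y) = {candidate:.4f}")
```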

Best Answer

EDIT. I justify the positivity of the Renyi mutual information using its interpretation as Renyi divergence. I follow the expositions in

T. M. Cover, J. A. Thomas, "Elements of Information Theory" (Chapter 2)

and

D. Xu, D. Erdogmus, "Renyi's Entropy, Divergence and Their Nonparametric Estimators"

  • Shannon entropy and mutual information

In the setting of "classical" information theory the mutual information $I(X,Y)$ of the random variables $X$ and $Y$ is defined as

$$I(X,Y):=D_{KL}(p_{XY}||p_Xp_Y),$$

where $D_{KL}(p_{XY}||p_Xp_Y)$ denotes the Kullback-Leibler divergence (KL divergence) between the joint distribution $p_{XY}$ and the product $p_Xp_Y$ of the marginal distributions of $X$ and $Y$.

Applying Jensen's inequality to the KL divergence, it follows that $I(X,Y)$ is always non-negative. I refer to the first reference for the computation in the discrete case.
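
For completeness, in the continuous case the Jensen step can be written in one line (assuming $p_Xp_Y>0$ wherever $p_{XY}>0$):

$$-D_{KL}(p_{XY}||p_Xp_Y)=\int p_{XY}\log\frac{p_Xp_Y}{p_{XY}}\,dxdy\leq\log\int p_{XY}\,\frac{p_Xp_Y}{p_{XY}}\,dxdy\leq\log 1=0,$$

where the first inequality is Jensen's inequality for the concave logarithm and the second holds because the integral of $p_Xp_Y$ over the support of $p_{XY}$ is at most $1$.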

Introducing the Shannon entropies $H(X)$, $H(Y)$ of $X$ resp. $Y$, the joint entropy $H(X,Y)$ and the conditional entropy $H(X|Y)$, we arrive at the equivalent formulations

$$I(X,Y)=H(X)+H(Y)-H(X,Y)=H(X)-H(X|Y).$$
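
The first identity follows by splitting the logarithm inside the KL divergence,

$$I(X,Y)=\int p_{XY}\left(\log p_{XY}-\log p_X-\log p_Y\right)dxdy=-H(X,Y)+H(X)+H(Y),$$

using $\int p_{XY}\log p_X\,dxdy=\int p_X\log p_X\,dx$; it is precisely this splitting of the logarithm that has no obvious counterpart in the Renyi case discussed below.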

  • Renyi Entropy and mutual information

Let us now consider the Renyi $\alpha$-setting. With

$$H_{\alpha}(X)=\frac{1}{1-\alpha}\log\int p^{\alpha}_X(x)dx$$

we denote the Renyi entropy of the r.v. $X$. The Renyi divergence of the distribution $f(x)$ from the distribution $g(x)$ is

$$D_{\alpha}(f||g):=\frac{1}{\alpha-1}\log\int f(x)\left(\frac{f(x)}{g(x)}\right)^{\alpha-1}dx.$$

It can be proved (please see the second reference, p. 81) that

$$D_{\alpha}(f||g)\geq 0 \quad \forall~f,g~\text{and}~\alpha>0,\qquad(*)$$ $$\lim_{\alpha\rightarrow 1}D_{\alpha}(f||g)=D_{1}(f||g)=D_{KL}(f||g).\qquad(**)$$
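
As a purely numerical illustration of $(*)$ and $(**)$ for discrete distributions (sums in place of integrals, not a proof), here is a small sketch; the helper `renyi_divergence` and the random test distributions are my own choices.

```python
import numpy as np

def renyi_divergence(f, g, alpha):
    """Discrete Renyi divergence D_alpha(f||g) = log(sum_i f_i (f_i/g_i)^(alpha-1)) / (alpha-1)."""
    f, g = np.asarray(f, dtype=float), np.asarray(g, dtype=float)
    mask = f > 0
    f, g = f[mask], g[mask]
    if np.isclose(alpha, 1.0):            # alpha -> 1 limit: KL divergence
        return np.sum(f * np.log(f / g))
    return np.log(np.sum(f * (f / g) ** (alpha - 1.0))) / (alpha - 1.0)

rng = np.random.default_rng(0)
f = rng.dirichlet(np.ones(5))             # two random distributions on 5 points
g = rng.dirichlet(np.ones(5))

for alpha in [0.5, 0.99, 1.0, 1.01, 2.0]:
    print(f"alpha = {alpha}: D_alpha(f||g) = {renyi_divergence(f, g, alpha):.4f}")
# (*): the printed values stay >= 0; (**): the values near alpha = 1 approach D_KL(f||g)
```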

The Renyi mutual information $I_{\alpha}(X,Y)$ is defined naturally as the Renyi divergence between the joint distribution $p_{XY}$ of $X$ and $Y$ and the product of the marginal distributions $p_X$, $p_Y$, i.e.

$$I_{\alpha}(X,Y):=D_{\alpha}(p_{XY}||p_Xp_Y).$$

This is a definition; you can find it, for example, on p. 83 of the second reference. It is justified by the overall $\alpha$-setting and the limit

$$\lim_{\alpha\rightarrow 1}I_{\alpha}(X,Y)=I(X,Y),$$

which follows from property $(**)$ of the Renyi divergence. This limit is parallel to the fundamental $\lim_{\alpha\rightarrow 1}H_{\alpha}(X)=H(X)$.
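
Continuing the numerical sketches above (this snippet reuses the `renyi_divergence` helper and the toy joint `p_xy`, with marginals `p_x`, `p_y`, from the earlier snippets), the divergence-based $I_{\alpha}(X,Y)$ can be inspected directly:

```python
# continues the earlier sketches: renyi_divergence, p_xy, p_x, p_y as defined above
p_joint = p_xy.ravel()
p_prod = np.outer(p_x, p_y).ravel()       # product distribution p_X * p_Y

for alpha in [0.5, 0.99, 1.0, 1.01, 2.0]:
    print(f"alpha = {alpha}: I_alpha(X,Y) = {renyi_divergence(p_joint, p_prod, alpha):.4f}")
# by (*) the printed values are non-negative; near alpha = 1 they approach the Shannon I(X,Y), cf. (**)
```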

From property $(*)$ one immediately derives the non-negativity of the Renyi mutual information.

For these reasons, I would prove the non-negativity of the Renyi mutual information through the above definition. At the present stage I have not been able to prove that

$$I_{\alpha}(X,Y)=H_{\alpha}(X)+H_{\alpha}(Y)-H_{\alpha}(X,Y),$$

or to find such a characterization in the literature. Even in the discrete case I got stuck because of the coefficient $\frac{1}{1-\alpha}$ in front of the entropies: the cases $0<\alpha<1$ and $\alpha>1$ must be treated separately, and a straightforward application of Jensen's inequality does not seem possible.
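
For whoever wants to experiment, the short snippet below (again reusing `renyi_entropy`, `renyi_divergence` and the toy joint from the sketches above) prints the divergence-based $I_{\alpha}$ next to the combination $H_{\alpha}(X)+H_{\alpha}(Y)-H_{\alpha}(X,Y)$, so one can check numerically whether the two expressions agree on concrete examples:

```python
# continues the earlier sketches: renyi_entropy, renyi_divergence, p_xy, p_x, p_y as above
for alpha in [0.5, 1.0, 2.0]:
    div_form = renyi_divergence(p_xy.ravel(), np.outer(p_x, p_y).ravel(), alpha)
    ent_form = (renyi_entropy(p_x, alpha) + renyi_entropy(p_y, alpha)
                - renyi_entropy(p_xy.ravel(), alpha))
    print(f"alpha = {alpha}: divergence form = {div_form:.4f}, entropy form = {ent_form:.4f}")
# at alpha = 1 both reduce to the Shannon I(X,Y); away from alpha = 1 there is
# no a priori reason for the two expressions to coincide
```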
