[Math] Is mutual information convex in the joint distribution

convex-analysisentropyinformation theory

Assume some joint distribution $P(X,Y) = P(Y|X)P(X)$.

It is well know that, for fixed $P(Y|X)$, mutual information is a concave function of $P(X)$ and, for fixed $P(X)$, a convex function of $P(Y|X)$ (e.g. see Theorem 2.7.4 in Cover & Thomas).

Is the mutual information function convex in the joint distribution $P(X,Y)$, given the marginals are not fixed? I have read conflicting statements. For example,

  • M Mihm, K Ozbek, "Decision Making with Rational Inattention" (pdf) says it is convex (see statement right after eq.3)

  • AG Stefani et al, "A Tight Lower Bound on the Mutual Information of a Binary and an Arbitrary Finite Random Variable in Dependence of the Variational Distance" (pdf) says it is not convex (first paragraph).

I suspect it is neither concave nor convex but am looking for a definitive statement.

Best Answer

It is neither convex nor concave. You can work it out using Bernoulli random variables.

Not convex:

1) Let $X$ and $Y$ be i.i.d. Bernoulli with $Pr[X=1]=1/2$. Then $I(X,Y)=0$.

2) Let $A=B=1$ (constants). Then $I(A,B)=0$.

3) Let $(W,Z) = \left\{\begin{array}{ll} (X,Y) & \mbox{with prob 1/2} \\ (A,B) & \mbox{with prob 1/2} \end{array}\right. $

The joint probability distribution for $(W,Z)$ is a convex combination of those for $(X,Y)$ and $(A,B)$. Yet, $I(W,Z)>0$.

Not concave:

1) Let $X$ and $Y$ be i.i.d. Bernoulli with $Pr[X=1]=1/2$. Then $I(X,Y)=0$.

2) Let $A=B$ where $A$ is Bernoulli with $Pr[A=1]=1/2$. Then $I(A,B)=1$.

3) Let $(W,Z) = \left\{\begin{array}{ll} (X,Y) & \mbox{with prob 1/2} \\ (A,B) & \mbox{with prob 1/2} \end{array}\right. $

Then $Pr[W=1]=Pr[Z=1]=1/2$ and: \begin{align} Pr[(W,Z)=(0,0)] &= 3/8\\ Pr[(W,Z)=(0,1)] &= 1/8\\ Pr[(W,Z)=(1,0)] &= 1/8\\ Pr[(W,Z)=(1,1)] &= 3/8 \end{align} and you can show that $I(W,Z) < 1/2 = (1/2)I(X,Y)+(1/2)I(A,B)$.