[Math] Is this a situation where triple mutual information is always non-negative?

it.information-theory, markov-chains, pr.probability

Suppose I have three identically-distributed homogeneous continuous-time discrete state space Markov chains $X_1(t), X_2(t), X_3(t)$, $t\geq 0$. They evolve independently but share a common random variable $X_0$ as an initial condition.
I let $$X_1=X_1(t_1), \ \ \ X_2=X_2(t_2), \ \ \ X_3=X_3(t_3)$$ for some times $t_1, t_2, t_3\geq 0$.

I want to show that
$$
I(X_1; X_2; X_3) \geq 0
$$
where $I$ is the multivariate mutual information (also known as the interaction information)
$$
I(A;B;C) = H(A,B,C) - H(A,B) - H(B,C) - H(A,C) + H(A) + H(B) + H(C)
$$
where $H$ is the usual Shannon entropy.
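
For concreteness, here is a minimal numerical sketch of this definition: it evaluates $I(A;B;C)$ directly from a joint pmf given as a dictionary mapping outcomes $(a,b,c)$ to probabilities. The helper names `entropy`, `marginal`, and `interaction_information` are illustrative, not from any particular library.

```python
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, axes):
    """Marginal pmf of the coordinates listed in `axes`, e.g. (0, 2)."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in axes)
        out[key] = out.get(key, 0.0) + p
    return out

def interaction_information(joint):
    """I(A;B;C) = H(A,B,C) - H(A,B) - H(B,C) - H(A,C) + H(A) + H(B) + H(C)."""
    H = lambda axes: entropy(marginal(joint, axes))
    return (H((0, 1, 2))
            - H((0, 1)) - H((1, 2)) - H((0, 2))
            + H((0,)) + H((1,)) + H((2,)))
```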

Background/Motivation

There are well-known situations where $I(A;B;C)<0$, a famous one being if $A$ and $B$ are independent random variables, each $\pm 1$ with probability $1/2$, and $C=AB$. But I conjecture that in the case I have described above $I(X_1;X_2;X_3)\geq 0$. I believe that the Markov chains being continuous-time and homogeneous is essential.
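
A worked check of that example, for concreteness: $H(A)=H(B)=H(C)=1$ bit, each pair of variables is uniform on four outcomes so the pairwise joint entropies are all $2$ bits, and any two of the variables determine the third, so $H(A,B,C)=2$ bits; the definition then gives $I(A;B;C)=2-2-2-2+1+1+1=-1$ bit. Feeding the corresponding joint pmf, uniform on $\{(1,1,1),\,(1,-1,-1),\,(-1,1,-1),\,(-1,-1,1)\}$, to the sketch above returns $-1.0$ as well.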

The more general motivation is that I want to find very general situations where multivariate mutual information is non-negative. (One well-known example is if $A,B,C$ form a Markov chain.)
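
For that Markov chain example, the non-negativity follows from a standard identity (spelled out here for completeness): the definition above is equivalent to
$$
I(A;B;C) = I(A;C) - I(A;C\mid B),
$$
and if $A \to B \to C$ is a Markov chain then $I(A;C\mid B)=0$, so $I(A;B;C)=I(A;C)\geq 0$.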

Best Answer

You are touching a treacherous place that coincidentally intersects with some of my own research. I have several responses.

(1) If at all possible, stay out of the synergy/redundancy waters. Instead, see if either of the two known non-negative generalizations of mutual information fits your needs. They are:

(a) "Total Correlation": http://en.wikipedia.org/wiki/Total_correlation

(b) "Dual Total Correlation": http://en.wikipedia.org/wiki/Dual_total_correlation

I personally think the Dual Total Correlation makes a lot more sense, but that's just my opinion.
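
For reference, in the three-variable case these two quantities are
$$
C(X_1,X_2,X_3) = H(X_1)+H(X_2)+H(X_3)-H(X_1,X_2,X_3),
$$
$$
D(X_1,X_2,X_3) = H(X_1,X_2,X_3) - \sum_{i=1}^{3} H\bigl(X_i \mid X_{\{1,2,3\}\setminus\{i\}}\bigr),
$$
and both are non-negative for every joint distribution. A small sketch of how they could be computed, reusing the illustrative `entropy` and `marginal` helpers from the sketch in the question above (assumed to be in scope):

```python
def total_correlation(joint):
    """C = H(X1) + H(X2) + H(X3) - H(X1,X2,X3); always >= 0."""
    H = lambda axes: entropy(marginal(joint, axes))
    return H((0,)) + H((1,)) + H((2,)) - H((0, 1, 2))

def dual_total_correlation(joint):
    """D = H(X1,X2,X3) - sum_i H(X_i | rest); always >= 0."""
    H = lambda axes: entropy(marginal(joint, axes))
    joint_H = H((0, 1, 2))
    # H(X_i | rest) = H(X1,X2,X3) - H(rest)
    conditionals = [joint_H - H(rest) for rest in [(1, 2), (0, 2), (0, 1)]]
    return joint_H - sum(conditionals)
```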

If you really want to go into the synergy/redundancy waters, here's the deal:

(2) The "triple mutual information" you refer to is actually the redundant information minus the synergistic information. Therefore the triple mutual information will be nonnegative anytime redundancy >= synergy. Here's a paper that describes this, http://arxiv.org/pdf/1004.2515.pdf . Note that the above paper doesn't properly define the "redundant information" among variables, but it does correctly show that the "triple mutual information" is redundancy minus synergy.

(3) There have been attempts since the above-cited paper to properly define the "redundant information". As of this writing it's an unsolved problem. Here's a place to start if you want to get into this: http://arxiv.org/abs/1310.1538
