Solved – Minimizing the mutual information

Tags: entropy, mutual information

Let's say I have two independent random variables $T$ and $D$ over finite integer sets $S_T$ and $S_D$, respectively.

Also, assume the probability function of $D$ is given to us and is defined only on non-negative integers, while the probability function of $T$ is ours to define.

So we can assume the ranges of $T$ and $D$ are $\{-m_T,\dots,m_T\}$ and $\{0,\dots,m_D\}$, respectively.

Now I define another "observed" random variable $O = T - D$, and I would like to minimize the mutual information $I(D;O)$ by choosing the probability function of $T$.

Is it possible to make the mutual information zero? If not, what distribution should $T$ have to achieve the minimum mutual information?

Best Answer

The mutual information of two random variables is zero when the joint entropy equals the sum of the individual entropies: $$ I(D;O) = h(D)+h(O) - h(D,O), $$ so $$\mbox{if}\quad h(D,O) = h(D)+h(O) \quad \mbox{then} \quad I(D;O) = 0.$$

Now, the joint entropy equals the sum of the individual entropies exactly when the random variables are independent.
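To make that identity concrete, here is a minimal numerical check in Python with NumPy (a sketch; the pmfs and names are illustrative, not part of the question): for independent variables the joint pmf is the outer product of the marginals, and the mutual information computed from the identity above comes out to zero.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a pmf given as an array."""
    p = p[p > 0]                      # 0 * log 0 = 0 by convention
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """I(X;Y) = h(X) + h(Y) - h(X,Y), from the joint pmf matrix."""
    h_x = entropy(joint.sum(axis=1))  # marginal of X (rows)
    h_y = entropy(joint.sum(axis=0))  # marginal of Y (columns)
    return h_x + h_y - entropy(joint.ravel())

# Independent X and Y: the joint pmf is the outer product of the
# marginals, so h(X,Y) = h(X) + h(Y) and the mutual information is 0.
p_X = np.array([0.2, 0.5, 0.3])
p_Y = np.array([0.6, 0.4])
print(mutual_information(np.outer(p_X, p_Y)))   # ~0.0 (up to float error)
```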

The trivial answer to your question, without more information about your probability mass functions (note: not density functions), is that if you set $T = D$, then the mutual information $I(D;O)$ is trivially zero, because $O$ is then fixed and has zero entropy. I don't know how useful you'll find this answer, but yes, technically it is possible to make $I(D;O)$ zero by fixing $T$ to the value of $D$.
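Continuing the sketch above, the degenerate choice $T = D$ makes $O = 0$ with probability 1, so the joint pmf of $(D, O)$ collapses to a single column and the mutual information vanishes (`p_D` below is an arbitrary illustrative pmf):

```python
# If T = D then O = T - D = 0 always, so the joint pmf of (D, O) has a
# single O-column carrying the pmf of D, and I(D;O) = 0 since h(O) = 0.
p_D = np.array([0.1, 0.2, 0.3, 0.4])   # illustrative pmf for D
joint_fixed = p_D.reshape(-1, 1)       # p(d, o=0) = p_D(d)
print(mutual_information(joint_fixed)) # 0.0
```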

Beyond that, I think the question needs some more context.

EDIT

Based on the context you added in the comment below your question, it sounds like $O = T - D$ is the observable, $D$ is a hidden variable, and $T$ is a shift you choose, and you're asking: can I define $T$ to improve my estimate of $D$?

Think of my answer like this: if you don't know anything about $D$, you aren't going to get very far. If you know everything about $D$, then you can fix $T = D$ and the mutual information between $O$ and $D$ is $0$. Is there some middle way to specify $T$ that minimizes $I(D;O) = I(D;T-D)$? You say that we hypothetically know the distribution of $D$, but you've only told us its support, and I think we would also need to know something about the relationship between $O$ and $D$ to answer this. Without more information, I can only suggest you start by simulating $D$ and then testing distributions of $T$ that minimize the mutual information, as in the sketch below.
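As a starting point for that experiment, here is a hypothetical random-search sketch reusing the helpers above: it builds the joint pmf of $(D, O)$ for independent $T$ and $D$ with $O = T - D$, then samples candidate pmfs for $T$ and keeps the one with the smallest $I(D;O)$. The range $m_T$ and the Dirichlet proposal are illustrative assumptions, not part of your question.

```python
def joint_DO(p_D, p_T, m_T):
    """Joint pmf of (D, O) with O = T - D, T on {-m_T,...,m_T} independent of D."""
    m_D = len(p_D) - 1
    joint = np.zeros((m_D + 1, 2 * m_T + m_D + 1))   # O in {-m_T-m_D,...,m_T}
    for d in range(m_D + 1):
        for t in range(-m_T, m_T + 1):
            joint[d, (t - d) + m_T + m_D] = p_D[d] * p_T[t + m_T]
    return joint

m_T = 4
rng = np.random.default_rng(0)
best_mi, best_p_T = np.inf, None
for _ in range(20_000):
    cand = rng.dirichlet(np.ones(2 * m_T + 1))       # random candidate pmf for T
    mi = mutual_information(joint_DO(p_D, cand, m_T))
    if mi < best_mi:
        best_mi, best_p_T = mi, cand
print(best_mi, best_p_T.round(3))
```

One thing the search should make visible: for a non-degenerate $D$, the conditional pmf of $O$ given $D = d$ is just $p_T$ shifted by $d$, so an independent $T$ with bounded support cannot make $O$ exactly independent of $D$, and the minimum will be a small positive number rather than zero.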
