Entropy of a variable after adding noise

entropy, information theory, probability

Suppose I have a random variable $X$ which is normally distributed $\mathcal{N}(0,1)$. The distribution has entropy $\frac{1}{2}(1 + \log(2\pi))$, about 1.42 nats or roughly 2 bits of information. So that means when I observe it, I "learn" about 2 bits of information on average (right?).
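As a quick sanity check on those figures, here is a minimal sketch assuming scipy is available (its `entropy()` method returns differential entropy in nats):

```python
# Differential entropy of a standard normal, to verify the 1.42 nats / ~2 bits figures.
import numpy as np
from scipy.stats import norm

h_nats = norm(loc=0, scale=1).entropy()   # 0.5 * (1 + log(2*pi)) ≈ 1.4189 nats
h_bits = h_nats / np.log(2)               # ≈ 2.05 bits

print(h_nats, h_bits)
```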

Suppose then that we have a noise variable $Y$ distributed as $\mathcal{N}(0, \sigma^2)$. I can't observe $X$ directly; I can only observe $X+Y$. But I am interested in $X$ and don't care about $Y$. How much information can I learn about $X$ after observing $X+Y$, in entropy terms? I'm struggling to even pose the problem formally: how do I quantify how much is learned about $X$ while ignoring $Y$?

I'm looking for an answer to this problem, and also hoping to refine my intuition for thinking about information theory problems.

Best Answer

The concept of mutual information seems to capture exactly what you are looking for:

[Mutual information] quantifies the "amount of information" (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable.

Specifically, you would be looking at $I(X; X+Y)$.
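For the Gaussian setup in your question, assuming $Y$ is independent of $X$ (which "noise" usually implies), this has a closed form:

$$
I(X; X+Y) = h(X+Y) - h(X+Y \mid X) = h(X+Y) - h(Y)
= \tfrac{1}{2}\ln\!\bigl(2\pi e\,(1+\sigma^2)\bigr) - \tfrac{1}{2}\ln\!\bigl(2\pi e\,\sigma^2\bigr)
= \tfrac{1}{2}\ln\!\left(1 + \frac{1}{\sigma^2}\right).
$$

(The second equality uses independence and the fact that differential entropy is translation invariant.) As $\sigma^2 \to \infty$ the observation tells you nothing about $X$; as $\sigma^2 \to 0$ the expression diverges, since a noiseless observation of a continuous variable carries infinite information.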


Here I am reading your question

How much information can I learn about $X$ after observing $X+Y$, in entropy terms

as "how much information does observing $X+Y$ gives me about $X$". If you meant "how much information remains to be learnt in $X$" (i.e., how much remaining entropy there is), then you may want to look at the conditional entropy instead (both are very much related).
