Can mutual entropy be higher than joint entropy

entropy, information theory

Let's assume I have three probability distributions $A$, $B$, $C$.

The entropy of each is 1.58 bits, and the joint entropy is also 1.58 bits.

Calculating the mutual entropy with the formula $I(A,B,C) = H(A)+H(B)+H(C)-J(A,B,C)$, where $J(A,B,C)$ is the joint entropy, results in 3.16.

How can this be explained? It would mean there is more common information than a channel would have the capacity to carry.
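
For reference, here is a minimal sketch of one scenario that reproduces these numbers (my reconstruction; the question does not give the actual distributions): take $A = B = C$, each uniform over three symbols, so the joint distribution has only three equally likely outcomes.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Assumed scenario: A = B = C, uniform over three symbols, so the joint
# distribution puts probability 1/3 on each of the outcomes (x, x, x).
H_A = H_B = H_C = entropy([1/3, 1/3, 1/3])   # ~1.585 bits each
H_ABC = entropy([1/3, 1/3, 1/3])             # ~1.585 bits: fully dependent

total_correlation = H_A + H_B + H_C - H_ABC  # ~3.17 bits
print(round(H_A, 2), round(H_ABC, 2), round(total_correlation, 2))
```

The sum of the three marginal entropies counts the completely shared information three times while the joint entropy is subtracted only once, which is how the result (about 3.17; the 3.16 above is just twice the rounded 1.58) ends up larger than the joint entropy.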

Best Answer

The mutual information $I(X;Y)$ (not "mutual entropy"; also, notice the semicolons and don't confuse them with commas) always relates two (possibly multivariate) random variables.
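
For that genuine two-variable quantity, the bound the question is implicitly expecting does hold; spelling out a standard chain of inequalities (not in the original answer):

$$I(X;Y) \,=\, H(X) + H(Y) - H(X,Y) \,\le\, \min\bigl(H(X),\, H(Y)\bigr) \,\le\, H(X,Y),$$

so the mutual information of two variables can never exceed their joint entropy.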

$I(A;B;C)$ is a highly dubious extension of the concept of the "real" mutual information; it is neither very meaningful nor much used. For one thing, it can even be negative.
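
To illustrate the negativity, here is a short sketch using the inclusion-exclusion ("interaction information") definition of $I(A;B;C)$ and the classic XOR construction; both are standard, but they are added here for illustration rather than taken from the answer:

```python
import numpy as np
from itertools import product

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A, B independent fair bits, C = A XOR B: every pair is independent,
# yet any one variable is determined by the other two.
joint = {(a, b, a ^ b): 0.25 for a, b in product([0, 1], repeat=2)}

def H(axes):
    """Entropy of the marginal of `joint` on the given coordinate axes."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in axes)
        marg[key] = marg.get(key, 0.0) + p
    return entropy(list(marg.values()))

# Inclusion-exclusion ("interaction information") form of I(A;B;C)
I_ABC = (H([0]) + H([1]) + H([2])
         - H([0, 1]) - H([0, 2]) - H([1, 2])
         + H([0, 1, 2]))
print(I_ABC)  # -1.0: the three-way quantity is negative here
```

Every pairwise mutual information in this example is zero, yet conditioning on the third variable makes the remaining two completely dependent, which is what pushes the three-way quantity below zero.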