Another extension of mutual information to multiple variables

entropy, information theory, probability, statistics

The mutual information can be expressed as
$$
I(X;Y) = H(X) + H(Y) - H(X, Y)
$$

And now I encounter the following expression, which seems to be an extension of mutual information:
$$
F(X_1;\cdots;X_N) = \sum_{i=1}^N H(X_i) - H(X_1, X_2, \cdots, X_N)
$$

However, I know the definition of multivariate mutual information, where
$$
I(X_1;\cdots;X_N) = -\sum _{T\subseteq \{X_1,\ldots ,X_N\}}(-1)^{|T|}H(T)
$$
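
For example, for $N = 3$ (and with $H(\varnothing) = 0$) this sum expands to
$$
I(X_1;X_2;X_3) = H(X_1) + H(X_2) + H(X_3) - H(X_1,X_2) - H(X_1,X_3) - H(X_2,X_3) + H(X_1,X_2,X_3).
$$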

I have two questions:

  1. Why is $I$ the commonly used extension of mutual information, rather than $F$?
    From my point of view, $F$ looks more like an "information gain" because it is non-negative, whereas $I$ can be either positive or negative, which leaves me confused about what it stands for.

  2. Is there any interpretation of $F$, or are there any studies of its properties?

I don't have much knowledge of information theory, and I would appreciate any help.

Best Answer

Both formulas correspond to existing, distinct concepts; they were conceived independently as generalizations of pairwise mutual information to more than $2$ random variables.

$F(X_1;\cdots;X_N) = \sum_{i=1}^N H(X_i) - H(X_1, X_2, \cdots, X_N)$ is multiinformation, or total correlation. It is non-negative and quantifies the redundancy or dependency among a set of random variables.
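
As an illustration, here is a minimal sketch of that definition (assuming NumPy; the function names and the example joint distribution are my own choices, not a standard library API). It computes the total correlation of a discrete joint distribution and recovers the one bit of redundancy between two perfectly correlated fair bits; the result is never negative.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def total_correlation(joint):
    """F(X_1; ...; X_N) = sum_i H(X_i) - H(X_1, ..., X_N) for a joint pmf array."""
    n = joint.ndim
    marginal_sum = sum(
        entropy(joint.sum(axis=tuple(j for j in range(n) if j != i)))
        for i in range(n)
    )
    return marginal_sum - entropy(joint.ravel())

# Hypothetical example: X1 and X2 are perfectly correlated fair bits,
# X3 is an independent fair bit.
joint = np.zeros((2, 2, 2))
joint[0, 0, :] = 0.25
joint[1, 1, :] = 0.25
print(total_correlation(joint))  # ~1.0 bit of redundancy; F is never negative
```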

$I(X_1;\cdots;X_N) = -\sum_{T\subseteq\{X_1,\ldots,X_N\}}(-1)^{|T|}H(T)$ is the multivariate mutual information (MMI), written here as an inclusion-exclusion decomposition over joint entropies. It was introduced following a proof of the possible negativity of mutual information for degrees higher than $2$. A negative MMI can be interpreted as synergistic entropy (as opposed to redundant entropy), but it is hard to interpret in some applications.
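
To see the negativity concretely, here is a minimal sketch along the same lines (again assuming NumPy; `interaction_information` and the XOR distribution are illustrative choices of mine). It evaluates the alternating sum over all non-empty subsets for $X_3 = X_1 \oplus X_2$ with $X_1, X_2$ independent fair bits, and returns $-1$ bit, i.e. purely synergistic information.

```python
import numpy as np
from itertools import combinations

def entropy(p):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def interaction_information(joint):
    """I(X_1; ...; X_N) = -sum over non-empty T of (-1)^|T| H(T), with H(empty set) = 0."""
    n = joint.ndim
    total = 0.0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            # Marginalise out every axis that is not in the subset T.
            axes_out = tuple(j for j in range(n) if j not in subset)
            marginal = joint.sum(axis=axes_out) if axes_out else joint
            total += -((-1) ** r) * entropy(np.ravel(marginal))
    return total

# XOR example: X1, X2 independent fair bits, X3 = X1 XOR X2.
joint = np.zeros((2, 2, 2))
for x1 in (0, 1):
    for x2 in (0, 1):
        joint[x1, x2, x1 ^ x2] = 0.25
print(interaction_information(joint))  # -1.0 bit: purely synergistic
```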