Solved – mutual information vs normalized mutual information

correlation, mutual information

I would like to know why some papers use Normalized Mutual Information instead of standard Mutual Information to measure the correlation between features. What is the difference between these two measures?

Best Answer

  • Mutual Information $I(X,Y)$ yields values from $0$ (no mutual information; the variables $X$ and $Y$ are independent) to $+\infty$. The higher $I(X,Y)$, the more information is shared between $X$ and $Y$. However, large values of mutual information can be unintuitive and hard to interpret because of the unbounded range $I(X,Y)\in [0,\infty)$.
  • Normalized Mutual Information measures rescale the values to a bounded range $I(X,Y)\in [0, m]$. The case $m=1$ is particularly useful because it is easy to compare with commonly used correlation coefficients (see the sketch after this list).
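
As a minimal sketch of the difference (an illustration only, not taken from the question or from any particular paper), scikit-learn's `mutual_info_score` and `normalized_mutual_info_score` estimators can be compared on two discrete features; the toy data below are made up for the example:

```python
# Toy comparison of raw vs. normalized mutual information for discrete features.
# The data are synthetic, purely for illustration.
import numpy as np
from sklearn.metrics import mutual_info_score, normalized_mutual_info_score

rng = np.random.default_rng(0)
x = rng.integers(0, 5, size=1000)            # discrete feature X with 5 levels
y = (x + rng.integers(0, 2, size=1000)) % 5  # Y: a noisy copy of X

print(mutual_info_score(x, y))             # raw MI in nats; its scale depends on the entropies of X and Y
print(normalized_mutual_info_score(x, y))  # rescaled to [0, 1]; 1 means perfect dependence
```

The raw MI value is only meaningful relative to the entropies of the two features, while the normalized value can be read much like a correlation coefficient.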

A nice discussion of the relation between Mutual Information and the Pearson Correlation Coefficient can be found in the Materials and Methods section of the paper "Generalized Correlation for Biomolecular Dynamics" by Lange and Grubmüller [1]. They also introduce a generalized correlation coefficient that maps the values of $I(X,Y)$ onto the $[0,1]$ interval, which can be seen as another approach to Normalized Mutual Information.
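
For intuition about such a mapping (a sketch, not the paper's exact definition): for two jointly Gaussian scalar variables with Pearson correlation $\rho$, mutual information has a closed form, and inverting it sends any $I(X,Y)\ge 0$ to a value in $[0,1)$; the generalized correlation coefficient of [1] builds on this relation, extended to multidimensional variables.

$$ I(X,Y) = -\tfrac{1}{2}\,\ln\!\bigl(1-\rho^{2}\bigr) \quad\Longrightarrow\quad \rho = \sqrt{1 - e^{-2\,I(X,Y)}} $$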

[1] O. F. Lange, H. Grubmüller, Proteins 2006, 62, 1053–1061.