I am a bit confused. Can someone explain how to calculate the mutual information between two terms, given a term-document matrix with binary term-occurrence weights?
$$
\begin{matrix}
& 'Why' & 'How' & 'When' & 'Where' \\
Document1 & 1 & 1 & 1 & 1 \\
Document2 & 1 & 0 & 1 & 0 \\
Document3 & 1 & 1 & 1 & 0
\end{matrix}
$$
$$I(X;Y)= \sum_{y \in Y} \sum_{x \in X} p(x,y) \log\left(\frac{p(x,y)}{p(x)p(y)} \right)$$
Thank you
Best Answer
How about forming a joint probability table by counting co-occurrences of the two terms across documents and normalizing by the number of documents? From that table you can obtain the joint entropy and the marginal entropies, and finally $$I(X;Y) = H(X)+H(Y)-H(X,Y). $$
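A minimal sketch of this recipe in Python/NumPy, using the matrix from the question (I am assuming the columns correspond to 'Why', 'How', 'When', 'Where' in that order; picking 'How' and 'Where' is just for illustration):

```python
import numpy as np

# Binary term-document matrix from the question (rows = documents,
# columns assumed to be the terms 'Why', 'How', 'When', 'Where').
M = np.array([
    [1, 1, 1, 1],  # Document1
    [1, 0, 1, 0],  # Document2
    [1, 1, 1, 0],  # Document3
])

def mutual_information(x, y):
    """MI (in bits) between two binary term columns, computed as
    I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    # 2x2 joint probability table: normalized co-occurrence counts
    joint = np.zeros((2, 2))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= len(x)

    def entropy(p):
        p = p[p > 0]                   # 0 * log 0 = 0 by convention
        return -np.sum(p * np.log2(p))

    h_x = entropy(joint.sum(axis=1))   # marginal entropy H(X)
    h_y = entropy(joint.sum(axis=0))   # marginal entropy H(Y)
    h_xy = entropy(joint.ravel())      # joint entropy H(X,Y)
    return h_x + h_y - h_xy

# MI between 'How' (column 1) and 'Where' (column 3)
print(round(mutual_information(M[:, 1], M[:, 3]), 4))  # → 0.2516
```

You can sanity-check the result against the double-sum definition from the question: the only nonzero joint cells here are $(1,1)$, $(1,0)$ and $(0,0)$, each with $p(x,y)=1/3$, and summing $p(x,y)\log_2\frac{p(x,y)}{p(x)p(y)}$ over them gives the same $\approx 0.2516$ bits.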