Solved – Explanation of I-map in a Markov/Bayesian network

Tags: bayesian network, conditional probability, graph theory, graphical-model, machine learning

I am finding the concept of an I-map (Independency-map) in the context of Markov networks and Bayesian networks difficult to understand. From Probabilistic Graphical Models, Koller and Friedman, 2009:

We first define the set of independencies associated with a
distribution $P$.

Let $P$ be a distribution over $X$. We define $I(P)$ to be the set of
independence assertions of the form $(X \bot Y|Z)$ that hold in $P$.

We can now rewrite the statement that "$P$ satisfies the local
independencies associated with $G$" simply as $I(G) \subseteq I(P)$.
In this case, we say that $G$ is an I-map for $P$.

Where I'm getting confused:

  • Does this represent every possible independence in a graph given every possible subset of the set of variables $Z$? Or are you able to define $I(P)$ because the graph structure is specified?

  • The whole last sentence is confusing to me. Maybe it's because I can't think abstractly enough, but I don't understand how the I-map of G is a subset of the I-map of P.

Any help is appreciated!

Best Answer

As I understand it, if a DAG $G$ is an I-map for a probability distribution $P$, then every independence that can be read off $G$ also holds in $P$. Let's consider a simple example:
Suppose distribution $P_1$ has exactly one independence, $\{(I\perp D)_p\}$, and distribution $P_2$ has no independencies at all, i.e. its set of independencies is $\emptyset$.
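To make the two distributions concrete, here is a minimal sketch that checks marginal independence of $I$ and $D$ directly from a joint probability table. The numbers in `p1` and `p2` and the helper `is_independent` are hypothetical, chosen only for illustration; they do not come from the book or the original answer.

```python
from itertools import product

def is_independent(joint, tol=1e-9):
    """Return True if P(i, d) == P(i) * P(d) for every pair of values.

    `joint` maps (i, d) tuples to probabilities (illustrative helper).
    """
    i_vals = sorted({i for i, _ in joint})
    d_vals = sorted({d for _, d in joint})
    # Marginals obtained by summing out the other variable.
    p_i = {i: sum(joint[(i, d)] for d in d_vals) for i in i_vals}
    p_d = {d: sum(joint[(i, d)] for i in i_vals) for d in d_vals}
    return all(
        abs(joint[(i, d)] - p_i[i] * p_d[d]) <= tol
        for i, d in product(i_vals, d_vals)
    )

# Hypothetical P_1: the joint factorizes with P(I=0) = 0.7 and P(D=0) = 0.6.
p1 = {(0, 0): 0.42, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.12}
# Hypothetical P_2: the joint does not factorize, so I and D are dependent.
p2 = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3}

print(is_independent(p1))
print(is_independent(p2))
```

With these tables, `p1` satisfies $(I \perp D)$ and `p2` does not, matching the roles of $P_1$ and $P_2$ in the example.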
Now we define two DAGs: $G$ and $G'$
[Figure: the DAGs $G$ and $G'$]
$G$ is an I-map for $P_1$ because the independence between $I$ and $D$ asserted by $G$ also holds in $P_1$. $G$ is not an I-map for $P_2$ because $P_2$ does not satisfy the independence between $I$ and $D$.
(Surprisingly?) $G'$ is an I-map for both $P_1$ and $P_2$ because the set of independencies in $G'$ is $\emptyset$. Since $\emptyset$ is a subset of every set, both $P_1$ and $P_2$ trivially satisfy all the independencies in $G'$.
Therefore, in plain words, a DAG being an I-map for a distribution means that the set of independencies shown in the DAG is a subset of the set of independencies that hold in the distribution, i.e. $I(G) \subseteq I(P)$.
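The subset condition can also be sketched with plain Python sets. This is only an illustration under simplifying assumptions: each independence assertion is written as a tuple `(X, Y, Z)` with `Z` a frozenset of conditioning variables, the independence sets of the two graphs and two distributions are entered by hand rather than derived, and symmetry (treating $(I \perp D)$ and $(D \perp I)$ as the same assertion) is ignored. The names `is_imap`, `I_G`, `I_G_prime`, `I_P1`, and `I_P2` are hypothetical.

```python
# One independence assertion: I is independent of D given the empty set.
I_INDEP_D = ("I", "D", frozenset())

# Independence sets taken from the example, entered by hand.
I_G = {I_INDEP_D}      # G asserts (I ⊥ D)
I_G_prime = set()      # G' asserts nothing
I_P1 = {I_INDEP_D}     # P_1 satisfies (I ⊥ D)
I_P2 = set()           # P_2 satisfies no independencies

def is_imap(graph_indeps, dist_indeps):
    """G is an I-map for P when I(G) is a subset of I(P)."""
    return graph_indeps <= dist_indeps

for g_name, g in [("G", I_G), ("G'", I_G_prime)]:
    for p_name, p in [("P_1", I_P1), ("P_2", I_P2)]:
        print(f"{g_name} is an I-map for {p_name}: {is_imap(g, p)}")
```

The only failing case is `I_G` against `I_P2`, since $G$ asserts an independence that $P_2$ does not have; the empty set `I_G_prime` passes against both distributions.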
Hope this helps to clear it up a bit for you.
