Causal Markov condition simple explanation

causality, conditional probability

I am trying to explain the Causal Markov condition, which is used to establish probabilistic causation, in simple words. The definition from Hausman and Woodward (1999) is the following:

Let G be a causal graph with vertex set V and P be a probability distribution
over the vertices in V generated by the causal structure represented by G. G
and P satisfy the Causal Markov Condition if and only if for every X in V, X
is independent of V\(Descendants(X) ∪ Parents(X)) given Parents(X)

My explanation is that the Causal Markov condition is satisfied if the variables in a causal relationship, with given probability distributions, are independent of all the other variables unless those are their parents or their descendants. This is slightly different from other definitions I have seen, so: first, is my explanation correct, and second, is it clear?
Any advice will be appreciated.

Best Answer

One way to think about the Causal Markov Condition (CMC) is as a rule for "screening off": once you know the values of $X$'s parents, all other variables in $V$ become irrelevant for predicting $X$, except for $X$'s descendants.
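In symbols, the same statement can be sketched as follows. I am assuming here that $G$ is a directed acyclic graph, and I write $\{X\}$ explicitly so that $X$ is not asked to be independent of itself (some texts get the same effect by counting $X$ among its own descendants):

$$X \perp\!\!\!\perp \big(V \setminus (\{X\} \cup \mathrm{Descendants}(X) \cup \mathrm{Parents}(X))\big) \mid \mathrm{Parents}(X) \quad \text{for every } X \in V$$

Under that assumption, a standard way to restate the condition is that $P$ factorizes along the graph, one factor per variable given its parents:

$$P(V) = \prod_{X \in V} P\big(X \mid \mathrm{Parents}(X)\big)$$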

I find that examples make the CMC easiest to understand. I did a quick Google image search for "mechanism of cardiovascular disease" so I can give you a medical example. Take this graph (let's call it $G$):

[Figure: mechanism of cardiovascular disease. Source: Nat Clin Pract Cardiovasc Med (2008), Nature Publishing Group]

Say you have a probability distribution $P$ over the variables in $G$. If the CMC holds in $P$ (relative to $G$), then you can infer that:

  • If I know the patient's amount of Oxidative stress and inflammation, then learning the patient's degree of Plaque progression won't give me any extra information about the patient's Platelets.
  • If I know the patient's amount of Atheroma, then learning the amount of Oxidative stress and inflammation won't help me predict Plaque progression.

However, the CMC allows the following possibilities:

  • If I know the patient's amount of Atheroma, then learning the patient's degree of Plaque rupture might still tell me something more about Plaque progression. (I'm learning from a descendant variable.)
  • If I don't know the patient's amount of Oxidative stress and inflammation, then learning the patient's degree of Plaque progression might well give me some extra information that helps me predict the patient's Platelets. (I'm not conditioning on the parents.)
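If it helps to see the four statements numerically, here is a minimal simulation sketch. The figure's exact graph is not reproduced here, so the code assumes a simplified, hypothetical graph that is consistent with the four bullets: Oxidative stress → Atheroma → Plaque progression → Plaque rupture, plus Oxidative stress → Platelets. The linear-Gaussian mechanisms, the 0.8 coefficients, and the names `simulate` and `partial_corr` are all made up for illustration. Partial correlation stands in for conditional dependence, which is only a faithful stand-in in a linear-Gaussian setting like this one.

```python
import numpy as np


def partial_corr(x, y, given=None):
    """Correlation of x and y after linearly regressing `given` out of both."""
    if given is not None:
        design = np.column_stack([np.ones_like(given), given])
        x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
        y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(x, y)[0, 1])


def simulate(n=20000, seed=0):
    """Sample from a hypothetical linear-Gaussian model (not the figure's real graph)."""
    rng = np.random.default_rng(seed)
    stress = rng.normal(size=n)                            # no parents
    atheroma = 0.8 * stress + rng.normal(size=n)           # parent: stress
    progression = 0.8 * atheroma + rng.normal(size=n)      # parent: atheroma
    rupture = 0.8 * progression + rng.normal(size=n)       # parent: progression
    platelets = 0.8 * stress + rng.normal(size=n)          # parent: stress
    return {
        "stress": stress,
        "atheroma": atheroma,
        "progression": progression,
        "rupture": rupture,
        "platelets": platelets,
    }


d = simulate()

# Implied by the CMC: these should come out close to zero.
print("platelets vs progression | stress:",
      partial_corr(d["platelets"], d["progression"], d["stress"]))
print("progression vs stress | atheroma:",
      partial_corr(d["progression"], d["stress"], d["atheroma"]))

# Allowed by the CMC: these can be clearly non-zero.
print("progression vs rupture | atheroma:",
      partial_corr(d["progression"], d["rupture"], d["atheroma"]))
print("platelets vs progression (no conditioning):",
      partial_corr(d["platelets"], d["progression"]))
```

The first two numbers correspond to the two independence statements, and the last two to the possibilities the CMC leaves open. With real data you would not know the graph or the functional form, so a proper conditional independence test would be needed in place of this shortcut.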