A “directed path” in the context of causal graphs

Tags: causal-diagram, causality, dag, graph theory, graphical-model

I am going through Causal Inference In Statistics by Pearl and I have come across the definition of path and directed path (section 1.4, page 25).

Path: A path between two nodes $X$ and $Y$ is a sequence of nodes
beginning with $X$ and ending with $Y$, in which each node is
connected to the next by an edge.

Directed path: A path between two nodes is a directed path if it can
be traced along the arrows, that is, if no node on the path has two
edges on the path directed into it, or two edges directed out of it.

I am confused by the phrasing of the highlighted portion of the definition of "directed path". Getting this right matters for the definition of the front-door criterion (Definition 3.4.1, Section 3.4, page 69):

A set of variables $Z$ is said to satisfy the front-door criterion
relative to an ordered pair of variables $(X,Y)$ if

  1. $Z$ intercepts all directed paths from $X$ to $Y$.
  2. There is no backdoor path from $X$ to $Z$.
  3. All backdoor paths from $Z$ to $Y$ are blocked by $X$.
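To make condition 1 concrete, here is a minimal Python sketch of what "intercepts all directed paths" could mean if a directed path is taken to be a cycle-free walk along the arrows. The graph `front_door_edges`, the unobserved node `U`, and both function names are hypothetical and chosen only for illustration; they are not taken from the book or from the figure.

```python
def directed_paths(edges, source, target, prefix=None):
    """Yield every path source -> ... -> target that follows the arrows
    and never visits a node twice."""
    prefix = (prefix or []) + [source]
    if source == target:
        yield prefix
        return
    for tail, head in edges:
        if tail == source and head not in prefix:
            yield from directed_paths(edges, head, target, prefix)


def intercepts_all_directed_paths(edges, x, y, z_set):
    """Condition 1 only: every directed path from x to y contains an
    intermediate node from z_set (vacuously true if no such path exists)."""
    return all(
        any(node in z_set for node in path[1:-1])
        for path in directed_paths(edges, x, y)
    )


# Hypothetical front-door style graph: X -> Z -> Y, with U -> X and U -> Y.
front_door_edges = {("X", "Z"), ("Z", "Y"), ("U", "X"), ("U", "Y")}

print(list(directed_paths(front_door_edges, "X", "Y")))   # [['X', 'Z', 'Y']]
print(intercepts_all_directed_paths(front_door_edges, "X", "Y", {"Z"}))  # True

# Adding a direct arrow X -> Y creates a directed path that bypasses Z.
with_direct_edge = front_door_edges | {("X", "Y")}
print(intercepts_all_directed_paths(with_direct_edge, "X", "Y", {"Z"}))  # False
```

Conditions 2 and 3 concern backdoor paths and are not checked by this sketch.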

Counterexamples would also help clarify this. For example, does $X \rightarrow Z \rightarrow Y$ in the figure below qualify as a directed path, or does it violate the "two edges directed in/out" criterion?
[Figure: causal diagram containing the path $X \rightarrow Z \rightarrow Y$ (image not available)]

Note: I could not rely on the Wikipedia definition or other sources because, as far as I can tell, the literature has no consensus definition of a directed path; it seems to be context dependent.

Best Answer

I haven't read Pearl's book, but have learned my share of graph theory.

A path between two nodes is a directed path if it can be traced along the arrows, that is, if no node on the path has two edges on the path directed into it, or two edges directed out of it.

The "that is" is inconsistent, because the two clauses it joins are not equivalent. A path that "can be traced along the arrows" may include loops (directed cycles), whereas "if no node on the path has two edges on the path directed into it, or two edges directed out of it" excludes them. These are two different conditions.

In the example, $X\to Z\to Y$ would be a directed path under both definitions.

However, assume there was an additional arrow from $Z$ back to $X$. Then $X\to Z\to X\to Z\to Y$ would be a directed path under the first definition ("it can be traced along the arrows"), but not under the second ("no node on the path has two edges on the path directed into it, or two edges directed out of it": $Z$ has two edges on the path directed out).
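The difference between the two readings can be checked mechanically. Below is a small Python sketch, assuming a graph is given as a set of `(tail, head)` arrows and a path as a list of nodes. The function names are my own, and the second function is a literal reading of the quoted clause, not necessarily what Pearl intends.

```python
def follows_arrows(path, edges):
    """First reading: each step u, v on the path is an arrow u -> v."""
    return all((u, v) in edges for u, v in zip(path, path[1:]))


def no_double_in_or_out(path, edges):
    """Second reading: no node has two distinct edges of the path
    directed into it, or two directed out of it."""
    used = set()
    for u, v in zip(path, path[1:]):
        if (u, v) in edges:
            used.add((u, v))      # step taken along the arrow
        elif (v, u) in edges:
            used.add((v, u))      # step taken against the arrow
        else:
            return False          # the nodes are not adjacent: not a path
    for node in set(path):
        n_in = sum(1 for _, head in used if head == node)
        n_out = sum(1 for tail, _ in used if tail == node)
        if n_in > 1 or n_out > 1:
            return False
    return True


edges = {("X", "Z"), ("Z", "Y")}
simple_path = ["X", "Z", "Y"]
print(follows_arrows(simple_path, edges))        # True
print(no_double_in_or_out(simple_path, edges))   # True

# Add the extra arrow Z -> X and walk around the loop once.
with_back_edge = edges | {("Z", "X")}
loopy_walk = ["X", "Z", "X", "Z", "Y"]
print(follows_arrows(loopy_walk, with_back_edge))       # True
print(no_double_in_or_out(loopy_walk, with_back_edge))  # False: Z -> X and Z -> Y both leave Z
```

The simple path passes both checks, while the walk around the loop passes only the first, which is exactly the discrepancy described above.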

So the answer is clear for the example you are given. For the rest of the book, you should be a little careful if you encounter graphs that contain loops. A true DAG has no directed cycles, so there the two readings agree for any path traced along the arrows. I would assume that Pearl really meant the second definition: a "directed path" for him is probably a "loopless directed path".
