Causal inference and the backdoor-criterion

backdoor-path, causal-diagram, causality, d-separation, multiple-regression

I am trying to determine whether the authors of the following report overlooked an important factor, or whether I am the one who has missed something.
The report in question is: Bias Correction For Paid Search In Media Mix Modeling

The authors first state the back-door criterion, which is one of the criteria that can be used to determine whether an estimate is indeed causal.

"(The back-door criterion, Pearl 2013) Given a causal diagram, a set of variables $Z$ satisfies the
back-door criterion relative to an ordered pair of variables $(X, Y)$ in the diagram if: 1) no node in $Z$ is a
descendant of $X$; and 2) $Z$ “blocks” every path between $X$ and $Y$ that contains an arrow into $X$."


Then when considering figure 4.2 from the same report:
[Figure 4.2 of the report: causal diagram]

The authors state, in relation to this graph:

"Remark 5. Note that $(X1, V)$ does not satisfy the back-door criterion for $X2 → Y$, since the path $X2 ←
consumer \ demand → \epsilon_2$
is not blocked. For example, $X2$ may represent social media ad spend. This suggests
that the causal effect of $X2$ on sales cannot be estimated consistently by observations on $(Y, X1, X2, V)$ only."

While this is indeed correct, it leads one to believe that if we could remove $\epsilon_0$ from the outcome, we would be able to identify the causal effect of $X2$ accurately.

Notice, however, that search queries $V$ is a direct descendant of $X2$, so the back-door criterion does not hold anyway. $X1$ is also a descendant of $X2$, so we have two violations of the first condition of the back-door criterion.


Question: Have the authors overlooked the fact that search queries $V$ is a descendant of $X2$, or am I missing something? And secondly, how can this be solved?

Best Answer

Welcome to CV.SE!

$V$ is indeed a descendant of $X_2$, as is $X_1$, which violates the first condition of the back-door criterion. Additionally, $X_1$ is a collider (i.e. $c$ in $x\rightarrow c \leftarrow y$) in some of the paths going through the $\textrm{budget}$ element (e.g. $X_2 \leftarrow \textrm{budget} \rightarrow X_1 \leftarrow V \rightarrow \textrm{organic search} \rightarrow Y$), so it must not be in $Z$.
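
To make the descendant check concrete, here is a minimal Python sketch. The edge list is my own reconstruction of the diagram from the paths mentioned in this thread; the node labels (`consumer_demand`, `organic_search`, and so on) are hypothetical, and the report's actual figure may contain more nodes or edges. Treat it as an illustration, not as the report's graph.

```python
# ASSUMPTION: edges reconstructed from the paths discussed in this thread,
# not copied from the report's Figure 4.2. Each pair is (parent, child).
EDGES = [
    ("consumer_demand", "X2"),
    ("consumer_demand", "V"),
    ("consumer_demand", "Y"),
    ("budget", "X1"),
    ("budget", "X2"),
    ("X2", "V"),
    ("V", "X1"),
    ("V", "organic_search"),
    ("organic_search", "Y"),
    ("X1", "Y"),
    ("X2", "Y"),
]


def descendants(edges, node):
    """Return every node reachable from `node` by following arrows forward."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Condition 1 of the back-door criterion: no member of Z may descend from X.
candidate_z = {"X1", "V"}
violations = candidate_z & descendants(EDGES, "X2")
print(sorted(violations))  # ['V', 'X1']
```

Under these assumed edges, both members of the candidate set fail condition 1, which matches the observation in the question.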

Pertaining to your second question: if there's a back-door path which cannot be blocked (i.e. d-separated) by any set of observed variables, that means that you can't establish a causal relationship by observation alone; you need to set up an experiment where you explicitly intervene on some variables and watch what happens with the others.

In the example, one experiment might comprise intervening on the search queries $V$ independently of the non-search contributors $X_2$ and the consumer demand, which would break the parental relationship among them. Then, conditioning on $Z = \{\textrm{consumer demand}, \textrm{budget}\}$ ($X_1$ is a collider in some of the paths, so it can't be in $Z$) would allow you to establish the causal relationship between $X_2$ and $Y$.
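
The reasoning above can be sketched in Python. This is only an illustration under stated assumptions: the edge list is my reconstruction from the paths discussed here (not the report's exact figure), the helper names are hypothetical, and the check treats consumer demand and budget as measurable. d-separation is tested with the standard moralized ancestral graph construction, and the intervention on $V$ is modeled by deleting every arrow into $V$.

```python
from itertools import combinations

# ASSUMPTION: edges reconstructed from this thread; each pair is (parent, child).
EDGES = [
    ("consumer_demand", "X2"), ("consumer_demand", "V"), ("consumer_demand", "Y"),
    ("budget", "X1"), ("budget", "X2"),
    ("X2", "V"), ("V", "X1"), ("V", "organic_search"),
    ("organic_search", "Y"), ("X1", "Y"), ("X2", "Y"),
]


def _reach(edges, start, forward):
    """Nodes reachable from `start` along (forward) or against arrows."""
    step = {}
    for p, c in edges:
        a, b = (p, c) if forward else (c, p)
        step.setdefault(a, set()).add(b)
    seen, stack = set(), list(start)
    while stack:
        for n in step.get(stack.pop(), ()):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return seen


def d_separated(edges, x, y, z):
    """True if z d-separates x and y (moralized ancestral graph test)."""
    keep = {x, y} | set(z)
    keep |= _reach(edges, keep, forward=False)  # add all ancestors
    nbrs = {n: set() for n in keep}
    parents = {}
    for p, c in edges:
        if p in keep and c in keep:
            nbrs[p].add(c)
            nbrs[c].add(p)
            parents.setdefault(c, set()).add(p)
    for ps in parents.values():  # moralize: link parents sharing a child
        for a, b in combinations(ps, 2):
            nbrs[a].add(b)
            nbrs[b].add(a)
    seen, stack = {x}, [x]
    while stack:  # undirected search that never passes through z
        for n in nbrs[stack.pop()]:
            if n == y:
                return False
            if n not in seen and n not in z:
                seen.add(n)
                stack.append(n)
    return True


def satisfies_backdoor(edges, x, y, z):
    """Check both conditions of the back-door criterion for (x, y) and z."""
    if set(z) & _reach(edges, [x], forward=True):
        return False  # condition 1: z contains a descendant of x
    # Removing arrows out of x leaves only paths with an arrow into x.
    return d_separated([(p, c) for p, c in edges if p != x], x, y, set(z))


def intervene(edges, node):
    """Model an intervention on `node` by cutting all arrows into it."""
    return [(p, c) for p, c in edges if c != node]


print(satisfies_backdoor(EDGES, "X2", "Y", {"X1", "V"}))  # False
print(satisfies_backdoor(intervene(EDGES, "V"), "X2", "Y",
                         {"consumer_demand", "budget"}))  # True
```

Note that the second check only passes if consumer demand can actually be conditioned on; the report's remark suggests it is not among the observed variables $(Y, X_1, X_2, V)$.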