Isn’t strong ignorability an incorrect assumption in complex causal structures?

causal-diagram, causality, graphical-model, treatment-effect

I have seen that many papers and competitions on causal inference assume strong ignorability:

$Y^{x}\perp X\mid V$, where $X$ is the treatment, $Y$ the outcome, and $V$ the set of all covariates (all other variables).

Example –

This assumption is called the assumption of "no unobserved confounders". But does it not also assume a structure of the following type (take $V$ to be a single variable in the figure below)?

[Figure: causal diagram over $X$, $Y$, and a single covariate $V$]

What if the causal structure is instead as below (with $V = \{L,Z\}$)?

[Figure: causal diagram over $X$, $Y$, $L$, and $Z$]

In this case, given $\{L,Z\}$, $X$ and $Y$ are not independent. Note that it would be wrong to say such structures do not appear: in data science we can easily face complex causal structures in which conditioning on all variables opens up one or more biasing paths.

Given this context, my questions are:

  1. In the presence of such complex structures, does the condition $Y^x \perp X\mid V$ not fail?
  2. If the strong ignorability condition really fails for such collider-containing structures, why is it so widespread in competitions and research papers? I have seen this particularly in social science and healthcare. Is it because such structures are not expected to arise in those fields, since only known confounders are included in the dataset?
  3. The papers and competitions mentioned above propose good estimators, or benchmarks for testing them. Should someone who expects to face such complex structures ignore the results of resources that make this assumption? Or should they simply replace "all covariates" with "all confounders"? After all, if the causal structure is known, one can always find the confounders for the effect of $X$ on $Y$ and then use the best estimators the papers propose or benchmark.

Best Answer

The assumption of strong ignorability is that there exists a set of variables $W$, possibly a subset of all measured variables $V$, such that $Y^X \perp X \mid W$. It does not say that $Y^X \perp X \mid V$, i.e., that the potential outcomes are independent of treatment given all measured variables $V$. To meet this assumption, one has to find the set $W$. In your example, $W$ consists only of $L$ and does not include $Z$. So, strong ignorability is not met when using $W = \{L, Z\}$ but it is met when using $W = \{L\}$. We call $W$ a "sufficient adjustment set". Note that strong ignorability says nothing about how to construct or find $W$ or which sets of variables are allowable to satisfy the assumption. It is merely the assumption that there is a set of such variables. Estimators that rely on strong ignorability (and not all estimators of causal effects do) then use $W$ to estimate the effect.
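To make concrete how an estimator "uses $W$", here is the standard adjustment (g-formula) identity, written for a discrete $W$ and a fixed treatment level $x$. It relies on two further assumptions that are not spelled out above, consistency ($Y = Y^x$ whenever $X = x$) and positivity ($P(X = x \mid W = w) > 0$ for every $w$ with $P(W = w) > 0$), so treat it as a sketch under those assumptions:

$$
\begin{aligned}
E[Y^x] &= \sum_w E[Y^x \mid W = w]\,P(W = w) \\
       &= \sum_w E[Y^x \mid X = x, W = w]\,P(W = w) && \text{(ignorability given } W\text{)} \\
       &= \sum_w E[Y \mid X = x, W = w]\,P(W = w) && \text{(consistency)}
\end{aligned}
$$

The second equality is exactly where $Y^x \perp X \mid W$ is needed. If $W$ is replaced by a set for which the independence fails, that step is no longer justified and the final expression need not equal $E[Y^x]$.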

The theory used to identify which variables form a sufficient adjustment set, i.e., which set of variables to include in $W$ to meet strong ignorability, is DAG theory. DAG theory says not to include $Z$ in $W$ precisely because doing so opens up a backdoor path; conditioning on a collider induces a non-causal association between the antecedents of the collider.
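The collider statement can be checked numerically. Below is a minimal sketch using only the Python standard library; the variable names `a`, `b`, `z` and the linear-Gaussian data-generating process are assumptions chosen for illustration, not the structure from the question's figure. Two independent causes `a` and `b` both feed into a collider `z`, and the code compares their marginal correlation with their partial correlation given `z`.

```python
import math
import random


def pearson(u, v):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    var_u = sum((x - mu) ** 2 for x in u)
    var_v = sum((y - mv) ** 2 for y in v)
    return cov / math.sqrt(var_u * var_v)


def residualize(target, given):
    """Residuals from a simple linear regression of `target` on `given`."""
    n = len(target)
    mt, mg = sum(target) / n, sum(given) / n
    slope = (sum((g - mg) * (t - mt) for g, t in zip(given, target))
             / sum((g - mg) ** 2 for g in given))
    return [(t - mt) - slope * (g - mg) for g, t in zip(given, target)]


def collider_demo(n=20000, seed=0):
    """Hypothetical structure: a -> z <- b, with a and b independent."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    z = [ai + bi + rng.gauss(0, 1) for ai, bi in zip(a, b)]
    marginal = pearson(a, b)
    # Partial correlation given z: correlate what is left of a and b
    # after linearly removing z from each.
    partial = pearson(residualize(a, z), residualize(b, z))
    return marginal, partial


if __name__ == "__main__":
    marginal, partial = collider_demo()
    print(f"corr(a, b)     = {marginal:.3f}")
    print(f"corr(a, b | z) = {partial:.3f}")
```

Under this particular setup the population marginal correlation is 0 and the population partial correlation given `z` is -0.5, so the simulated values should land close to those: `a` and `b` are unrelated until `z` is conditioned on.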

In an ideal observational study, a researcher should identify a set of variables to collect that form a sufficient adjustment set. In practice, researchers often have a dataset with many variables already measured, and they have to decide which ones among them belong in $W$, which do not belong in $W$ (e.g., colliders), and whether there is a set of variables that forms a sufficient adjustment set. My sense is that most studies do this to some extent, which is why you might not have to worry about studies being invalidated by conditioning on colliders. One way to avoid conditioning on colliders is to only condition on variables measured before treatment (i.e., "pre-treatment covariates"); these cannot be caused by either the treatment or the outcome and so are less likely to induce a violation of strong ignorability (though pre-treatment colliders of unmeasured pre-treatment causes of the treatment and outcome can still be a problem; this is called "M-bias").
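M-bias can be illustrated with a small simulation. This is a sketch under assumed conditions, not a reconstruction of any particular study: the linear structural equations, the unit coefficients, the true effect of 1.0, and the names `u1`, `u2`, `l`, `z` are all hypothetical. Here `l` is a measured confounder, `z` is a pre-treatment collider of the unmeasured `u1` (a cause of treatment) and `u2` (a cause of the outcome), and the effect of `x` on `y` is estimated by least squares with and without `z` in the adjustment set.

```python
import random


def ols(y, columns):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    cols = [[1.0] * n] + list(columns)
    k = len(cols)
    # Augmented matrix [X'X | X'y].
    a = [[sum(ci * cj for ci, cj in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(ci * yi for ci, yi in zip(cols[i], y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [vr - f * vi for vr, vi in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def m_bias_demo(n=20000, seed=0, true_effect=1.0):
    """Assumed structure: l -> x, l -> y, u1 -> x, u1 -> z <- u2, u2 -> y, x -> y."""
    rng = random.Random(seed)
    l = [rng.gauss(0, 1) for _ in range(n)]
    u1 = [rng.gauss(0, 1) for _ in range(n)]  # unmeasured
    u2 = [rng.gauss(0, 1) for _ in range(n)]  # unmeasured
    z = [a + b + rng.gauss(0, 1) for a, b in zip(u1, u2)]   # pre-treatment collider
    x = [a + b + rng.gauss(0, 1) for a, b in zip(l, u1)]    # treatment
    y = [true_effect * xi + a + b + rng.gauss(0, 1)
         for xi, a, b in zip(x, l, u2)]                     # outcome
    coef_l = ols(y, [x, l])[1]        # W = {l}
    coef_lz = ols(y, [x, l, z])[1]    # W = {l, z}
    return coef_l, coef_lz


if __name__ == "__main__":
    coef_l, coef_lz = m_bias_demo()
    print(f"effect estimate adjusting for l only:  {coef_l:.3f}")
    print(f"effect estimate adjusting for l and z: {coef_lz:.3f}")
```

With these assumed coefficients, adjusting for `l` alone targets the true effect of 1.0, while adding `z` shifts the population regression coefficient to 0.8, even though `z` is measured before treatment. The size and sign of the bias depend entirely on the assumed coefficients.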
