Causality – Can Controlling for a Variable Block a Backdoor Path Opened by Controlling for a Collider?

Tags: causal-diagram, causality, dag

I have made the following model in DAGitty:

[Figure: DAGitty model with T → X1, T → Y, X1 → X2, and Y → X2]

Where X2 is controlled for.

DAGitty says:

The total effect cannot be estimated due to adjustment for an intermediate or a descendant of an intermediate.

My understanding is that I can estimate the total effect if I also control for X1, in addition to X2, as that would block the backdoor path opened by controlling for X2.

Question: Is it possible to recover the treatment effect after controlling for X2, e.g. by closing the opened backdoor path through additionally controlling for X1?

Best Answer

The short answer is no: controlling for x1 in this case may mitigate the bias, but it does not eliminate it. The problem arises because x2 is directly caused by the outcome: controlling for x1 blocks the collider path t -> x1 -> x2 <- y, but conditioning on x2 still amounts to conditioning on a noisy function of y itself, which distorts the association between t and y. You can see this in a simple simulation (Stata code):

clear 
set obs 1000000

gen t = rnormal()
gen x1 = rnormal() + .5*t
gen y = rnormal() + .5*t
gen x2 = rnormal() + .5*x1 + .5*y 

* .5 is the true effect of t on y 
reg y t 

* condition on x2 ~ estimate t -> y =~ .33 
reg y t x2 

* condition additionally on x1 ~ estimate t -> y =~ .4
reg y t x2 x1
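
These point estimates are not accidental: with all error variances equal to 1, the model implies the population covariances

$$\operatorname{Var}(t)=1,\quad \operatorname{Cov}(t,y)=\operatorname{Cov}(t,x_2)=0.5,\quad \operatorname{Var}(x_2)=1.75,\quad \operatorname{Cov}(y,x_2)=0.75,$$

so the population OLS coefficient on t in the regression of y on (t, x2) is

$$\beta_t=\frac{\operatorname{Cov}(t,y)\operatorname{Var}(x_2)-\operatorname{Cov}(t,x_2)\operatorname{Cov}(y,x_2)}{\operatorname{Var}(t)\operatorname{Var}(x_2)-\operatorname{Cov}(t,x_2)^2}=\frac{0.5\cdot 1.75-0.5\cdot 0.75}{1.75-0.25}=\frac{1}{3},$$

and solving the analogous 3-by-3 normal equations with x1 added gives $\beta_t = 0.4$: closer to the true 0.5, but still biased.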

Incidentally, Pearl discusses a similar case, where x2, instead of being caused by y, shares a common cause with y. In that case, controlling for x1 does enable you to obtain the correct treatment effect (see the sketch below). See the discussion of model 15 (and similar cases in models 14-17) in: Cinelli, Forney, and Pearl (2022), "A Crash Course in Good and Bad Controls", Sociological Methods & Research.
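
Here is a minimal sketch of that variant, adapting the simulation above. The common cause u is introduced purely for illustration and is not part of the model in the question:

clear 
set obs 1000000

gen t  = rnormal()
gen x1 = rnormal() + .5*t
gen u  = rnormal()
gen y  = rnormal() + .5*t + .5*u
gen x2 = rnormal() + .5*x1 + .5*u

* conditioning on x2 alone opens t -> x1 -> x2 <- u -> y: biased
reg y t x2 

* additionally conditioning on x1 blocks that path ~ estimate t -> y =~ .5
reg y t x2 x1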

Having said that, it may still be possible to obtain unbiased treatment effect estimates through inverse probability weighting or similar. In the current case, let's imagine that x2 is a binary dropout indicator (0 = remained in study). We can then weight the regression by weights = 1/p(x2 = 0 | x1, y), the inverse probability of remaining in the study, which you can obtain from a logistic regression of x2 on y and x1 or similar. Note that this depends on having fully observed data for y and x1 regardless of the value of x2: if you don't observe anything when x2 == 1, then the solutions are considerably more complex or may be infeasible. See the following simulation for a demonstration:

* Alternative with binary x2

clear 
set obs 1000000

gen t = rnormal()
gen x1 = rnormal() + .5*t
gen y = rnormal() + .5*t
gen x2 = rbinomial(1,invlogit(x1 + y))  

* .5 is the true effect of t on y 
reg y t 

* compute inverse probability weights for remaining in the study (x2 == 0)
logit x2 y x1
predict px2, pr
gen ipwx2 = 1/(1 - px2)

* Unweighted regression is biased as before
reg y t x2 

* weighted regressions with x2 as a covariate, adjusting for x1 doesn't matter either way
reg y t x2 [pweight = ipwx2]
reg y t x2 x1 [pweight = ipwx2]

* unweighted/weighted regressions with only x2 == 0 (i.e. if x2 is dropout)
* unweighted regression is biased
reg y t if x2 == 0
* weighted regression ~ unbiased
reg y t if x2 == 0 [pweight = ipwx2]
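
As a hypothetical illustration of the caveat above (px2_obs and ipw_obs are just illustrative names), suppose y really is unobserved whenever x2 == 1. The weight model can then no longer condition on y, and weights built only from fully observed predictors do not remove the bias, because dropout depends directly on y:

* suppose y is missing for dropouts
replace y = . if x2 == 1

* weights can now use only fully observed predictors (x1, t)
logit x2 x1 t
predict px2_obs, pr
gen ipw_obs = 1/(1 - px2_obs)

* still biased: selection depends on y itself, which the weights ignore
reg y t [pweight = ipw_obs]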