Solved – In causal inference in statistics, how do you interpret the consistency assumption in mathematical terms


In causal inference, the consistency assumption states that there are no multiple versions of treatment. Specifically, for the potential outcome $Y_i$ of unit $i$ and a binary treatment vector $\mathbf{Z}$,

$$Y_i(\mathbf{Z}) = Y_i(\mathbf{Z'}) \quad \forall \ \mathbf{Z}, \mathbf{Z'} : \mathbf{Z} = \mathbf{Z'}$$
The literature says that "This says that the mechanism used to assign the treatments does not matter and assigning the treatments in a different way does not constitute a different treatment."

I am wondering how to make sense of this equation. What is it actually trying to say?

Best Answer

Let me use $X$ for the treatment, $Y$ for the observed outcome and $Y(x)$ for the potential outcome under $X = x$.

Consistency means that for an individual $i$, his observed outcome $Y_i$ when $X_i = x$ is his potential outcome $Y_{i}(x)$. Or, more formally:

$$X_i = x \implies Y_i(x) = Y_i$$

When the treatment is binary ($X \in \{0,1\}$) consistency translates to the well known equation:

$$ Y_i = X_i Y_i(1) +(1-X_i)Y_i(0) $$

In an informal way, that's what people intend to convey when stating that the "way" $X$ is assigned doesn't matter, that is, that there aren't multiple versions of the treatment. For if there were multiple potential outcomes to the same $x$, when you see $X_i = x$, then which potential outcome is the observed outcome $Y_i$? And if it doesn't matter how the treatment is assigned, then when $X_i = x$, $Y_i(x)$ is well defined and usually assumed to be equal to $Y_i$ (but note that, even without multiple versions of the treatment, you still need to assume that $X_i = x \implies Y_i(x) = Y_i$).
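To make the binary case concrete, here is a minimal NumPy sketch (all potential outcomes and the treatment are simulated, so the names and distributions are purely illustrative). It constructs the observed outcome from the consistency equation $Y_i = X_i Y_i(1) + (1-X_i)Y_i(0)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical potential outcomes for n units. We only "know" both of them
# because we simulate them; in real data one is always missing.
y1 = rng.normal(loc=2.0, size=n)   # Y_i(1)
y0 = rng.normal(loc=0.0, size=n)   # Y_i(0)
x = rng.integers(0, 2, size=n)     # binary treatment X_i

# Consistency: the observed outcome is exactly the potential outcome
# corresponding to the treatment actually received.
y_obs = x * y1 + (1 - x) * y0
print(y_obs)
```

Note that among units with `x == 1`, `y_obs` coincides exactly with `y1`, and among units with `x == 0` it coincides with `y0`; that is the content of the equation above.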

What would happen without consistency?

Consistency is what connects the potential outcomes with the observed data. That is, it's consistency that allows us to write things like:

$$ E[Y(x)|X = x] = E[Y|X =x] $$

which transforms expressions involving the counterfactual quantity $Y(x)$ into expressions involving the observed quantity $Y$.
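The identity $E[Y(x) \mid X = x] = E[Y \mid X = x]$ can be checked in simulation (again with made-up potential outcomes, so everything here is illustrative). Among the treated, the observed outcome *is* $Y(1)$, so the two conditional means agree exactly, not just approximately:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
y1 = rng.normal(1.0, size=n)       # Y_i(1)
y0 = rng.normal(0.0, size=n)       # Y_i(0)
x = rng.integers(0, 2, size=n)     # treatment X_i
y = x * y1 + (1 - x) * y0          # consistency defines the observed Y

# E[Y(1) | X = 1] equals E[Y | X = 1] exactly, because among the treated
# the observed outcome coincides with Y(1).
lhs = y1[x == 1].mean()
rhs = y[x == 1].mean()
print(lhs, rhs)
```

No randomization is needed for this step; it is consistency alone doing the work (identifying $E[Y(x)]$ unconditionally is a separate matter requiring further assumptions).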

Without consistency, all of your potential outcomes data would be "missing". To make this clear, consider again a binary treatment. If consistency holds, when $X_i = 1$ you observe $Y_i = Y_i(1)$ and when $X_i = 0$ you observe $Y_i = Y_i(0)$. The other two potential outcomes are unobserved, so your potential outcomes table would look like this:

\begin{array} {|r|r|r|} \hline & Y_i(1) & Y_i(0) \\ \hline X_i=1 & Y_i & \text{unobserved} \\ \hline X_i=0 & \text{unobserved} & Y_i \\ \hline \end{array}

If consistency did not hold, this is what you would get --- you don't observe any potential outcome:

\begin{array} {|r|r|r|} \hline & Y_i(1) & Y_i(0) \\ \hline X_i=1 & \text{unobserved} & \text{unobserved} \\ \hline X_i=0 & \text{unobserved} & \text{unobserved} \\ \hline \end{array}
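The first table above (consistency holds) can be built directly in pandas, masking the potential outcome that is never observed. Everything here is simulated and the column names are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 4
y1 = rng.normal(2.0, size=n)       # Y_i(1)
y0 = rng.normal(0.0, size=n)       # Y_i(0)
x = rng.integers(0, 2, size=n)     # treatment X_i

# Under consistency we observe Y_i(1) when X_i = 1 and Y_i(0) when X_i = 0;
# the other potential outcome is missing (NaN), as in the table above.
table = pd.DataFrame({
    "X": x,
    "Y(1)": np.where(x == 1, y1, np.nan),
    "Y(0)": np.where(x == 0, y0, np.nan),
})
print(table)
```

Without consistency, both potential-outcome columns would be NaN for every row: the observed $Y_i$ would tell you nothing about either $Y_i(1)$ or $Y_i(0)$.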

Consistency, potential outcomes, axioms of counterfactuals, and structural causal models

In some of the potential outcomes literature, consistency is not explicitly defined and is instead lumped together with other substantive assumptions such as SUTVA. In the axiomatization of counterfactuals, consistency appears as a corollary of the composition axiom. However, once you properly define a structural causal model (SCM) and define counterfactuals as derived from interventional submodels, consistency is simply a natural consequence that automatically holds for all SCMs (see Galles and Pearl, and also Chapter 7 of Pearl's Causality).

Finally, whether consistency really "holds" when your model is compared to the real world is a practical/modeling issue. That is, to make any inference you always need consistency, so what you are questioning is not the rule itself but your modeling assumptions. For example, have you properly defined the treatment assignment $X$? Or are there aspects of how $X$ is assigned that matter for the outcome but that you did not model? These are the questions you need to think about to judge whether your model is a good approximation of the problem you are investigating.
