Independence – Does Statistical Independence Mean Lack of Causation?

bayesian network, causality, dag, independence

Two random variables $A$ and $B$ are statistically independent. In the DAG of the process this means $(A {\perp\!\!\!\perp} B)$, and of course $P(A|B)=P(A)$. But does that also mean that there is no front-door path from $B$ to $A$?

Because then we should get $P(A|do(B))=P(A)$. So if that's the case, does statistical independence automatically mean lack of causation?

Best Answer

So if that's the case, does statistical independence automatically mean lack of causation?

No, and here's a simple counterexample with a multivariate normal:

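set.seed(100)
n <- 1e6
# path coefficients: z -> x (a), x -> y (b), z -> y (-c)
a <- 0.2
b <- 0.1
c <- 0.5
z <- rnorm(n)
# noise scales are chosen so that var(x) = var(y) = 1
x <- a*z + sqrt(1-a^2)*rnorm(n)
y <- b*x - c*z + sqrt(1- b^2 - c^2 +2*a*b*c)*rnorm(n)
# sample correlation; the population value is b - a*c = 0
cor(x, y)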

The corresponding graph has three edges: $z \to x$, $z \to y$, and $x \to y$.

Here $x$ and $y$ are marginally independent (in the multivariate normal case, zero correlation implies independence). This happens because the backdoor path via $z$ exactly cancels out the direct path from $x$ to $y$, that is, $cov(x,y) = b - ac = 0.1 - 0.2 \times 0.5 = 0$. Thus $E[Y|X=x] = E[Y] = 0$. Yet $x$ directly causes $y$: we have $E[Y|do(X=x)] = bx$, which differs from $E[Y]=0$ whenever $x \neq 0$.
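
As a rough check of the claim above, the sketch below simulates the same model twice: once observationally and once under $do(X = x_0)$, where the intervention is implemented by setting $x$ to a constant so that the edge $z \to x$ is cut. The names `simulate_do`, `obs_slope`, `adj_slope` and `do_means` are made up for this illustration, and the smaller sample size is only there to keep it fast.

```r
# Sketch: observational vs. interventional behaviour of the model above.
set.seed(100)
n <- 1e5
a <- 0.2; b <- 0.1; c <- 0.5
s <- sqrt(1 - b^2 - c^2 + 2*a*b*c)   # residual sd of y, as in the original code

# Observational regime: x is generated from z
z <- rnorm(n)
x <- a*z + sqrt(1 - a^2)*rnorm(n)
y <- b*x - c*z + s*rnorm(n)

obs_slope <- unname(coef(lm(y ~ x))["x"])      # marginal slope, close to 0
adj_slope <- unname(coef(lm(y ~ x + z))["x"])  # slope after adjusting for z, close to b

# Interventional regime: do(X = x0) fixes x, so z no longer influences it
simulate_do <- function(x0) {
  z <- rnorm(n)
  y <- b*x0 - c*z + s*rnorm(n)
  mean(y)                                      # estimate of E[Y | do(X = x0)]
}
do_means <- sapply(c(-2, 0, 2), simulate_do)   # roughly b * x0 for each x0

print(c(obs_slope = obs_slope, adj_slope = adj_slope))
print(do_means)
```

The marginal regression slope should come out near zero, while both the $z$-adjusted slope and the simulated interventional means should track $b$, which is the sense in which $x$ causes $y$ despite the zero correlation.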

Associations, interventions and counterfactuals

I think it's important to make some clarifications here regarding associations, interventions and counterfactuals.

Causal models entail statements about the behavior of the system: (i) under passive observations, (ii) under interventions, as well as (iii) counterfactuals. And independence on one level does not necessarily translate to the others.

As the example above shows, we can have no association between $X$ and $Y$, that is, $P(Y|X) = P(Y)$, and it can still be the case that manipulating $X$ changes the distribution of $Y$, that is, $P(Y|do(x)) \neq P(Y)$.

Now, we can go one step further. We can have causal models where intervening on $X$ does not change the population distribution of $Y$, but that does not mean a lack of counterfactual causation! That is, even though $P(Y|do(x)) = P(Y)$, for every individual their outcome $Y$ would have been different had you changed their $X$. This is precisely the case described by user20160, as well as in my previous answer here.
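
To make this concrete, here is a minimal sketch of one standard construction with this property. It is an assumption chosen for illustration, not necessarily the example from the answers referenced above. It uses a binary $X$, an unobserved fair coin $U$, and the structural equation $Y := X \oplus U$; the function `f` and the variable names are hypothetical.

```r
# Sketch: no population-level effect of X, yet X changes Y for every individual.
set.seed(100)
n <- 1e5
u <- rbinom(n, 1, 0.5)                     # unobserved individual-level factor

# Assumed structural equation: Y := X XOR U
f <- function(x, u) as.integer(xor(x, u))

y_do0 <- f(0, u)                           # each individual's outcome under do(X = 0)
y_do1 <- f(1, u)                           # each individual's outcome under do(X = 1)

mean_do0 <- mean(y_do0)                    # estimate of P(Y = 1 | do(X = 0)), about 0.5
mean_do1 <- mean(y_do1)                    # estimate of P(Y = 1 | do(X = 1)), about 0.5
share_changed <- mean(y_do0 != y_do1)      # fraction whose outcome flips with X

print(c(mean_do0 = mean_do0, mean_do1 = mean_do1, share_changed = share_changed))
```

Both interventional means sit near 0.5, so $P(Y|do(x))$ does not depend on $x$, yet flipping $X$ flips $Y$ for every simulated individual.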

These three levels form a hierarchy of causal inference tasks, ordered by the information needed to answer queries at each level.