Causal discovery with structured unobserved confounders and proxy variables

causality

I face a problem of causal discovery with latent confounders.

Unlike the latent variables handled by FCI (the Fast Causal Inference algorithm), the latent confounders in my problem are known to have a specific structure.

Specifically, we know that every latent confounder ($U$) is a child (or a parent) of some observed variable ($P$).

For example:

$P \rightarrow U$, $U \rightarrow X$, $U \rightarrow Y$

where $X$ and $Y$ are two observed variables. I am interested in whether their correlation is due to the latent confounder $U$ or to causation, i.e. $X \rightarrow Y$ or $Y \rightarrow X$. $U$ is not observed, but we know that $U$ is a child (or a parent) of an observed variable $P$.

My thinking is this: if the correlation between $X$ and $Y$ is due to $U$, then conditioning on the parent of $U$, i.e. $P$, may weaken that correlation. This could help us discriminate whether the correlation between $X$ and $Y$ is due to the latent confounder $U$ or to causation.
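To make the intuition concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: linear Gaussian mechanisms, unit coefficients, unit noise variances, and the helper names `partial_corr` and `simulate`. It compares the raw correlation of $X$ and $Y$ with their partial correlation given $P$ in two scenarios: pure confounding ($P \rightarrow U$, $U \rightarrow X$, $U \rightarrow Y$) and pure causation ($X \rightarrow Y$, with $P$ unrelated to both).

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]


def simulate(n=100_000, seed=0):
    """Return (raw corr, partial corr given P) for two assumed scenarios."""
    rng = np.random.default_rng(seed)
    p = rng.normal(size=n)

    # Scenario A (assumed): P -> U, U -> X, U -> Y, no edge between X and Y.
    u = p + rng.normal(size=n)
    x_a = u + rng.normal(size=n)
    y_a = u + rng.normal(size=n)

    # Scenario B (assumed): X -> Y, and P is independent of both.
    x_b = rng.normal(size=n)
    y_b = x_b + rng.normal(size=n)

    return {
        "confounded": (np.corrcoef(x_a, y_a)[0, 1], partial_corr(x_a, y_a, p)),
        "causal": (np.corrcoef(x_b, y_b)[0, 1], partial_corr(x_b, y_b, p)),
    }


if __name__ == "__main__":
    for name, (raw, partial) in simulate().items():
        print(f"{name}: raw={raw:.3f}, partial given P={partial:.3f}")
```

Under these particular assumptions the population correlation in the confounded scenario is $2/3$ and the partial correlation given $P$ is $1/2$: it shrinks but does not vanish, because $P$ explains only part of the variation in $U$. In the causal scenario conditioning on $P$ leaves the correlation unchanged. So shrinkage suggests a path through $U$, but a nonzero partial correlation does not by itself establish causation.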

I don't know whether this intuition is correct. Is my problem related to proxy variables? I would like a comprehensive understanding, so please cite related papers if you know of any.

Best Answer

I reviewed some papers and found answers to most of my questions. I am posting my findings here in case they help others with similar confusion.

If we have a causal diagram like:

$U \rightarrow X$, $U \rightarrow Y$, $U \rightarrow L$

where $U$ is an unobserved confounder of $X$ and $Y$, and $L$ is an observed child of $U$.

This can be formulated as a mismeasured-confounder problem, where $L$ is a surrogate for $U$ measured with error. See Section 9.3 of Hernán and Robins [Causal Inference: What If] for a detailed description.
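The practical consequence of treating $L$ as a mismeasured version of $U$ can be sketched with a small simulation. The setup is an assumption made purely for illustration: linear Gaussian mechanisms with unit coefficients and unit noise variances, no causal effect of $X$ on $Y$, and the hypothetical helper names `x_coefficient` and `proxy_adjustment_demo`. It compares the regression coefficient of $X$ when adjusting for nothing, for the proxy $L$, and for the (normally unavailable) confounder $U$.

```python
import numpy as np


def x_coefficient(y, x, *covariates):
    """OLS coefficient of x in a regression of y on x and the covariates."""
    design = np.column_stack([np.ones_like(x), x, *covariates])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def proxy_adjustment_demo(n=200_000, seed=0):
    """Assumed model: U -> X, U -> Y, U -> L, and no X -> Y effect."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)
    x = u + rng.normal(size=n)
    y = u + rng.normal(size=n)      # the true effect of X on Y is zero
    l = u + rng.normal(size=n)      # L is U measured with error
    return {
        "unadjusted": x_coefficient(y, x),
        "adjusted_for_L": x_coefficient(y, x, l),
        "adjusted_for_U": x_coefficient(y, x, u),
    }


if __name__ == "__main__":
    for name, value in proxy_adjustment_demo().items():
        print(f"{name}: {value:.3f}")
```

With these assumed parameters the population coefficient of $X$ is $1/2$ without adjustment, $1/3$ after adjusting for $L$, and $0$ after adjusting for $U$. Adjusting for the surrogate reduces the confounding bias but does not remove it, which is why the mismeasured-confounder and proxy-variable formulations need extra assumptions to identify the effect.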

This can also be formulated as a causal effect identification problem with a proxy variable $L$ for an unmeasured confounder $U$. See this paper and the references therein for detailed discussion.