Solved – How to resolve Simpson’s paradox

simpsons-paradox

Simpson's paradox is a classic puzzle discussed in introductory statistics courses worldwide. However, my course was content to simply note that the problem existed and did not provide a solution. I would like to know how to resolve the paradox. That is, when confronted with a Simpson's paradox, where two different choices seem to compete for being the best choice depending on how the data is partitioned, which choice should one make?

To make the problem concrete, let's consider the first example given in the relevant Wikipedia article. It is based on a real study about a treatment for kidney stones.

|              | Treatment A     | Treatment B     |
|--------------|-----------------|-----------------|
| Small stones | **93%** (81/87) | 87% (234/270)   |
| Large stones | **73%** (192/263) | 69% (55/80)   |
| Both         | 78% (273/350)   | **83%** (289/350) |

Suppose I am a doctor and a test reveals that a patient has kidney stones. Using only the information in the table, I would like to determine whether I should adopt treatment A or treatment B. It seems that if I know the size of the stone, I should prefer treatment A, but if I do not, I should prefer treatment B.

But consider another plausible way to arrive at an answer. If the stone is large, I should choose A; if it is small, I should again choose A. So even if I do not know the size of the stone, reasoning case by case tells me to prefer A. This contradicts the earlier conclusion.
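Concretely, from the table: treatment A wins within each stratum ($93\% > 87\%$ for small stones, $73\% > 69\%$ for large ones), yet it loses in aggregate:

$$P(\text{success} \mid A) = \frac{81 + 192}{350} = 78\%, \qquad P(\text{success} \mid B) = \frac{234 + 55}{350} \approx 83\%,$$

because treatment A was mostly given to the harder, large-stone cases while treatment B was mostly given to the easier, small-stone cases.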

So: A patient walks into my office. A test reveals they have kidney stones but gives me no information about their size. Which treatment do I recommend? Is there any accepted resolution to this problem?

Wikipedia hints at a resolution using "causal Bayesian networks" and a "back-door" test, but I have no clue what these are.

Best Answer

In your question, you state that you don't know what "causal Bayesian networks" and "back door tests" are.

Suppose you have a causal Bayesian network, that is, a directed acyclic graph whose nodes represent propositions and whose directed edges represent potential causal relationships. You may have many such networks, one for each of your hypotheses. There are three ways to make a compelling argument about the strength or existence of an edge $A \stackrel{?}{\rightarrow} B$.

The easiest way is an intervention. This is what the other answers are suggesting when they say that "proper randomization" will fix the problem. You randomly force $A$ to take different values and you measure $B$. If you can do that, you're done, but you can't always do that. In your example, it may be unethical to give people ineffective treatments for deadly diseases, or patients may have some say in their treatment, e.g., they may choose the less harsh option (treatment B) when their kidney stones are small and less painful. The simulation below illustrates the difference randomization makes.
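Here is a minimal simulation sketch. The per-stratum success probabilities match the table in the question, but the 80/20 assignment probabilities are hypothetical, chosen only to mimic the story of doctors steering treatment A toward large stones:

```python
import random

random.seed(0)

# Hypothetical set-up: per-stratum success rates match the table above,
# but the 80/20 assignment probabilities are made up for illustration.
P_LARGE = 0.5
P_A_GIVEN_SIZE = {"large": 0.8, "small": 0.2}   # observational assignment
P_SUCCESS = {("A", "large"): 0.73, ("A", "small"): 0.93,
             ("B", "large"): 0.69, ("B", "small"): 0.87}

def trial(randomize):
    """Simulate one patient; if randomize, we intervene on the treatment."""
    size = "large" if random.random() < P_LARGE else "small"
    if randomize:
        treatment = random.choice(["A", "B"])   # intervention: force the value
    else:
        treatment = "A" if random.random() < P_A_GIVEN_SIZE[size] else "B"
    success = random.random() < P_SUCCESS[(treatment, size)]
    return treatment, success

def success_rates(randomize, n=100_000):
    wins = {"A": 0, "B": 0}
    counts = {"A": 0, "B": 0}
    for _ in range(n):
        t, s = trial(randomize)
        counts[t] += 1
        wins[t] += s
    return {t: round(wins[t] / counts[t], 3) for t in ("A", "B")}

print("observational:", success_rates(randomize=False))  # B appears better
print("randomized:   ", success_rates(randomize=True))   # A is actually better
```

The observational estimate ranks B ahead of A, while the randomized (interventional) estimate correctly ranks A ahead, even though both are computed from the same underlying success probabilities.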

The second way is the front door method. You want to show that $A$ acts on $B$ via $C$, i.e., $A\rightarrow C \rightarrow B$. If you assume that $C$ is potentially caused by $A$ but has no other causes, and you can measure that $C$ is correlated with $A$, and $B$ is correlated with $C$, then you can conclude evidence must be flowing via $C$. The original example: $A$ is smoking, $B$ is cancer, $C$ is tar accumulation. Tar can only come from smoking, and it correlates with both smoking and cancer. Therefore, smoking causes cancer via tar (though there could be other causal paths that mitigate this effect).
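For reference, the corresponding front-door adjustment formula (standard in Pearl's framework, with $C$ as the mediator) recovers the interventional distribution from observational quantities:

$$P(b \mid \mathrm{do}(a)) = \sum_c P(c \mid a) \sum_{a'} P(b \mid c, a')\,P(a').$$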

The third way is the back door method. You want to show that $A$ and $B$ aren't correlated merely because of a "back door", e.g., a common cause $D$: $A \leftarrow D \rightarrow B$. Since you have assumed a causal model, you merely need to block all of the paths (by observing variables and conditioning on them) along which evidence can flow up from $A$ and back down to $B$. It's a bit tricky to block these paths, but Pearl gives a clear algorithm that lets you know which variables you have to observe to block them.
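Once you have a set of observed variables $D$ that blocks every back-door path, the standard back-door adjustment recovers the interventional distribution from observational data:

$$P(b \mid \mathrm{do}(a)) = \sum_d P(b \mid a, d)\,P(d).$$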

gung is right that with good randomization, confounders won't matter. Since we're assuming that intervening at the hypothetical cause (treatment) is not allowed, any common cause of the hypothetical cause (treatment) and the effect (survival), such as age or kidney stone size, will be a confounder. The solution is to take the right measurements to block all of the back doors. For further reading, see:

Pearl, Judea. "Causal diagrams for empirical research." Biometrika 82.4 (1995): 669-688.


To apply this to your problem, let us first draw the causal graph. (Treatment-preceding) kidney stone size $X$ and treatment type $Y$ are both causes of treatment success $Z$. $X$ may also be a cause of $Y$ if doctors are assigning treatment based on kidney stone size. Clearly there are no other causal relationships between $X$, $Y$, and $Z$: $Y$ comes after $X$, so it cannot be a cause of $X$, and similarly $Z$ comes after both $X$ and $Y$.

Since $X$ is a common cause, it should be measured. It is up to the experimenter to determine the universe of variables and potential causal relationships. For every experiment, the experimenter measures the necessary "back door variables" and then calculates the marginal probability distribution of treatment success for each configuration of those variables. For a new patient, you measure the variables and follow the treatment indicated by the marginal distribution; a sketch of this calculation for the kidney-stone table is given below. If you can't measure everything, or you don't have a lot of data but know something about the architecture of the relationships, you can do "belief propagation" (Bayesian inference) on the network.
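Here is a minimal Python sketch of that calculation, assuming the counts from the table in the question and the graph just described ($X \rightarrow Y$, $X \rightarrow Z$, $Y \rightarrow Z$, with $X$ as the only back-door variable):

```python
# Back-door adjustment on the kidney-stone table: condition on stone size X,
# then average over the marginal distribution of X:
#   P(Z = success | do(Y = t)) = sum_x P(Z | Y = t, X = x) * P(X = x)
counts = {
    # (treatment, size): (successes, total)
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}

sizes = ["small", "large"]
n_total = sum(total for _, total in counts.values())          # 700 patients
p_size = {s: sum(counts[(t, s)][1] for t in "AB") / n_total   # P(X = s)
          for s in sizes}

for t in "AB":
    adjusted = sum(counts[(t, s)][0] / counts[(t, s)][1] * p_size[s]
                   for s in sizes)
    print(f"treatment {t}: adjusted success rate = {adjusted:.3f}")
```

With these counts, treatment A comes out at roughly 83% and treatment B at roughly 78%, so the adjusted (interventional) comparison agrees with the per-stratum reasoning: recommend treatment A. The aggregate table favored B only because B was disproportionately given the easy, small-stone cases.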