Solved – Problems in Causality from Judea Pearl Book

I'm starting to read Causal Inference in Statistics: A Primer by Judea Pearl et al. I have a master's in math, but I have never taken a statistics course. I'm a bit confused by one of the early study questions, and there's no one I can ask about it, so I'm hoping that someone on this site will critique my answers for me. (This is not a homework problem. I'm a retiree, just keeping my mind active.) Note that there are no specific data given in the problems.

a) There are two treatments for kidney stones, Treatment A and
Treatment B. Doctors are more likely to prescribe Treatment A on
large (and therefore more severe) stones and more likely to prescribe
Treatment B on small stones. Should a patient who doesn't know the
size of his or her stone examine the general population data, or the
size-specific data when determining which treatment will be more
effective?

b) There are two doctors in a small town. Each has performed 100
surgeries in his career, which are of two types: one very easy surgery
and one very difficult surgery. The first doctor performs the easy
surgery much more often than the difficult surgery and the second
performs the difficult surgery more often than the easy surgery. You
need surgery, but you don't know if your case is easy or difficult.
Should you consult the success rate of each doctor over all cases, or
should you consult the success rates for the easy and difficult cases
separately, to maximize the chance of a successful surgery?

As to part a) it's reasonable to suppose that there are drawbacks to Treatment A as compared to Treatment B, or why isn't it prescribed all the time? So, it seems to me that I can't make an intelligent decision without knowing the size of my kidney stone. I would expect the data to show Treatment A to be more effective on large stones, and at least as effective on small stones, but I wouldn't want to assume the presumed risks of Treatment A if my stone is small. Assuming that small stones can almost always be treated successfully, I would expect Treatment B to show a higher success rate in the general population, but I wouldn't want to adopt Treatment B if I have a large stone.

It seems to me that the data are useless unless I know the size of my stone. Is this the answer to the question, perhaps? The whole thing seems rather pointless, because I can't go into the pharmacy and buy either treatment over the counter. My doctor will prescribe it, and if he can't (or won't) tell me the size of the stone, I will change doctors.

As for part b) it's clear that you want to look at the rates for the procedures separately, but the rates alone aren't enough. Suppose the first doctor has performed the difficult surgery just once, with a successful outcome, and the second doctor has performed it 37 times, with 35 successes. I would be awfully inclined to go with the second doctor, but I'd want to know how 35 out of 37 compares to national norms, and also whether the 2 failures occurred early in his career (while he was still learning) or more recently (after he started drinking heavily).

Is this sort of discussion what is called for by the problems, or is a more cut-and-dried answer expected? If I'm lucky enough to have an instructor read this, how would you grade my answer?

Best Answer

First let me say that if I were grading your answers, I would give you an excellent grade. These are introductory questions in the book, so you do not yet have all the tools to think through the problems, but you are already showing that you know causal information is needed to answer them.

Now, as to the answer: notice that the question asks whether you want to look at the aggregate data or the segregated data. It turns out that in both cases you want the segregated data.

In question A, the size of the stone affects both the choice of treatment and the health outcome, so stone size is a confounder. Hence, you need the segregated data to eliminate this bias and determine which treatment is more effective, either conditionally or unconditionally. To know which treatment is unconditionally better, you need the segregated data to compute the success rate under intervention, weighting each size-specific success rate by the probability of that stone size: $P(Y = 1 \mid do(T)) = \sum_{S} P(Y = 1 \mid T, S)P(S)$, which in general differs from the aggregate rate $P(Y = 1 \mid T)$. Here $Y$ is health status, $T$ is treatment choice, and $S$ is stone size. If you want to know which treatment is conditionally better (that is, for a given stone size), then it's clear you also need the segregated table.
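To make the adjustment formula above concrete, here is a minimal Python sketch. The counts are made up for illustration (the problem itself gives no data), and the function names `naive_rate` and `adjusted_rate` are hypothetical. The numbers are chosen so that Treatment A is prescribed mostly for large stones, as the problem describes.

```python
# Hypothetical counts, invented for illustration only:
# (successes, patients) for each (treatment, stone size) pair.
counts = {
    ("A", "small"): (90, 100),
    ("A", "large"): (280, 400),
    ("B", "small"): (340, 400),
    ("B", "large"): (60, 100),
}
treatments = ("A", "B")
sizes = ("small", "large")
total_patients = sum(n for _, n in counts.values())


def naive_rate(treatment):
    """Aggregate success rate P(Y=1 | T), ignoring stone size."""
    successes = sum(counts[(treatment, s)][0] for s in sizes)
    patients = sum(counts[(treatment, s)][1] for s in sizes)
    return successes / patients


def adjusted_rate(treatment):
    """Adjustment formula: sum over S of P(Y=1 | T, S) * P(S)."""
    rate = 0.0
    for s in sizes:
        # P(S = s) is estimated from the whole population, not per treatment.
        p_size = sum(counts[(t, s)][1] for t in treatments) / total_patients
        successes, patients = counts[(treatment, s)]
        rate += (successes / patients) * p_size
    return rate


if __name__ == "__main__":
    for t in treatments:
        print(t, "aggregate:", naive_rate(t), "adjusted:", adjusted_rate(t))
```

With these invented counts, the aggregate rates favor B (0.80 versus 0.74), while A is better within each stone size and also after adjustment (0.80 versus 0.725). This reversal is why the aggregate table alone is misleading here.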

If it's strange to think of picking a treatment for yourself without knowing the stone size, it may be easier to consider the analogous question of having to pick only one treatment for a whole population (say, for technical or budget reasons you can't offer both). In this case you want to know which treatment has the larger average effect on the population as a whole.

Question B is a similar problem: difficulty is a confounder, so you need the segregated table to know which doctor is better, both conditionally and unconditionally. Your point about sample size is completely valid. In real life you should always consider sampling uncertainty, but notice that it doesn't change the fact that you would still need the segregated data.
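As a rough illustration of the sample-size point, the sketch below compares the two records from the question (1 success in 1 difficult surgery versus 35 in 37) using a Wilson score interval. The choice of the Wilson interval and the function name `wilson_interval` are my own assumptions for illustration; neither the book problem nor the question prescribes a method.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a success probability.

    z=1.96 corresponds to a two-sided 95% level (an assumed default).
    """
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Difficult-surgery records taken from the question's hypothetical.
    print("doctor 1 (1 of 1):  ", wilson_interval(1, 1))
    print("doctor 2 (35 of 37):", wilson_interval(35, 37))
```

The interval for a single success is very wide, while the interval for 35 of 37 is much tighter and sits high, which matches the intuition in the question. Either way, the comparison is made within the difficult-surgery stratum, so the segregated data are still what is needed.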

Regarding your last comment,

if the 2 failures occurred early in his career (while he was still learning) or more recently (after he started drinking heavily).

It actually touches on a deep problem in causal inference, which is the assumption of invariance. Take the case where the doctor started drinking heavily only recently. The data from before and after that event then do not come from the same causal model, so you would need more information and more causal assumptions to make inferences in this case.