Solved – What does P(A|B)*P(A|C) simplify to

conditional probability

Let's say we have a problem of predicting whether a storm is coming or not. We have a model P(storm is coming | how many clouds are outside), and have another model P(storm is coming | how scared the dogs are).

My question in general is how to combine these two models in a sensible way. It seems to make sense if I just multiply the two together – however, I don't know what this expression evaluates to:

P(storm is coming | how many clouds are outside) * P(storm is coming | how scared the dogs are) = ?

Or, with simpler notation:

P(A|B)P(A|C) = ? (where B and C are independent)

I have a feeling the answer is P(A|B)P(A|C) = P(A|B,C) but I can't prove it. (As many pointed out, this is wrong)

Edit: Sorry for the confusion, the question here is "what is P(A|B)P(A|C) = ?"
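One quick way to see why the guess above cannot hold in general is a worked special case. As an assumption made only for illustration, suppose the clues carry no information at all, so that $A$ is independent of $B$, of $C$, and of $B$ and $C$ jointly. Then every conditional probability collapses to $P(A)$:

$$P(A|B) \cdot P(A|C) = P(A)^2, \qquad P(A|B,C) = P(A)$$

These two agree only when $P(A)$ is 0 or 1. With $P(A) = 0.5$, for instance, the product is $0.25$ while $P(A|B,C)$ is $0.5$.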

Best Answer

> Let's say we have a problem of predicting whether a storm is coming or not.

So we'd like to predict whether a storm is coming or not (event $A$), and we have some clues available to us, namely the amount of clouds in the sky (event $B$) and how scared your dogs are (event $C$).

We can visualise the problem at hand using a Venn diagram:

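[Figure: Venn diagram of events "storm" (A), "cloudy sky" (B) and "scared dogs" (C). The central region where all three circles overlap is white; the regions where exactly two circles overlap are yellow (A and B), cyan (A and C) and magenta (B and C).]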

We are interested in calculating the probability of a storm given the clues, $P(A|B,C)$. That quantity isn't represented directly in the diagram; instead, we can get $P(A,B,C)$ (a.k.a $P(A \cap B \cap C)$) from the white central area in the diagram. Fortunately, the relationship between $P(A|B,C)$ and $P(A,B,C)$ is simple:

$$P(A,B,C) = P(A|B,C) \cdot P(B,C)$$

where $P(B,C)$ corresponds to the magenta and white areas of the diagram taken together.

> We have a model P(storm is coming | how many clouds are outside), and have another model P(storm is coming | how scared the dogs are)

So we know $P(A|B)$ and $P(A|C)$. Like before, these two quantities are not represented directly in the diagram. Instead, we have $P(A,B)$, which corresponds to the yellow and white areas, and $P(A,C)$, which corresponds to the cyan and white areas. As before, we know the relationship between $P(A,B)$ and $P(A|B)$:

$$P(A,B) = P(A|B) \cdot P(B)$$

Same goes for $P(A,C)$ and $P(A|C)$.

To recap, we would like to know $P(A|B,C)$, which is related to the white area in the Venn diagram. So what happens if we add $P(A)$, $P(B)$ and $P(C)$? We are counting the magenta, yellow and cyan areas twice each, and the white central area three times. So we subtract each pairwise overlap, $P(A,B)$, $P(A,C)$ and $P(B,C)$, once:

$$P(A) + P(B) + P(C) - P(A,B) - P(A,C) - P(B,C)$$

Except now we removed the white area from the summation; we added the white area three times when we summed up $A$, $B$, and $C$, but we removed it three times when we subtracted $(A,B)$, $(A,C)$ and $(B,C)$. So we add it back:

$$P(A) + P(B) + P(C) - P(A,B) - P(A,C) - P(B,C) + P(A,B,C)$$

We didn't account for the area outside all the circles, which corresponds to $P(\tilde{} A, \tilde{} B, \tilde{} C)$, which is the chance that there is no storm AND there are no clouds AND the dogs aren't scared.

$$P(A) + P(B) + P(C) - P(A,B) - P(A,C) - P(B,C) + P(A,B,C) + P(\tilde{} A, \tilde{} B, \tilde{} C) = 1$$
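The identity above can be checked numerically. The following is a minimal sketch, assuming a made-up joint distribution over the three events; the eight weights and the helper `prob` are hypothetical and chosen only so that the probabilities sum to 1.

```python
from itertools import product

# Hypothetical joint distribution over (A, B, C).
# Outcomes are ordered (F,F,F), (F,F,T), (F,T,F), ..., (T,T,T).
weights = [0.30, 0.10, 0.05, 0.05, 0.10, 0.10, 0.10, 0.20]
joint = dict(zip(product([False, True], repeat=3), weights))


def prob(event):
    """Sum the probability of every outcome (a, b, c) for which event is true."""
    return sum(p for (a, b, c), p in joint.items() if event(a, b, c))


p_a = prob(lambda a, b, c: a)
p_b = prob(lambda a, b, c: b)
p_c = prob(lambda a, b, c: c)
p_ab = prob(lambda a, b, c: a and b)
p_ac = prob(lambda a, b, c: a and c)
p_bc = prob(lambda a, b, c: b and c)
p_abc = prob(lambda a, b, c: a and b and c)
p_none = prob(lambda a, b, c: not (a or b or c))

# Inclusion-exclusion for the union, plus the area outside all circles.
total = p_a + p_b + p_c - p_ab - p_ac - p_bc + p_abc + p_none
print(total)  # equals 1 up to floating-point rounding
```

Any other set of eight non-negative weights summing to 1 should give the same total, since the identity does not depend on the particular distribution.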

Let's assume that it is very unlikely to have no storm, a spotless sky and calm dogs all at once, so that $P(\tilde{} A, \tilde{} B, \tilde{} C) \approx 0$. In that case,

$$P(A) + P(B) + P(C) - P(A,B) - P(A,C) - P(B,C) + P(A,B,C) = 1$$

Let's apply the transformations we saw before:

$$\begin{aligned} P(A|B,C) \cdot P(B,C) &= 1 - [P(A) + P(B) + P(C) - P(A|B) \cdot P(B) - P(A|C) \cdot P(C) - P(B,C)]\\ P(A|B,C) &= \dfrac{1 - P(A) - P(B) - P(C) + P(A|B) \cdot P(B) + P(A|C) \cdot P(C)}{P(B,C)} + 1 \end{aligned}$$
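As a sanity check on this result, here is a small sketch that compares the formula with $P(A|B,C)$ computed directly from a joint distribution. The distribution is hypothetical and is built so that the "no storm, no clouds, calm dogs" outcome has probability exactly 0, which is the assumption the derivation relies on. It also prints the naive product $P(A|B) \cdot P(A|C)$ for comparison.

```python
from itertools import product

# Hypothetical joint distribution over (A, B, C); the first weight, for
# (not A, not B, not C), is 0 to match the assumption in the derivation.
weights = [0.00, 0.15, 0.10, 0.05, 0.10, 0.15, 0.15, 0.30]
joint = dict(zip(product([False, True], repeat=3), weights))


def prob(event):
    """Sum the probability of every outcome (a, b, c) for which event is true."""
    return sum(p for (a, b, c), p in joint.items() if event(a, b, c))


p_a = prob(lambda a, b, c: a)
p_b = prob(lambda a, b, c: b)
p_c = prob(lambda a, b, c: c)
p_bc = prob(lambda a, b, c: b and c)
p_a_given_b = prob(lambda a, b, c: a and b) / p_b
p_a_given_c = prob(lambda a, b, c: a and c) / p_c

# P(A|B,C) straight from the definition of conditional probability.
direct = prob(lambda a, b, c: a and b and c) / p_bc

# P(A|B,C) from the derived formula.
formula = (1 - p_a - p_b - p_c + p_a_given_b * p_b + p_a_given_c * p_c) / p_bc + 1

# The naive product from the question, for comparison.
naive = p_a_given_b * p_a_given_c

print(direct, formula, naive)
```

With these made-up numbers the first two values agree, while the product of the two individual models is noticeably smaller. If the first weight is made non-zero (and the others rescaled), the formula no longer matches the direct computation, which shows how much the result depends on that assumption.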

As you can see, you would need more information if you want to calculate the probability of a storm given your clues. Namely:

  1. The probability of a storm in general;
  2. The probability of a cloudy sky in general;
  3. The probability of your dogs being scared in general; and
  4. The probability that your dogs will be scared AND the sky will be cloudy.

If you think about it, numbers 1-3 make sense:

  1. The clues may increase the probability of a storm, but if there aren't many storms to begin with, then the probability of a storm given your clues will still be small (albeit larger than the baseline probability of a storm);
  2. If you live in a typically cloudy area, the amount of clouds in the sky will probably be a poor predictor of a storm (because it's always cloudy, storm or no storm);
  3. Ditto for your dogs being scared.

Number 4 is a bit trickier. If either your dogs or the sky (or both) are perfect predictors of a storm, then there is no need for the other.

Now all this math assumes that your model outputs $P(\mathrm{storm} | \mathrm{clouds})$ ($P(A | B)$) and $P(\mathrm{storm} | \mathrm{scared\ dogs})$ ($P(A|C)$). However, it is typically easier to observe $P(\mathrm{clouds} | \mathrm{storm})$ ($P(B | A)$) and $P(\mathrm{scared\ dogs} | \mathrm{storm})$ ($P(C|A)$). In that case, we must note that

$$P(A,B) = P(A|B) \cdot P(B) = P(B|A) \cdot P(A)$$

so our previous model becomes

$$P(A|B,C) = \dfrac{1 - P(A) - P(B) - P(C) + P(B|A) \cdot P(A) + P(C|A) \cdot P(A)}{P(B,C)} + 1$$
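The same formula can be wrapped in a small helper that takes the likelihoods as inputs. This is only a sketch: the function name `storm_given_clues` and the example numbers are hypothetical, and the result is meaningful only under the earlier assumption that $P(\tilde{} A, \tilde{} B, \tilde{} C) \approx 0$.

```python
def storm_given_clues(p_a, p_b, p_c, p_b_given_a, p_c_given_a, p_bc):
    """Evaluate P(A|B,C) using the formula expressed with P(B|A) and P(C|A).

    Assumes P(not A, not B, not C) is approximately 0, as in the derivation.
    """
    numerator = 1 - p_a - p_b - p_c + p_b_given_a * p_a + p_c_given_a * p_a
    return numerator / p_bc + 1


# Hypothetical inputs, mutually consistent with a joint distribution in
# which P(A,B) = P(A,C) = 0.45, P(A,B,C) = 0.30 and P(B,C) = 0.35.
result = storm_given_clues(
    p_a=0.70,
    p_b=0.60,
    p_c=0.65,
    p_b_given_a=0.45 / 0.70,
    p_c_given_a=0.45 / 0.70,
    p_bc=0.35,
)
print(result)
```

Note that the inputs cannot be picked freely: they have to come from one consistent joint distribution, otherwise the function can return values outside the range from 0 to 1.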
