Adding Background Evidence Variable to Bayes’ Theorem

Tags: bayes-theorem, conditional-probability, geometry, probability, statistics

A geometric interpretation of the Bayes' Theorem formula

$$\Pr(H \mid E) = \frac{\Pr(H)\ \Pr(E\mid H)}{\Pr(H)\ \Pr(E\mid H)\ +\ \Pr(H^c)\ \Pr(E\mid H^c)}$$

takes place in a unit square. First, draw a vertical line through the square so that your assessment of a hypothesis's prior probability, $\Pr(H)$, is the distance left of the line, and $\Pr(H^c)$ is the distance right of the line, as shown below in the left figure. Then draw horizontal lines through the left rectangle at height $\Pr(E\mid H)$, and through the right rectangle at height $\Pr(E\mid H^c)$, as in the center figure. If the left line is higher, then the new evidence increased your epistemic probability of $H$, and vice versa. The areas of the new rectangles correspond to the above Bayes' Theorem formula, according to the pattern displayed in the right figure. The numeric output is your new epistemic probability of $H$.

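[Figure: three unit squares. Left: vertical line splitting the width into $\Pr(H)$ and $\Pr(H^c)$. Center: horizontal lines at heights $\Pr(E\mid H)$ and $\Pr(E\mid H^c)$. Right: the rectangle areas that enter the Bayes' Theorem formula.]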
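The area picture can be checked numerically. The following is a minimal sketch, not part of the original question: the function name `posterior` and the sample numbers (a prior of 0.3 and likelihoods 0.8 and 0.2) are assumptions chosen only for illustration.

```python
def posterior(prior_h, like_e_given_h, like_e_given_not_h):
    """Return Pr(H | E) as a ratio of rectangle areas in the unit square."""
    # Left rectangle: width Pr(H), height Pr(E | H).
    area_left = prior_h * like_e_given_h
    # Right rectangle: width Pr(H^c) = 1 - Pr(H), height Pr(E | H^c).
    area_right = (1.0 - prior_h) * like_e_given_not_h
    total = area_left + area_right
    if total == 0:
        raise ValueError("Pr(E) is zero, so Pr(H | E) is undefined")
    return area_left / total


# Hypothetical numbers: the left line (0.8) is higher than the right line (0.2),
# so the posterior should exceed the prior of 0.3.
p = posterior(0.3, 0.8, 0.2)
print(p)
```

With these assumed inputs the left area is 0.24 and the right area is 0.14, so the result is 0.24 / 0.38, which is larger than the prior, as the "left line is higher" rule predicts.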

This output can then be treated as a new prior probability to be assessed against a new piece of evidence. If the original was your prior probability at time $t_0$, now your prior probability at $t_1$ is visualized by drawing a vertical line in a fresh unit square, as far to the right as necessary to match the output of the previous Bayes' Theorem iteration. This process can continue indefinitely, assigning a new time step for each incoming piece of evidence.
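The time-step iteration described above can be sketched as a loop. This is only an illustration: `update`, `update_sequence`, and the likelihood pairs are hypothetical names and numbers. It also assumes that each pair of likelihoods is already conditioned on the evidence seen so far (or that the pieces of evidence are conditionally independent given $H$ and given $H^c$), since that is what makes reusing the posterior as the next prior legitimate.

```python
def update(prior_h, like_h, like_not_h):
    """One Bayes step: prior Pr(H) and the two likelihoods give the posterior."""
    numerator = prior_h * like_h
    return numerator / (numerator + (1.0 - prior_h) * like_not_h)


def update_sequence(prior_h, likelihood_pairs):
    """Feed each posterior back in as the prior for the next time step.

    likelihood_pairs holds (Pr(E_k | H, earlier evidence),
    Pr(E_k | H^c, earlier evidence)) for each step k; this conditioning
    is an assumption of the sketch.
    """
    history = [prior_h]  # history[0] is the prior at t_0
    for like_h, like_not_h in likelihood_pairs:
        history.append(update(history[-1], like_h, like_not_h))
    return history


# Hypothetical evidence stream: two pieces of evidence, both favouring H.
history = update_sequence(0.3, [(0.8, 0.2), (0.6, 0.3)])
print(history)
```

Each entry of `history` is the position of the vertical line in the fresh unit square for that time step.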

An alternative Bayes' Theorem formula

$$\Pr(H\mid (B\ \cap\ E)) = \frac{\Pr(H\mid B)\ \Pr(E\mid (B\ \cap\ H))}{\Pr(H\mid B)\ \Pr(E\mid (B\ \cap\ H)) + \Pr(H^c\mid B)\ \Pr(E\mid (B\ \cap\ H^c))}$$

includes the variable $B$ to represent background evidence. This answer explains how to algebraically derive the alternative formula from the initial formula. What I don't understand is $B$'s purpose. It seems to me that the initial Bayes' Theorem formula already describes the iterable progression of the epistemic probability of $H$ from one time step to the next. The notion of background evidence must have already been accounted for implicitly in $\Pr(H)$.

How, if at all, does switching to the Bayes' Theorem formula with $B$ change the conceptual and/or geometric evolution described above? Is the inclusion of $B$ a mathematically irrelevant substitution, similar to how differential equations are sometimes written with constants such as $\frac{k}{m}$, even when they could be simplified by lumping, or should the presence of $B$ add something to the way the above mathematics is understood?

Best Answer

The difference is that $B$ and $E$ cannot be “lumped” together, because they do not appear together in every term of the second equation. Notice the occurrence of $$P(H|\color{green}{(B\cap E)})$$ on the LHS, and $$P(E|\color{green}{(B\cap H)})$$ in the numerator of the RHS.

The usefulness of Bayes’ Theorem is that it allows you to calculate $P(X|Y)$ in terms of $P(Y|X)$, i.e. the probability of $X$ given $Y$ in terms of the probability of $Y$ given $X$. Basically, it allows you to “switch” which event is taken as a given. However, in the second equation, the “background information” $B$ is always taken as a given, whereas $H$ and $E$ both may or may not be taken as a given.
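To see $B$ staying fixed as the given while $H$ and $E$ swap roles, one can check the second equation on a small joint distribution. This is a sketch under stated assumptions: the weights below are made up, and the helper names (`pr`, `cond`, `both`) are hypothetical.

```python
# Hypothetical joint distribution over (H, B, E); weights are illustrative only.
weights = {
    (True, True, True): 6, (True, True, False): 2,
    (True, False, True): 1, (True, False, False): 3,
    (False, True, True): 2, (False, True, False): 6,
    (False, False, True): 4, (False, False, False): 8,
}
total = sum(weights.values())
joint = {outcome: w / total for outcome, w in weights.items()}


def pr(event):
    """Probability of an event given as a predicate on (h, b, e)."""
    return sum(p for outcome, p in joint.items() if event(*outcome))


def both(x, y):
    """Intersection of two events."""
    return lambda h, b, e: x(h, b, e) and y(h, b, e)


def cond(event, given):
    """Conditional probability Pr(event | given)."""
    return pr(both(event, given)) / pr(given)


H = lambda h, b, e: h
not_H = lambda h, b, e: not h
B = lambda h, b, e: b
E = lambda h, b, e: e

# Left-hand side, computed directly from the joint table.
direct = cond(H, both(B, E))

# Right-hand side: every term is conditioned on B.
num = cond(H, B) * cond(E, both(B, H))
den = num + cond(not_H, B) * cond(E, both(B, not_H))
via_formula = num / den

print(direct, via_formula, cond(H, E))
```

In this made-up table both routes give 0.75, while $P(H|E)$ without $B$ is 7/13, so dropping $B$ changes the answer. One way to read this: the second equation is the first equation applied inside the world where $B$ has already happened, that is, with every probability replaced by its version conditioned on $B$.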

Does that make sense?
