[Math] Is Bayes’ Theorem really that interesting

bayes-theoremmotivationprobabilitystatistics

I have trouble understanding the massive importance that is afforded to Bayes' theorem in undergraduate courses in probability and popular science.

From the purely mathematical point of view, I think it would be uncontroversial to say that Bayes' theorem does not amount to a particularly sophisticated result. Indeed, the relation
$$P(A|B)=\frac{P(A\cap B)}{P(B)}=\frac{P(B\cap A)P(A)}{P(B)P(A)}=\frac{P(B|A)P(A)}{P(B)}$$
is a one line proof that follows from expanding both sides directly from the definition of conditional probability. Thus, I expect that what people find interesting about Bayes' theorem has to do with its practical applications or implications. However, even in those cases I find the typical examples being used as a justification of this to be a bit artificial.


To illustrate this, the classical application of Bayes' theorem usually goes something like this: Suppose that

  1. 1% of women have breast cancer;
  2. 80% of mammograms are positive when breast cancer is present; and
  3. 10% of mammograms are positive when breast cancer is not present.

If a woman has a positive mammogram, then what is the probability that she has breast cancer?

I understand that Bayes' theorem allows to compute the desired probability with the given information, and that this probability is counterintuitively low. However, I can't help but feel that the premise of this question is wholly artificial. The only reason why we need to use Bayes' theorem here is that the full information with which the other probabilities (i.e., 1% have cancer, 80% true positive, etc.) have been computed is not provided to us. If we have access to the sample data with which these probabilities were computed, then we can directly find
$$P(\text{cancer}|\text{positive test})=\frac{\text{number of women with cancer and positive test}}{\text{number of women with positive test}}.$$
In mathematical terms, if you know how to compute $P(B|A)$, $P(A)$, and $P(B)$, then this means that you know how to compute $P(A\cap B)$ and $P(B)$, in which case you already have your answer.


From the above arguments, it seems to me that Bayes' theorem is essentially only useful for the following reasons:

  1. In an adversarial context, i.e., someone who has access to the data only tells you about $P(B|A)$ when $P(A|B)$ is actually the quantity that is relevant to your interests, hoping that you will get confused and will not notice.
  2. An opportunity to dispel the confusion between $P(A|B)$ and $P(B|A)$ with concrete examples, and to explain that these are very different when the ratio between $P(A)$ and $P(B)$ deviates significantly from one.

Am I missing something big about the usefulness of Bayes' theorem? In light of point 2., especially, I don't understand why Bayes' theorem stands out so much compared to, say, the Borel-Kolmogorov paradox, or the "paradox" that $P[X=x]=0$ when $X$ is a continuous random variable, etc.

Best Answer

You are mistaken in thinking that what you perceive as "the massive importance that is afforded to Bayes' theorem in undergraduate courses in probability and popular science" is really "the massive importance that is afforded to Bayes' theorem in undergraduate courses in probability and popular science." But it's probably not your fault: This usually doesn't get explained very well.

What is the probability of a Caucasian American having brown eyes? What does that question mean? By one interpretation, commonly called the frequentist interpretation of probability, it asks merely for the proportion persons having brown eyes among Caucasian Americans.

What is the probability that there was life on Mars two billion years ago? What does that question mean? It has no answer according to the frequentist interpretation. "The probability of life on Mars two billion years ago is $0.54$" is taken to be meaningless because one cannot say it happened in $54\%$ of all instances. But the Bayesian, as opposed to frequentist, interpretation of probability works with this sort of thing.

The Bayesian interpretation applied to statistical inference is immune to various pathologies afflicting that field.

Possibly you have seen that some people attach massive importance to the Bayesian interpretation of probability and mistakenly thought it was merely massive importance attached to Bayes's theorem. People who do consider Bayesianism important seldom explain this very clearly, primarily because that sort of exposition is not what they care about.

Related Question