Probability that a number passing the Fermat test is prime

conditional probabilityelementary-number-theoryprimality-testprobability

I'm studying a computer science textbook that has a section on the Fermat test as an example of a probabilistic method.

Given a number $n$, the Fermat test is stated as

  1. pick a random number $a < n$.
  2. If $a^n \equiv a\pmod n$, chances are good that $n$ is prime.
  3. Else, $n$ is certainly not prime.

Excepting the Carmichael numbers, the book goes on to say:

one can prove that, for any $n$, the condition does not hold for most of the integers $a < n$ unless $n$ is prime.

Thus, if $n$ passes the test for some random choice of $a$, the chances are better than even that $n$ is prime. If $n$ passes the test for two random choices of a, the chances are better than 3 out of 4 that $n$ is prime.

By running the test with more and more randomly chosen values of $a$ we can make the probability of error as small as we like.

While I understand that repeating the test increases the probability of $n$ being prime, I do not understand how they arrived at those numbers : better than even – testing once, better than 3 out of 4 – testing twice.


I can see that, for a random choice of $a$, the first statement means $P(\text{passing the test}) \lt 0.5$ when $n$ is composite and equal to $1.0$ otherwise.

How do I calculate the probability that $n$ is prime, given that the test passes $x$ times?

Best Answer

Could the confusion be coming from the textbook's informal use of the word "chance"?

Define these events:

  • $A: n$ is prime

  • $A^c: n$ is composite (i.e. complement of $A$)

  • $T_k:$ passes $k$ tests

Then these are true statements:

  • $P(T_k \mid A) = 1$

  • $P(T_k \mid A^c) \le 2^{-k}$

But using the normal rules of probability, you cannot conclude $P(A \mid T_k) =$ anything at all, much less the textbook's informal claim that $P(A \mid T_k) \ge 1 - 2^{-k}$.

This is a classic case of confusing probability and likelihood. To the textbook authors' credit, they did not use the word "probability" (nor "likelihood") but instead use the word "chance".

Anyway, under the normal rules of probability, you can say nothing(?) about $P(A \mid T)$ unless you have a prior (unconditional) probability $P(A)$. If you had a prior, you can use Bayes Rule:

$$P(A\mid T) = {P(A \cap T) \over P(T)} = {P(A \cap T) \over P(A \cap T) + P(A^c \cap T)} \approx {P(A) \over P(A) + P(A^c)2^{-k}}$$

We can see that, if the prior were $P(A) = P(A^c) = 1/2$ (akin to: "I have no idea if it will rain next Christmas so lets say it's 50-50"), then:

$$P(A \mid T) \approx {\frac12 \over \frac12 + \frac12 \times 2^{-k}} = {1 \over 1 + 2^{-k}} \approx 1 - 2^{-k}$$

However for any other prior, the above would not be true. In particular, if $P(A) = 0$ or $1$, then $P(A \mid T) = 0$ or $1$ (which should be obvious anyway).

This also means something more practical: if you were to pick a random $1000$-digit integer (i.e. $P(A)$ is tiny), and then do a single test, and lets say your test is so poor that $P(T \mid A^c) = 1/2$ in this range of integers, then even if you pass the test, you absolutely should not conclude there is $1/2$ chance you picked a prime... because it is much much more probable that you picked a composite and just got a deceptive test result. (Very similar to when a person gets a Positive disease test result when it is known the disease affects a very tiny fraction of the population... the test result is probably a False Positive.)

So the question is: what is $P(A)$? There is no good answer to that, because there is no (uniform) distribution over all integers. (And if you use density, then the density of primes is $0$.) That's why (IMHO) the textbook authors use the word "chance" and probably (pun intended) meant "likelihood".

Related Question