FPR vs FDR – False Positive Rate vs False Discovery Rate

Tags: confusion-matrix, false-positive-rate, false-discovery-rate, hypothesis-testing, type-i-and-ii-errors

The following quote comes from the well-known research paper Statistical significance for genomewide studies by Storey & Tibshirani (2003):

For example, a false positive rate of 5% means that on average 5% of the truly null features in the study will be called significant. A FDR (false discovery rate) of 5% means that among all features called significant, 5% of these are truly null on average.

Can somebody explain what that means using a simple numerical or visual example? I am having a hard time understanding it. I've found various posts on the FDR or the FPR alone, but none where a specific comparison is made.

It would be especially good if someone expert in this area could illustrate situations where one is better than the other, or both are good or bad.

Best Answer

I'm going to explain these in a few different ways, because that is what helped me understand them.

Let's take a specific example. You are doing a test for a disease on a group of people. Now let's define some terms. For each of the following, I am referring to an individual who has been tested:

True positive (TP): Has the disease, identified as having the disease

False positive (FP): Does not have the disease, identified as having the disease

True negative (TN): Does not have the disease, identified as not having the disease

False negative (FN): Has the disease, identified as not having the disease

Visually, this is typically shown using the confusion matrix:

|                          | Identified as having the disease | Identified as not having the disease |
|--------------------------|----------------------------------|--------------------------------------|
| Has the disease          | TP                               | FN                                   |
| Does not have the disease| FP                               | TN                                   |

The false positive rate (FPR) is the number of people who do not have the disease but are identified as having the disease (all FPs), divided by the total number of people who do not have the disease (includes all FPs and TNs).

$$ FPR = \frac{FP}{FP + TN} $$

The false discovery rate (FDR) is the number of people who do not have the disease but are identified as having the disease (all FPs), divided by the total number of people who are identified as having the disease (includes all FPs and TPs).

$$ FDR = \frac{FP}{FP + TP} $$


So the difference is in the denominator, i.e. what you are comparing the number of false positives to.

The FPR is telling you the proportion of all the people who do not have the disease who will be identified as having the disease.

The FDR is telling you the proportion of all the people identified as having the disease who do not have the disease.

Both are therefore useful, distinct measures of failure. Depending on the situation and the proportions of TPs, FPs, TNs and FNs, you may care more about one than the other.
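
To make the difference in denominators concrete, here is a minimal Python sketch of the two formulas (the function names are just illustrative, not from any particular library):

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FP divided by everyone who does NOT have the disease (FP + TN)."""
    return fp / (fp + tn)

def false_discovery_rate(fp: int, tp: int) -> float:
    """FP divided by everyone IDENTIFIED as having the disease (FP + TP)."""
    return fp / (fp + tp)
```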


Let's now put some numbers to this. You have measured 100 people for the disease and you get the following:

True positives (TPs): 12

False positives (FPs): 4

True negatives (TNs): 76

False negatives (FNs): 8

To show this using the confusion matrix:

|                          | Identified as having the disease | Identified as not having the disease |
|--------------------------|----------------------------------|--------------------------------------|
| Has the disease          | TP = 12                          | FN = 8                               |
| Does not have the disease| FP = 4                           | TN = 76                              |

Then,

$$ FPR = \frac{FP}{FP + TN} = \frac{4}{4 + 76} = \frac{4}{80} = 0.05 = 5\% $$

$$ FDR = \frac{FP}{FP + TP} = \frac{4}{4 + 12} = \frac{4}{16} = 0.25 = 25\% $$

In other words,

The FPR tells you that 5% of the people who did not have the disease were identified as having the disease. The FDR tells you that 25% of the people who were identified as having the disease actually did not have it.
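
If you want to check the arithmetic, the same counts in Python:

```python
tp, fp, tn, fn = 12, 4, 76, 8   # counts from the example above

fpr = fp / (fp + tn)            # 4 / 80 = 0.05
fdr = fp / (fp + tp)            # 4 / 16 = 0.25

print(f"FPR = {fpr:.0%}")       # FPR = 5%
print(f"FDR = {fdr:.0%}")       # FDR = 25%
```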


EDIT based on @amoeba's comment (also the numbers in the example above):

Why is the distinction so important? In the paper you link to, Storey & Tibshirani point out that genomewide studies had been focusing strongly on the FPR (or type I error rate), and that this was leading people to make flawed inferences. This is because once you have found $n$ significant results by fixing the FPR, you really need to consider how many of those significant results are incorrect. In the example above, 25% of the 'significant results' would have been wrong!
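
To see how this plays out when you run many tests at once, here is a small simulation sketch. All numbers (feature counts, effect size, sample sizes) are made up purely for illustration: most features are truly null, each is tested at the 5% level, and the FDR among the 'significant' calls ends up far above 5%.

```python
# Simulation sketch: fixing the FPR at 5% does not keep the FDR at 5%
# when most features are truly null, as in a genomewide study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_null, n_real = 950, 50      # assume 95% of features are truly null
effect = 1.0                  # assumed effect size for the real features
n_per_group = 20              # observations per group for each feature

def two_sample_pvalue(shift):
    """p-value of a two-sample t-test where group B is shifted by `shift`."""
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(shift, 1.0, n_per_group)
    return stats.ttest_ind(a, b).pvalue

p_null = np.array([two_sample_pvalue(0.0) for _ in range(n_null)])
p_real = np.array([two_sample_pvalue(effect) for _ in range(n_real)])

alpha = 0.05                  # fix the FPR (type I error rate) at 5%
fp = (p_null < alpha).sum()   # truly null features called significant
tp = (p_real < alpha).sum()   # truly non-null features called significant

print(f"FPR among truly null features:   {fp / n_null:.1%}")    # ~5% by construction
print(f"FDR among 'significant' results: {fp / (fp + tp):.1%}")  # typically far above 5%
```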

[Side note: Wikipedia points out that though the FPR is mathematically equivalent to the type I error rate, it is considered conceptually distinct because one is typically set a priori while the other is typically used to measure the performance of a test afterwards. This is important but I will not discuss that here].


And for a bit more completeness:

Obviously, FPR and FDR are not the only relevant metrics you can calculate with the four quantities in the confusion matrix. Of the many possible metrics that may be useful in different contexts, two relatively common ones that you are likely to encounter are:

True Positive Rate (TPR), also known as sensitivity, is the proportion of people who have the disease who are identified as having the disease.

$$ TPR = \frac{TP}{TP + FN} $$

True Negative Rate (TNR), also known as specificity, is the proportion of people who do not have the disease who are identified as not having the disease.

$$ TNR = \frac{TN}{TN + FP} $$
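
Applied to the same example counts as above, these work out as follows (again just a quick sketch):

```python
tp, fp, tn, fn = 12, 4, 76, 8            # counts from the example above

tpr = tp / (tp + fn)                     # sensitivity: 12 / 20 = 0.60
tnr = tn / (tn + fp)                     # specificity: 76 / 80 = 0.95

print(f"TPR (sensitivity) = {tpr:.0%}")  # 60%
print(f"TNR (specificity) = {tnr:.0%}")  # 95%
```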
