Benjamini-Hochberg Correction – P-Values or Q-Values?

hypothesis testing

Given a list of p-values generated from independent tests and sorted in ascending order, one can use the Benjamini-Hochberg procedure for multiple testing correction. For each p-value, the procedure lets you calculate a False Discovery Rate (FDR). That is, at each "position" in the sorted list of p-values, it tells you what proportion of the rejections up to that point are likely to be false rejections of the null hypothesis.

My question is: should these FDR values be referred to as "q-values", as "corrected p-values", or as something else entirely?

EDIT 2010-07-12: I would like to describe the correction procedure we are using more fully. First, we sort the test results in ascending order of their uncorrected p-values. Then we iterate over the list, calculating what I have been interpreting as "the FDR expected if we were to reject the null hypothesis for this and all prior tests in the list", using the B-H correction with an alpha equal to the observed, uncorrected p-value at the respective iteration. To preserve monotonicity, we then take, as what we have been calling our "q-value", the maximum of the corrected value at the previous iteration (the FDR at iteration i − 1) and the corrected value at the current iteration (at i).

Below is some Python code which represents this procedure:

def calc_benjamini_hochberg_corrections(p_values, num_total_tests):
    """
    Calculates the Benjamini-Hochberg correction for multiple hypothesis
    testing from a list of p-values *sorted in ascending order*.

    See
    http://en.wikipedia.org/wiki/False_discovery_rate#Independent_tests
    for more detail on the theory behind the correction.

    **NOTE:** This is a generator function. It yields one corrected
    value per input p-value, in order.

    :Parameters:
    - `p_values`: a list or iterable of p-values sorted in ascending
      order
    - `num_total_tests`: the total number of tests (p-values)

    """
    prev_bh_value = 0
    for i, p_value in enumerate(p_values):
        bh_value = p_value * num_total_tests / (i + 1)
        # This correction can give values greater than 1, so we cap
        # the value at 1.
        bh_value = min(bh_value, 1)

        # To preserve monotonicity in the values, we take the
        # maximum of the previous value or this one, so that we
        # don't yield a value less than the previous.
        bh_value = max(bh_value, prev_bh_value)
        prev_bh_value = bh_value
        yield bh_value
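
For illustration, here is one way to consume the generator; the p-values below are made up for the example:

p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
corrections = calc_benjamini_hochberg_corrections(p_values, len(p_values))
for p_value, bh_value in zip(p_values, corrections):
    print("p = %.3f -> corrected = %.3f" % (p_value, bh_value))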

Best Answer

As Robin said, you've got the Benjamini-Hochberg method backwards. With that method, you set a value for Q (upper-case Q: the maximum desired FDR), and it then sorts your comparisons into two piles, "discoveries" and "non-discoveries". The goal is that no more than Q% of the comparisons in the discovery pile are false, and thus at least (100 − Q)% are true.
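
To make that concrete, here is a minimal sketch of the step-up procedure; the function name and signature are mine, not from any library, and it assumes independent tests and p-values sorted in ascending order:

def benjamini_hochberg_step_up(p_values, max_fdr):
    """
    Classic B-H step-up: given p-values sorted in ascending order
    and a maximum desired FDR (Q), return the number of comparisons
    declared discoveries.
    """
    num_total_tests = len(p_values)
    num_discoveries = 0
    # Find the largest rank k with p_(k) <= (k / m) * Q; every
    # comparison up to and including rank k is a discovery.
    for rank, p_value in enumerate(p_values, start=1):
        if p_value <= rank / num_total_tests * max_fdr:
            num_discoveries = rank
    return num_discoveries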

If you compute, for each comparison, the value of Q at which that comparison would just barely be considered a discovery, then those new values are q-values (lower-case q; see the link to a paper by John Storey in the original question).
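
In code, and under the same assumptions as above, computing those per-comparison q-values amounts to a running minimum taken from the largest p-value downward, the mirror image of the running maximum in the question's code. This sketch (the function name is mine) is one way to write it:

def bh_q_values(p_values):
    """
    For each p-value in an ascending-sorted list, compute the
    smallest Q at which that comparison would just barely be
    declared a discovery.
    """
    num_total_tests = len(p_values)
    q_values = [min(1.0, num_total_tests * p_value / rank)
                for rank, p_value in enumerate(p_values, start=1)]
    # Walk backwards taking a running minimum, so a smaller p-value
    # never receives a larger q-value than a larger p-value does.
    for i in range(num_total_tests - 2, -1, -1):
        q_values[i] = min(q_values[i], q_values[i + 1])
    return q_values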
