The correct terminology for one/two-tailed p-values and how to apply the Holm-Bonferroni correction

bonferroni, multiple-comparisons, p-value, terminology

I have been struggling to reconcile my (very basic) understanding of P-values with the approach of one of my colleagues and it appears to come down to interpretation of p-value terminology. This is having a knock-on effect on the correct implementation of a Holm-Bonferroni correction – any advice would be welcome!

The problem is this:

We have run a statistical test to look at whether the observed number of observations is higher or lower than expected, so it is a two-tailed test (<2.5% or >97.5%). The number of observations was converted to a z-score (<-1.96 or >1.96) and then into a probability (P). For our reporting we had two options:

  1. Report the P-value as rejected if P<2.5% or P>97.5%
  2. Convert the P-value into a one-tailed value and reject if P<5%

The issue arises because my colleague believes the following:

Option 2 is the more widely used definition. When we see "two-tailed p-value" referred to, this is usually what is meant. And when we apply Holm-Bonferroni, it is expecting rejection to correspond to p-value below 5%, so we need this form of the p-value.

and

What we report as a p-value under option 1 is what would universally be reported as the p-value for a one-tailed test, so can be referred to as a "one-tailed" p-value. The p-value under option 2 is what would generally be referred to as a "two-tailed" p-value, i.e. a p-value which has been converted to reflect the fact that the test is being run as a two-tailed test.

This is something I am struggling with. Is this actually true? It seems completely counter-intuitive to me, and I can find nothing on the internet to suggest that this is the case either.

This interpretation is critical because we go on to perform a Holm-Bonferroni correction (there are > 50 tests in total), which requires (what I understand to be) one-sided style P-values (P<5%). I therefore believe that our 'two-sided' p-value from option 1 should be divided by 2 (0.5*P or 0.5*(1-P)) to convert it appropriately before running the test. However, my colleague believes it should be multiplied by 2 (2*P or 2*(1-P)) because Option 1 is actually known as a 'one-sided' value…

Can anyone offer any guidance on this? What I thought was a relatively simple concept is now confusing me greatly!

Best Answer

Most of what you've said is correct, but I think you might be confusing yourself unnecessarily by having different definitions of one- and two-tailed tests. Just to briefly review:

A one-tailed $P$-value is the area of your reference distribution ($t$, $Z$, $F$, etc.) that lies either

  1. Above your observed test statistic
  2. Below your observed test statistic.

Thus if you are expecting a negative effect (and using the $t$ distribution as an example), your $P$-value would be equal to

$$ P(t \leq t_{observed}) $$

or for a positive effect,

$$ P(t \geq t_{observed}) $$

both with the appropriate degrees of freedom.
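As a minimal sketch of the two one-tailed probabilities above: the question works with z-scores, so this example assumes a standard normal reference distribution instead of $t$ (which also keeps it inside the Python standard library). The function names are illustrative only.

```python
from statistics import NormalDist

# Assumption: standard normal reference distribution (z-scores, as in the question).
STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def lower_tail_p(z_observed: float) -> float:
    """P(Z <= z_observed): one-tailed p-value when a negative effect is expected."""
    return STD_NORMAL.cdf(z_observed)


def upper_tail_p(z_observed: float) -> float:
    """P(Z >= z_observed): one-tailed p-value when a positive effect is expected."""
    return 1.0 - STD_NORMAL.cdf(z_observed)


if __name__ == "__main__":
    z = -1.96
    print("lower tail:", lower_tail_p(z))
    print("upper tail:", upper_tail_p(z))
```

Note that the lower-tail value is exactly the cumulative probability described in option 1 of the question, so it is small for strongly negative z-scores and close to 1 for strongly positive ones.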

A two-tailed $P$-value is the probability of a test statistic at least as extreme as the one you observed in EITHER direction, equal to

$$ P(t \geq |\, t_{observed} \,|) + P(t \leq -|\, t_{observed} \,|) = 2\,P(t \geq |\, t_{observed} \,|) $$

Because the $t$ distribution is symmetric around 0, the two probabilities in the sum are equal, which gives the right-hand side. Therefore, the two-tailed $P$-value is twice the one-tailed $P$-value taken in the direction of the observed effect.

To convert your two-tailed $P$-value to a one-tailed $P$-value, you halve it; which of the two cases below applies depends on whether the observed effect lies in the direction you hypothesized.

$$ P_{\text{one-tailed}} = \begin{cases} \frac{1}{2} P_{\text{two-tailed}} & \text{if the observed effect is in the hypothesized direction} \\ 1-\frac{1}{2} P_{\text{two-tailed}} & \text{if it is in the other direction} \end{cases} $$
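The case formula above translates directly into a small helper. This is only a sketch: the function and parameter names are made up for illustration, and it assumes a symmetric reference distribution such as $t$ or $Z$.

```python
def one_tailed_from_two_tailed(p_two_tailed: float,
                               effect_in_hypothesized_direction: bool) -> float:
    """Convert a two-tailed p-value to a one-tailed one (symmetric distribution assumed)."""
    if not 0.0 <= p_two_tailed <= 1.0:
        raise ValueError("p_two_tailed must lie in [0, 1]")
    half = 0.5 * p_two_tailed
    # Half the two-tailed area sits in each tail; choose the tail being tested.
    return half if effect_in_hypothesized_direction else 1.0 - half


if __name__ == "__main__":
    print(one_tailed_from_two_tailed(0.05, True))
    print(one_tailed_from_two_tailed(0.05, False))
```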

Just to clarify, Holm-Bonferroni only requires that all $P$-values have the same kind of null hypothesis. That is, either they are all two-sided or all one-sided. It doesn't make much sense to compare things with different nulls.
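For reference, here is a minimal sketch of the Holm-Bonferroni step-down procedure itself. It assumes every input is a $P$-value of the same kind, with smaller values meaning stronger evidence against the null; the function name, the default `alpha=0.05` (taken from the 5% level in the question) and the example values are illustrative, not the asker's data.

```python
from typing import List, Sequence


def holm_bonferroni(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value in its original position, whether its null is rejected."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank = 0 for the smallest p-value
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.005]  # made-up two-tailed p-values
    print(holm_bonferroni(example))
```

A raw cumulative probability (where values near 1 are also extreme) would not fit this procedure as written, which is why it would first be converted to a consistent one- or two-tailed $P$-value.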
