Solved – Doubling &/or halving p-values for one- vs. two-tailed tests

hypothesis testing, p-value

Let's say I'm doing a two-tailed hypothesis test at 5% significance level and get a test statistic that corresponds to a p-value of $0.03$. As it is two-tailed I double it and therefore, as $0.06 > 0.05$, I fail to reject the null hypothesis.

However, let's say now I'm checking for a result strictly greater than the mean so it's now a one-tailed test. At 5% significance level, do I now reject my null hypothesis as $0.03 < 0.05$ or as we're only doing a one-tailed test do I halve $0.05$ and check against that?

Best Answer

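If you run a two-tailed test and the computation gives you $p=0.03$, then $p<0.05$ and the result is significant; a two-tailed $p$-value already accounts for both tails, so you do not double it again. If you run a one-tailed test instead, you get a different $p$-value, depending on which tail you look at. For a symmetric test statistic it is either half as big ($0.015$, if the observed effect lies in the hypothesized direction) or much larger ($1 - 0.015 = 0.985$, if it lies in the opposite direction).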
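To make the relationship between the tails concrete, here is a minimal sketch. It assumes the test statistic is standard normal under the null hypothesis (a z-test), which the question does not specify; the function name `tail_p_values` is made up for illustration.

```python
from statistics import NormalDist


def tail_p_values(z):
    """Return (two_sided, upper, lower) p-values for a z statistic.

    Assumes the statistic is standard normal under the null hypothesis.
    """
    std_normal = NormalDist()
    lower = std_normal.cdf(z)        # P(Z <= z): alternative "less"
    upper = 1.0 - lower              # P(Z >= z): alternative "greater"
    two_sided = 2.0 * min(upper, lower)
    return two_sided, upper, lower


# Pick the positive z value whose two-sided p-value is 0.03.
z = NormalDist().inv_cdf(1 - 0.03 / 2)
two_sided, upper, lower = tail_p_values(z)

print(f"two-sided p         = {two_sided:.3f}")  # 0.030
print(f"one-sided (greater) = {upper:.3f}")      # 0.015
print(f"one-sided (less)    = {lower:.3f}")      # 0.985
```

The two one-sided values always sum to 1, and the two-sided value is twice the smaller of them. For an asymmetric null distribution the doubling rule would need more care.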

$\alpha=0.05$ is the usual convention, whether you test one- or two-tailed. You don't halve it (except perhaps in a Bonferroni correction, which is not the topic here). So yes, a one-tailed test will sometimes give you a significant result where the two-tailed test does not. However, you cannot pick the test after the fact: you always have to decide upfront whether a one- or a two-tailed test is appropriate, just as you have to fix your $\alpha$ level upfront. Then you calculate the $p$-value for that test, and there is no remaining freedom in how to test or what to compare the $p$-value to. Choosing the sidedness of your test depending on whether you like the result is not good scientific practice.
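The decision rule can be sketched as follows, using the numbers from the question (upper-tail area $0.03$, doubled to $0.06$ for two tails). This again assumes a standard normal test statistic; `p_value`, `reject_null` and the `alternative` labels are illustrative names, not taken from any particular library.

```python
from statistics import NormalDist

ALPHA = 0.05  # fixed before seeing the data; the same for one- and two-tailed tests


def p_value(z, alternative):
    """p-value of a z statistic for a pre-specified alternative hypothesis."""
    cdf = NormalDist().cdf(z)
    if alternative == "greater":
        return 1.0 - cdf
    if alternative == "less":
        return cdf
    if alternative == "two-sided":
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError(f"unknown alternative: {alternative!r}")


def reject_null(z, alternative, alpha=ALPHA):
    """Compare the p-value of the chosen test against the unchanged alpha."""
    return p_value(z, alternative) < alpha


# z value with an upper-tail area of 0.03, as in the question.
z = NormalDist().inv_cdf(0.97)

for alternative in ("two-sided", "greater", "less"):
    p = p_value(z, alternative)
    print(f"{alternative:>9}: p = {p:.3f}, reject H0: {reject_null(z, alternative)}")
```

With these inputs only the "greater" alternative rejects the null hypothesis ($0.03 < 0.05$), while the two-sided test does not ($0.06 > 0.05$). The point of the sketch is that `alternative` and `ALPHA` are inputs chosen beforehand, not knobs to adjust after looking at the output.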

That said, there is hardly ever a situation where a one-tailed test is appropriate. In the vast majority of circumstances, a significant result in either direction would be worth reporting. If you test one-tailed, some of your audience will consider it a trick to push your $p$-value as low as possible.