Probability – The Abundance of P Values in the Absence of a Hypothesis

hypothesis-testing, p-value, probability, statistical-significance

I work in epidemiology. I'm not a statistician, but I try to perform the analyses myself, although I often encounter difficulties. I did my first analysis some two years ago. P values were included everywhere in my analyses (I simply did what other researchers were doing), from descriptive tables to regression analyses. Little by little, statisticians working in my department persuaded me to skip all (!) the p values, except where I truly have a hypothesis.

The problem is that p values are abundant in medical research publications.
It is conventional to include p values on far too many lines; descriptive data on means, medians or whatever usually come with p values (Student's t-test, chi-square, etc.), as in the sketch below.
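For concreteness, here is a minimal Python sketch of how such "Table 1" p values are typically generated: a t-test for a continuous baseline variable and a chi-square test for a categorical one. The group sizes, variable names and numbers are simulated assumptions, purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated baseline data for two study arms (hypothetical numbers, illustration only)
age_a = rng.normal(62, 10, 150)   # continuous baseline variable, e.g. age
age_b = rng.normal(63, 10, 150)

smoking = np.array([[45, 105],    # 2x2 table: smokers / non-smokers
                    [52,  98]])   # by study arm

# Continuous variable: two-sample (Welch) t-test
t_stat, p_age = stats.ttest_ind(age_a, age_b, equal_var=False)

# Categorical variable: chi-square test of independence
chi2, p_smoke, dof, expected = stats.chi2_contingency(smoking)

print(f"Age:     t = {t_stat:.2f}, p = {p_age:.3f}")
print(f"Smoking: chi2 = {chi2:.2f}, p = {p_smoke:.3f}")
```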

I've recently submitted a paper to a journal, and I refused (politely) to add p values to my "baseline" descriptive table. The paper was ultimately rejected.

To exemplify, see the figure below; it is the descriptive table from the latest article published in a respected journal of internal medicine:
[Figure: baseline descriptive table from the article, including a column of p values]

Statisticians are mostly (if not always) involved in reviewing these manuscripts, so a layman like myself expects not to find any p values where there is no hypothesis. Yet they are abundant, and the reason for this remains elusive to me. I find it hard to believe that it is ignorance.

I realize that this is a borderline statistical question. But I'm looking for the rationale behind this phenomenon.

Best Answer

Clearly I don't need to tell you what a p-value is, or why over-reliance on them is a problem; you apparently understand those things quite well enough already.

With publishing, you have two competing pressures.

The first - and one you should push for at every reasonable opportunity - is to do what makes sense.

The second, ultimately, is the need to actually publish. There's little gain if nobody sees your fine efforts at reforming terrible practice.

So instead of avoiding them altogether:

  • do as little of such pointless activity as you can get away with while still getting the paper published

  • maybe include a mention of this recent Nature Methods article [1] if you think it will help, or, perhaps better, one or more of the other references; it should at least help establish that there's some opposition to the primacy of p-values

  • consider other journals, if another would be suitable

Is this the same in other disciplines?

The problem of overuse of p-values occurs in a number of disciplines (it can be a problem even when there is some hypothesis), but is much less common in some than in others. Some disciplines do have issues with p-value-itis, and the problems that causes can eventually lead to somewhat overblown reactions [2] (and, to a smaller extent, [1], and at least in some places a few of the others as well).

I think there are a variety of reasons for it, but over-reliance on p-values seems to acquire a momentum of its own; there's something about saying "significant" and rejecting a null that people seem to find very attractive. Various disciplines (e.g. see [3]-[11]) have been fighting against the problem of over-reliance on p-values (especially $\alpha = 0.05$) for many years, with varying degrees of success, and have made many different kinds of suggestions - not all of which I agree with, but I include a variety of views to give some sense of the different things people have had to say.

Some of them advocate focusing on confidence intervals, some advocate looking at effect sizes, some advocate Bayesian methods, some advocate smaller p-values, and some simply advocate avoiding p-values in particular ways. There are many different views on what to do instead, but between them there's a lot of material on the problems with relying on p-values, at least the way it's commonly done; a small sketch of the confidence-interval and effect-size alternative follows below.
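To make one of those alternatives concrete, here is a minimal Python sketch that reports a mean difference with a 95% confidence interval and a standardized effect size (Cohen's d) rather than a bare p-value. The data and group sizes are simulated assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated outcome measurements in two groups (hypothetical numbers, illustration only)
group_a = rng.normal(5.0, 2.0, 80)
group_b = rng.normal(5.8, 2.0, 80)

diff = group_b.mean() - group_a.mean()

# Standard error of the mean difference (Welch)
var_a, var_b = group_a.var(ddof=1), group_b.var(ddof=1)
n_a, n_b = len(group_a), len(group_b)
se = np.sqrt(var_a / n_a + var_b / n_b)

# Welch-Satterthwaite degrees of freedom and a 95% confidence interval
df = se**4 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))
t_crit = stats.t.ppf(0.975, df)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

# Standardized effect size: Cohen's d with a pooled standard deviation
pooled_sd = np.sqrt((var_a + var_b) / 2)
cohens_d = diff / pooled_sd

print(f"Mean difference = {diff:.2f}, "
      f"95% CI ({ci_low:.2f}, {ci_high:.2f}), Cohen's d = {cohens_d:.2f}")
```

The point of reporting an interval and an effect size is that they convey the magnitude and precision of the estimate, which a lone p-value does not.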

See those references for many further references in turn. This is just a sampling - many dozens more can be found. A few authors give reasons why they think p-values are so prevalent.

Some of these references may be useful if you do want to argue the point with an editor.

[1] Halsey L.G., Curran-Everett D., Vowler S.L. & Drummond G.B. (2015),
"The fickle P value generates irreproducible results,"
Nature Methods 12, 179–185 doi:10.1038/nmeth.3288
http://www.nature.com/nmeth/journal/v12/n3/abs/nmeth.3288.html

[2] Trafimow, D. and Marks, M. (2015),
Editorial,
Basic and Applied Social Psychology, 37:1–2
http://www.tandfonline.com/loi/hbas20
DOI: 10.1080/01973533.2015.1012991

[3] Cohen, J. (1990),
Things I have learned (so far),
American Psychologist, 45(12), 1304–1312.

[4] Cohen, J. (1994),
The earth is round (p < .05),
American Psychologist, 49(12), 997–1003.

[5] Johnson, V.E. (2013),
Revised standards for statistical evidence,
PNAS, vol. 110, no. 48, 19313–19317
http://www.pnas.org/content/110/48/19313.full.pdf

[6] Kruschke J.K. (2010),
What to believe: Bayesian methods for data analysis,
Trends in cognitive sciences 14(7), 293-300

[7] Ioannidis, J. (2005),
Why Most Published Research Findings Are False,
PLoS Med. Aug; 2(8): e124.
doi: 10.1371/journal.pmed.0020124

[8] Gelman, A. (2013),
P Values and Statistical Practice,
Epidemiology, Vol. 24, No. 1, January, 69–72

[9] Gelman, A. (2013),
"The problem with p-values is how they're used",
(Discussion of “In defense of P-values,” by Paul Murtaugh, for Ecology) unpublished
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.300.9053
http://www.stat.columbia.edu/~gelman/research/unpublished/murtaugh2.pdf

[10] Nuzzo R. (2014),
Statistical errors: P values, the 'gold standard' of statistical validity, are not as reliable as many scientists assume,
News and Comment,
Nature, Vol. 506 (13), 150-152

[11] Wagenmakers, E. (2007),
A practical solution to the pervasive problems of p values,
Psychonomic Bulletin & Review 14(5), 779-804
