Statistics published in academic papers

Tags: academia, publication-bias

I read a lot of evolutionary/ecological academic papers, sometimes with the specific aim of seeing how statistics are used 'in the real world', outside of the textbook. I normally take the statistics in papers as gospel and use the papers to help in my statistical learning. After all, if a paper has taken years to write and has gone through rigorous peer review, then surely the statistics are going to be rock solid? But in the past few days I've questioned that assumption, and wondered how often the statistical analysis published in academic papers is suspect. In particular, it might be expected that researchers in fields such as ecology and evolution have spent less time learning statistics and more time learning their fields.

How often do people find suspect statistics in academic papers?

Best Answer

After all, if a paper has taken years to write and has gone through rigorous peer review, then surely the statistics are going to be rock solid?

My experience of reading papers that attempt to apply statistics across a wide variety of areas (political science, economics, psychology, medicine, biology, finance, actuarial science, accounting, optics, astronomy, and many, many others) is that the quality of the statistical analysis may be anywhere on the spectrum from excellent and well done to egregious nonsense. I have seen good analysis in every one of the areas I have mentioned, and pretty poorly done analysis in almost all of them.

Some journals are generally pretty good, and some can be more like playing darts with a blindfold - you might get most of them not too terribly far off the target, but there are going to be a few in the wall, the floor and the ceiling. And maybe the cat.

I don't plan on naming any culprits, but I will say I have seen academic careers built on faulty use of statistics (i.e. where the same mistakes and misunderstandings were repeated in paper after paper, over more than a decade).

So my advice is: let the reader beware; don't trust that the editors and peer reviewers know what they're doing. Over time you may get a good sense of which authors can generally be relied on not to do anything too shocking, and which ones should be treated especially warily. You may also get a sense that some journals typically hold their stats to a very high standard.

But even a typically good author can make a mistake, or referees and editors can fail to pick up errors they might normally find; a typically good journal can publish a howler.

[Sometimes, you'll even see really bad papers win prizes or awards... which doesn't say much for the quality of the people judging the prize, either.]

I wouldn't like to guess what fraction of the statistics I've seen was "bad" (in various guises, and at every stage from defining the question, design of the study, data collection, data management, ... right through to analysis and conclusions), but it's not nearly small enough for me to feel comfortable.

I could point to examples, but I don't think this is the right forum to do that. (It would be nice if there was a good forum for that, actually, but then again, it would likely become highly "politicized" quite quickly, and soon fail to serve its purpose.)

I've spent some time trawling through PLOS ONE ... and again, I'm not going to point at specific papers. Some things I noticed: a large proportion of papers have stats in them, probably more than half containing hypothesis tests. The main dangers seem to be large numbers of tests, either with a high per-test $\alpha$ like 0.05 (which is not automatically a problem, as long as we understand that quite a few really tiny effects might show up as significant by chance) or an incredibly low individual significance level, which will tend to give low power. I also saw a number of cases where about half a dozen different tests were apparently applied to resolve exactly the same question, which strikes me as a generally bad idea. Overall the standard was pretty good across a few dozen papers, but in the past I have seen an absolutely terrible paper there.
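To make that trade-off concrete, here is a minimal simulation sketch (not from any particular paper; it assumes two-sample t-tests on normal data, with hypothetical sample sizes and effect sizes) showing both dangers: running many tests each at $\alpha = 0.05$ inflates the chance of at least one spurious "finding", while a very small per-test level such as a Bonferroni-corrected $\alpha$ costs power against a modest real effect.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n, n_sims = 20, 30, 2000   # hypothetical: 20 tests, 30 per group
alpha = 0.05
alpha_bonf = alpha / n_tests        # Bonferroni-corrected per-test level

# 1) Family-wise error rate: run 20 null two-sample t-tests per "paper"
#    and count how often at least one comes out "significant" at 0.05.
false_alarm = 0
for _ in range(n_sims):
    pvals = [stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0, 1, n)).pvalue
             for _ in range(n_tests)]
    false_alarm += min(pvals) < alpha
print(f"Chance of >=1 spurious 'finding' at alpha=0.05: {false_alarm / n_sims:.2f}")
# Roughly 1 - 0.95**20 ≈ 0.64, far above the nominal 5%.

# 2) Power cost of a very small per-test alpha: one test of a real but
#    modest effect (here, a mean shift of 0.5 SD) at 0.05 vs. the
#    Bonferroni-corrected level.
power_05 = power_bonf = 0
for _ in range(n_sims):
    p = stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0.5, 1, n)).pvalue
    power_05 += p < alpha
    power_bonf += p < alpha_bonf
print(f"Power at alpha = 0.05:   {power_05 / n_sims:.2f}")
print(f"Power at alpha = 0.0025: {power_bonf / n_sims:.2f}")
```

The numbers themselves depend on the assumed sample and effect sizes, but the pattern is the point: a lax per-test level makes chance findings almost guaranteed across a battery of tests, and a very strict one throws away much of the power to detect real but modest effects.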

[Perhaps I could indulge in just one example, indirectly. This question asks about a paper doing something quite dubious. It's far from the worst thing I've seen.]

On the other hand, I also see (even more frequently) cases where people are forced to jump through all kinds of unnecessary hoops to get their analysis accepted; perfectly reasonable approaches are rejected because there's a "right" way to do things according to a reviewer, an editor, a supervisor, or just the unspoken culture of a particular area.
