[Physics] How do we know the LHC results are robust

Tags: data-analysis, large-hadron-collider, particle-physics

This question was prompted by a Nature article on reproducibility in science.

According to that article, a surprisingly large number of experiments aren't reproducible, or at least there have been failed attempts to reproduce them. One of the figures reports that 70% of scientists in physics & engineering have failed to reproduce someone else's results, and 50% have failed to reproduce their own.

Clearly, if something cannot be reproduced, its veracity is called into question. Just as clearly, because there is only one particle accelerator in the world as powerful as the LHC, we are unable to independently reproduce LHC results. In fact, if 50% of physics & engineering experiments aren't reproducible by the original scientists, one might expect a 50% chance that if the people who originally built the LHC built another one, it would not reach the same results. How, then, do we know that LHC results (such as the discovery of the Higgs boson) are robust? Or do we not know they are robust, and are we effectively proceeding on faith that they are?

EDIT: As Chris Hayes pointed out in the comments, I misinterpreted the Nature article. It says that 50% of physical scientists have failed to reproduce their own results, which is not the same as saying that 50% of physics experiments aren't reproducible. This significantly eases the concern I had when I wrote the question. I'm leaving the question here, however, because the core idea – how can we know the LHC's results are robust when we only have one LHC? – remains the same, and because innisfree wrote an excellent answer.

Best Answer

That's a really great question. The 'replication crisis' refers to the fact that many effects in the social sciences (and, to a lesser extent, other scientific fields) couldn't be reproduced. Many factors contribute to this phenomenon, including:

  • Weak standards of evidence, e.g., requiring only $2\sigma$ evidence to demonstrate an effect.
  • Researchers (subconsciously or otherwise) engaging in poor scientific practice by selectively reporting and publishing significant results, e.g., testing many different effects until one comes out significant ($p$-hacking), or collecting data only until a significant result appears (optional stopping); a simulation sketch follows this list.
  • Poor training in statistical methods.
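
To make the second point concrete, here is a minimal simulation sketch in Python with NumPy/SciPy (my own illustration, not from the answer; the sample sizes and thresholds are arbitrary choices). It shows how 'collecting data until you find a significant effect' inflates the false-positive rate well above the nominal 5% of a single test at the $2\sigma$ level, even when no real effect exists:

```python
# Optional stopping: test repeatedly as data accumulate and stop at the
# first "significant" result. All studies here have a true effect of zero,
# so every significant result is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 2000   # simulated studies under the null hypothesis
max_n = 200            # maximum sample size per study
check_every = 10       # test for significance after every 10 observations

false_positives = 0
for _ in range(n_experiments):
    data = rng.normal(loc=0.0, size=max_n)  # null is true: mean = 0
    for n in range(check_every, max_n + 1, check_every):
        # one-sample t-test against mean 0 on the data collected so far
        t, p = stats.ttest_1samp(data[:n], 0.0)
        if p < 0.05:   # "stop as soon as it looks significant"
            false_positives += 1
            break

print(f"false-positive rate with optional stopping: "
      f"{false_positives / n_experiments:.2f} (nominal: 0.05)")
```

With these settings the rate comes out at several times the nominal 5%, which is the mechanism behind this kind of selective reporting.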

I'm not entirely sure about the exact efforts that the LHC experiments are making to ensure that they don't suffer the same problems. But let me say some things that should at least put your mind at ease:

  • Particle physics typically requires a high standard of evidence for discoveries ($5\sigma$). To put that into perspective, the corresponding type-1 error rates are about $0.05$ for $2\sigma$ and about $3\times10^{-7}$ for $5\sigma$ (see the short conversion sketch after this list).
  • The results from the LHC have already been replicated!
    • There are several detectors placed around the LHC ring. Two of them, ATLAS and CMS, are general-purpose detectors for Standard Model and beyond-the-Standard-Model physics. Both found compelling evidence for the Higgs boson. They are in principle completely independent (though in practice staff switch between experiments, and experimentalists from the two collaborations talk and socialize with each other, so there may be a very small dependence in analysis choices).
    • The Tevatron, a similar collider experiment in the USA operating at lower energies, found direct evidence for the Higgs boson.
    • The Higgs boson was observed in several datasets collected at the LHC.
  • The LHC experiments (typically) publish findings regardless of their statistical significance, i.e., significant results are not selectively reported.
  • The LHC teams are guided by statistics committees, which hopefully ensures good practice.
  • The LHC is in principle committed to open data, which means a lot of the data should at some point become public. This is one of the recommendations for addressing the crisis in the social sciences.
  • Typical training for experimentalists at the LHC includes basic statistics (although in my experience LHC experimentalists are still subject to the same traps and misinterpretations as everyone else).
  • All members (thousands) of the experimental teams are authors on the papers. This presumably somewhat lowers the incentive for bad practices such as $p$-hacking, since you cannot 'discover' a new effect, publish it under your own name alone, and reap the improved job and grant prospects. That incentive might be a factor in the replication crisis in the social sciences.
  • All papers are subject to internal review (which I understand to be quite rigorous) as well as external review by a journal.
  • LHC analyses are often (I'm not sure who plans or decides this) blinded. This means that the experimentalists cannot tweak an analysis depending on the result: they are 'blind' to the result, fix their choices, and only unblind at the end. This should help prevent $p$-hacking.
  • LHC analyses typically (though not always) report a global $p$-value, which has been corrected for multiple comparisons (the look-elsewhere effect); a toy illustration follows this list.
  • The Higgs boson (or similar new physics) was theoretically required by a 'no-lose' theorem about the breakdown of Higgsless models at LHC energies, so we can be even more confident that it is a genuine effect. The other new effects being searched for at the LHC, however, are arguably not as well motivated, so this argument doesn't apply to them; e.g., there was no a priori motivation for the 750 GeV resonance that was hinted at in the data but ultimately disappeared.
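
As a quick check of the $2\sigma$ vs $5\sigma$ numbers quoted above, here is a two-line sketch using SciPy's normal survival function (the $\approx 0.05$ figure for $2\sigma$ corresponds to the two-sided tail, while particle physics conventionally quotes the one-sided tail for $5\sigma$):

```python
# Convert significance thresholds to type-1 error rates.
from scipy.stats import norm

print(f"2 sigma, two-sided: {2 * norm.sf(2):.3f}")  # ~0.046, i.e. roughly 0.05
print(f"5 sigma, one-sided: {norm.sf(5):.2e}")      # ~2.9e-07, i.e. about 3e-7
```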
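
And for the look-elsewhere effect, a toy illustration under a deliberately simplified assumption: treat a search as scanning $N$ independent mass windows, so the global $p$-value is the chance that at least one window fluctuates up to the local significance by chance. (Real LHC corrections are more involved, since windows overlap and the trials factor is estimated from data or asymptotic formulae; the point here is just why the correction matters.)

```python
# Toy look-elsewhere correction: local -> global p-value assuming
# N independent search windows (a simplification; see above).
from scipy.stats import norm

p_local = norm.sf(3)  # one-sided p-value of a "3 sigma" local excess
for n_windows in (1, 10, 100):
    p_global = 1 - (1 - p_local) ** n_windows
    z_global = norm.isf(p_global)  # convert back to a significance
    print(f"N = {n_windows:3d}: global p = {p_global:.4f} ({z_global:.1f} sigma)")
```

With 100 windows, a $3\sigma$ local excess corresponds to only about a $1\sigma$ global significance, which is why quoting the uncorrected local value would overstate the evidence.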

If anything, there is a suspicion that the practices at the LHC might even produce the opposite of the 'replication crisis': analyses that find somewhat significant effects might be examined and tweaked until the significance decreases. In this paper it was argued that this was the case for SUSY searches in Run 1.
