[Math] Extent of “unscientific”, and of wrong, papers in research mathematics

Tags: errata, journals, peer-review, proof-assistants, soft-question

This question is cross-posted from academia.stackexchange.com, where it was closed with the advice to post it on MO.


Kevin Buzzard's slides (PDF version) at a recent conference have really unsettled me.

In them, he mentions several examples from what one would imagine to be very rigorous areas (e.g., algebraic geometry) where top journals like Annals and Inventiones have published, and never retracted, papers which are now known to be wrong. He also mentions papers that rely on unpublished results, taken on trust that those who announced them indeed have a proof.

He writes about his own work:

[…] maybe some of my work in the p-adic Langlands philosophy relies on stuff that is wrong. Or maybe, perhaps less drastically, on stuff which is actually correct, but for which humanity does not actually have a complete proof. If our research is not reproducible, is it science? If my work in pure mathematics is neither useful nor 100 percent guaranteed to be correct, it is surely a waste of time.

He says that, as a result, he has switched to formalizing proofs completely, e.g. with Lean, which guarantees correctness and thus reusability forever.
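To make the Lean remark concrete, here is a minimal, hypothetical illustration (not taken from the slides; the theorem name is invented) of what a machine-checked proof looks like. Lean's kernel verifies every step, so if the proof term were wrong the file simply would not compile.

```lean
-- Toy example: a statement whose proof is checked mechanically by Lean 4.
-- `Nat.add_comm` is a lemma from Lean's core library; the name
-- `add_comm_example` is just an illustrative label.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```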

Just how widespread is the issue? Are most areas safe, or contaminated? For example, is there some way to track the not-retracted-but-wrong papers?


The answer I accepted on academia.stackexchange before the closure gives a useful general-purpose method, but I'd really appreciate more detailed, area-specific answers. For example, what fraction of your own papers do you expect to rely on a statement "for which humanity does not actually have a complete proof"?

Best Answer

"Are most areas safe, or contaminated?"

Most areas are fine. Probably all important areas are fine. Mathematics is fine. The important stuff is 99.99999% likely to be fine because it has been carefully checked. The experts know what is wrong, and the experts are checking the important stuff. The system works. The system has worked for centuries and continues to work.

My talk is an intentionally highly biased viewpoint, designed to get people talking. It was given in a maths department, so I was kind of trolling mathematicians. I think that formal proof verification systems have the potential to offer a lot to mathematicians, and I am very happy to get people talking about them by any means necessary. On the other hand, when I am talking to the formal proofs people I put on my mathematician's hat and emphasize the paragraph above, saying that we have a human mathematical community which knows what it is doing better than any computer, and that this is why it would be a complete waste of time formalising a proof of Fermat's Last Theorem -- we all know it's true anyway, because Wiles and Taylor proved it, and since then we have generalised the key ideas out of the park.

It is true that there are holes in some proofs. There are plenty of false lemmas in papers. But mathematics is robust in this extraordinary way. More than once in my life I have said to the author of a paper "this proof doesn't work", and their response has been "oh, I have 3 other proofs; one is bound to work" -- and they're right. Working out what is true is the hard, fun, and interesting part. Mathematicians know well that conjectures are important. But writing down the details of an argument is a lot more boring than being imaginative and figuring out how the mathematical world works, and humans generally do a poorer job of it than they could. I am concerned that this will impede progress in the future, when computers start to learn to read maths papers (this will happen, I guess, at some point -- goodness knows when).

Another thing which I did not stress at all in the Pittsburgh talk, but which should definitely be mentioned, is that although formal proof verification systems are far better when it comes to the reliability of proofs, they have a bunch of other problems instead: formal proofs need to be maintained; it takes gigantic libraries even to do the most basic things (check out Lean's definition of a manifold, for example); different systems are incompatible; and systems die out. Furthermore, formal proof verification systems currently have essentially nothing to offer the working mathematician who understands the principles behind their area and knows why the major results in it are true. These are all counterpoints which I didn't talk about at all.
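As a hedged illustration of the "gigantic libraries" point (the particular import and lemma below are my own choice, not anything from the talk): even a one-line, completely routine analysis fact in Lean 4 sits on top of a large slice of mathlib, the community library, all of which has to be built and maintained before the statement can even be written down.

```lean
-- Illustrative sketch: importing mathlib just to state and certify one
-- elementary fact. `Real.continuous_sin` is the mathlib lemma asserting
-- that the real sine function is continuous.
import Mathlib

example : Continuous Real.sin :=
  Real.continuous_sin
```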

In the future we will find a happy medium, where computers can be used to help humans do mathematics. I am hoping that Tom Hales' Formal Abstracts project will one day start to offer mathematicians something which they actually want (e.g. good search for proofs, or some kind of useful database which actually helps us in practice).

But until then I think we should remember that there's a distinction between "results for which humanity hasn't written down the proof very well, but the experts know how to fill in all of the holes" and "important results which humanity believes but which are not actually proved".

I guess one thing that worries me is that perhaps there are areas which are currently fashionable and have holes in them; they will become less fashionable, the experts will leave the area and slowly die out, and then all of a sudden someone will discover a hole which nobody currently alive knows how to fill, even though the experts could once have done it.