Extent of “unscientific”, and of wrong, papers in research mathematics

"Are most areas safe, or contaminated?"

Most areas are fine. Probably all important areas are fine. Mathematics is fine. The important stuff is 99.99999% likely to be fine because it has been carefully checked. The experts know what is wrong, and the experts are checking the important stuff. The system works; it has worked for centuries and continues to work.

My talk presents an intentionally, highly biased viewpoint, designed to get people talking. It was given in a maths department, so I was in some sense trolling mathematicians. I think that formal proof verification systems have the potential to offer a lot to mathematicians, and I am very happy to get people talking about them by any means necessary. On the other hand, when I am talking to the formal proofs people, I put on my mathematician's hat and emphasize the paragraph above, saying that we have a human mathematical community which knows what it is doing better than any computer, and that this is why it would be a complete waste of time formalising a proof of Fermat's Last Theorem -- we all know it's true anyway, because Wiles and Taylor proved it, and since then we have generalised the key ideas out of the park.

It is true that there are holes in some proofs. There are plenty of false lemmas in papers. But mathematics is robust in this extraordinary way. More than once in my life I have said to the author of a paper "this proof doesn't work", and their response has been "oh, I have three other proofs; one is bound to work" -- and they're right. Working out what is true is the hard, fun, and interesting part. Mathematicians know well that conjectures are important. But writing down the details of an argument is a lot more boring than being imaginative and figuring out how the mathematical world works, and humans generally do a poorer job of this than they could. I am concerned that this will impede progress in the future when computers start to learn to read maths papers (this will happen at some point, I guess; goodness knows when).

Another thing which I did not stress at all in the Pittsburgh talk, but which should definitely be mentioned, is that although formal proof verification systems are far better when it comes to the reliability of proofs, they have a bunch of other problems instead. Formal proofs need to be maintained; it takes gigantic libraries even to do the most basic things (check out Lean's definition of a manifold, for example); different systems are incompatible with one another; and systems die out. Furthermore, formal proof verification systems currently have essentially nothing to offer the working mathematician who understands the principles behind their area and knows why the major results in it are true. These are all counterpoints which I didn't talk about at all.
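To give a flavour of the library problem: in a system like Lean, every fact is either proved from first principles or pulled from a library, and a definition like "manifold" sits on top of a long tower of prior definitions (topological spaces, charts, smooth maps, and so on). A minimal sketch of what a formal proof looks like at the very bottom of that tower, using only Lean 4's core library (the theorem name is my own):

```lean
-- Even a statement as basic as commutativity of addition on ℕ
-- must be proved, or taken from a library lemma (here Nat.add_comm).
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Everything up to a manifold has to be built, checked, and then maintained in this style, which is why the libraries are so large.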

In the future we will find a happy medium, where computers can be used to help humans do mathematics. I am hoping that Tom Hales' Formal Abstracts project will one day start to offer mathematicians something which they actually want (e.g. good search for proofs, or some kind of useful database which actually helps us in practice).

But until then I think we should remember that there's a distinction between "results for which humanity hasn't written down the proof very well, but the experts know how to fill in all of the holes" and "important results which humanity believes and are not actually proved".

I guess one thing that worries me is that perhaps there are currently fashionable areas with holes in them; these areas will become less fashionable, the experts will leave and slowly die out, and then all of a sudden someone will discover a hole which nobody currently alive knows how to fill, even though the experts could once have filled it.


As Kevin Buzzard himself admits in his answer, he somewhat exaggerated his point for effect.

However, I'd submit that if you were unsettled by his talk, then that's a good thing. I don't think that the proper reaction is to look for reassurance that mathematics really is fine, or that the problems are restricted to some easily quarantined corner.

Rather, I think the proper reaction is to strive for a more accurate view of the true state of the mathematical literature, and refuse to settle for comforting myths that aren't based on reality. Some of the literature is rock-solid and can stand on its own, much more of it is rock-solid provided you have access to the relevant experts, and some of it is gappy but we don't really care. On the other hand, some small percentage of it is gappy or wrong and we do care, but social norms within the mathematical community have caused us to downplay the problems. This last category is important. It is a small percentage, but from a scholarly point of view it is a serious problem, and we should all be aware of it and willing to acknowledge it. If, every time someone brings it up, we try to get them to shut up by repeating some "propaganda" that makes us feel good about mathematics, then we are not solving the problem but perpetuating it.

Some related concerns were raised over 25 years ago by Jaffe and Quinn in their article "Theoretical Mathematics". This generated considerable discussion at the time. Let me quote the first paragraph of Atiyah's response.

I find myself agreeing with much of the detail of the Jaffe–Quinn argument, especially the importance of distinguishing between results based on rigorous proofs and those which have a heuristic basis. Overall, however, I rebel against their general tone and attitude which appears too authoritarian.

My takeaway from this is that Jaffe and Quinn made many valid points, but because this is a sensitive subject, dealing with societal norms, we have to be very careful how we approach it. Given the way that the mathematical community currently works, saying that someone's work has gaps and/or mistakes is often taken to be a personal insult. I think that if, as a community, we were more honest about the fact that proofs are not simply right or wrong, complete or incomplete, but that there is a continuum between the two extremes, then we might be able to patch problems that arise more efficiently, because we wouldn't have to walk on eggshells.