Sunday, 16 March 2025

Book Review: Unreliable: Bias, Fraud, and the Reproducibility Crisis in Biomedical Research

by Csaba Szabo, Columbia University Press, 2025 


This is a rollicking good read: written in an informal style and enlivened by cartoons, it nevertheless works as a scholarly and accessible account of the so-called reproducibility crisis in biomedical research. 

I first became aware of this book back in February 2024 when the publisher asked me to review the draft. By happy coincidence I had just submitted an (ultimately unsuccessful) application for funding for a meeting on the closely-related topic of research fraud, and as I devoured the text, I felt guilty that I had not been aware of Szabo's work. As it turned out, there was a good reason for my ignorance: he had not previously written anything on this topic. As he explains in the Afterword, there is a personal backstory: 
The journey of a starry-eyed young scientist entering the field of science, and continuing with the scientist working hard and hopefully contributing to the field over thirty years, but during all this time gradually realizing that the entire system suffers from major problems. And now that same scientist—not so young anymore, unfortunately—has written a book concluding that about 30 percent of the papers that come out every year are fake garbage and that 70 to 90 percent of the published scientific literature is not reproducible. 
That is a startling statement, but Szabo speaks with authority, as one who has always loved science, and has had a long and distinguished career in biomedical research both in the US and in Europe. It is clear that he does not want to attack science: as he points out, it is the only game in town. But he is dismayed at how the scientific method has been degraded, and he is concerned that nobody in power is taking responsibility for cleaning it up. 

What makes the book unlike any other on this topic is the detailed account of how we got into the current state, and of the barriers to remedying the situation. Szabo spends some time explaining how hypercompetition for research grants drives the behaviour of researchers. Although it is customary to talk of "publish or perish", Szabo argues that the real crunch point for a biomedical scientist is success in obtaining external grants. In the USA, this usually means NIH funding in the form of an R01 grant, typically around $1 million over a 4-5 year period. While that sounds like a lot of money, it has to cover 50% of the salary of the principal investigator as well as other salaries and research supplies, which means it is not enough to support a research group. The success rate is around 20%, and researchers typically need to submit numerous proposals in order to survive. The institutions want their staff to obtain grants, not just so they can bathe in the reflected glory of impressive research results, but also because grants bring overheads in the form of indirect costs. So getting grants is extremely high-stakes.
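To get a feel for the arithmetic, here is a rough back-of-the-envelope sketch (my own illustration, using the approximate figures quoted above rather than anything taken from the book): at a 20% success rate a researcher needs, on average, five submissions for every funded grant, and even after five attempts there is roughly a one-in-three chance of still being unfunded.

```python
# Back-of-the-envelope illustration of the grant arithmetic above.
# All numbers are illustrative assumptions, not data from the book.

award_total = 1_000_000      # typical R01 award in dollars
award_years = 5              # typical project period
success_rate = 0.20          # approximate proportion of proposals funded

annual_budget = award_total / award_years
expected_submissions = 1 / success_rate   # mean of a geometric distribution

print(f"Annual budget per R01: ${annual_budget:,.0f}")
print(f"Expected submissions per funded grant: {expected_submissions:.0f}")

# Chance of at least one success after n independent submissions
for n in (1, 3, 5, 10):
    p_funded = 1 - (1 - success_rate) ** n
    print(f"P(at least one award after {n} submissions) = {p_funded:.2f}")
```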

It gets even more interesting when Szabo documents his experiences as a grant reviewer for NIH study sections. We might hope that the most successful proposals are the ones that are realistic, avoid hype, and carefully document the reliability of their methods. Alas, this is the opposite of what happens. Such is the pressure for novelty and impact that anyone who proposed to replicate a prior finding would be quickly triaged out of the competition. I've noted similar tendencies in the UK context. Thus, the message researchers get from institutions is: if you want to keep your job, get a grant. And the message they get from funders is: if you want to get a grant, concentrate on making the research look exciting.

Szabo next moves on to discuss the way science is done in the lab. Numerous factors conspire to make findings irreproducible. Some of these are related to the inherent variability of biological systems, but some arise from a failure to adopt experimental designs that adequately control for bias. Doing things meticulously takes time, and the pressure is to get results out fast. Furthermore, many studies involve a mixture of complicated methods, and the principal investigator may not understand all of them. When it comes to analysing the results, there is huge scope for adopting methods such as post-hoc outlier exclusion, p-hacking, and HARKing (hypothesising after the results are known). All of these can be used to squeeze positive findings out of an unpromising dataset. If biomedical science is like psychology, many researchers regard such methods as normative and are unaware just how much they contribute to the lack of reproducibility of published results. 
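To see why such flexible analysis choices matter, here is a minimal toy simulation (my own sketch, not anything from the book): there is no true difference between the groups, but if the analyst can pick among several outcome measures and report whichever gives the smallest p-value, the false-positive rate rises well above the nominal 5%.

```python
# Toy simulation of p-hacking: no true group difference exists, but
# testing several outcome measures and reporting only the smallest
# p-value inflates the false-positive rate well beyond the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2025)
n_experiments = 5000   # simulated studies
n_per_group = 20       # sample size per group
n_outcomes = 5         # outcome measures the analyst can choose among

false_positives = 0
for _ in range(n_experiments):
    best_p = 1.0
    for _ in range(n_outcomes):
        a = rng.normal(0, 1, n_per_group)   # group A: no true effect
        b = rng.normal(0, 1, n_per_group)   # group B: same distribution
        _, p = stats.ttest_ind(a, b)
        best_p = min(best_p, p)
    if best_p < 0.05:
        false_positives += 1

print("Nominal alpha: 0.05")
print(f"False-positive rate with outcome switching: "
      f"{false_positives / n_experiments:.2f}")   # roughly 0.23
```

With five outcomes to choose from, roughly a quarter of null experiments yield a "significant" result, which is exactly how an unpromising dataset gets turned into a positive finding.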

The next chapter takes a darker turn, moving to intentional fraud. A number of high-profile cases are reviewed, as well as the industrial-scale fraudulent operations run by so-called paper mills. This chapter is particularly depressing as Szabo takes us through the analogies that have been used to characterise fraud.  Initially, fraud was seen as rare, due to a few "bad apples"; later it was compared to an iceberg of fraudulent work, where we only see the tip but need to be aware that much is hidden. Szabo writes: 
In my view, even this analogy is severely misleading. If we want to stay with nature analogies, my feeling is that we are dealing with a big scientific swamp, with various swamp creatures of different sizes and shapes living in it. There are some relatively clean areas of water, too, and there are regular life forms as well. But there are an awful lot of swamp creatures who happily coexist in their natural environment, taking away food and resources from the regular life forms. In addition, a whole ecosystem built around the swamp is benefiting from it. The people who are supposed to manage the swamp, or perhaps drain it, are nowhere to be found. 
One group of people who might be expected to manage the swamp are those who publish research papers, but Szabo does not find them equal to the task, instead talking of "A broken scientific publishing system". Regular readers of this blog will be well-acquainted with the phenomenon whereby someone reports an obvious problem with a published paper, only to be ignored. Academic publishers are making efforts to screen new submissions for plagiarism and image manipulation, but there seems little appetite for cleaning up the existing body of scientific literature. Until that is done, we cannot regard it as a foundation for future work. 

Szabo is impressed by the efforts of "data sleuths", who perform post-publication peer review and report problems on the PubPeer website, but he regards this as unsustainable: cleaning up the literature should not be a task for volunteers. It seems that everyone wants someone to "do something" to fix the problem, but nobody takes it on. Organisations with some responsibility include universities, research institutes, publishers, editors and funders. Szabo's recommendations for change focus on funders, who have the power to deny funding to those who fail to take steps to ensure that their results are reliable. And ultimately, the money for research comes from taxpayers, so governments call the shots. 

This is a particularly difficult time to be conveying such a message. The only people who might be overjoyed to hear that a high proportion of published research is unreliable are politicians who are antagonistic to science and would like an excuse to defund it. In the USA, cuts to funding have been so fast and so deep that many are fearful that the science base may not recover. The swamp creatures may die, but so will the regular life forms. We therefore urgently need to look seriously at recommendations for changing how the system works at all levels - laboratory practice, funding, institutional integrity investigations, publishing, incentive structures - so that we can not only have confidence in scientific findings, but also defend science against attacks. 

I don't agree with all of Szabo's recommendations, but it is refreshing to have someone take a deep dive into the topic, and his ideas form a good basis for discussion. One point where I take a different approach concerns the emphasis on replications. Many people argue that more funding should be directed towards replicating prior studies. In the short term, that will be needed, because the way research has been done means we don't know which findings are solid. But in many areas it takes a large amount of time and money to replicate a study. The whole point of the statistical and experimental methods used in science is that they should allow us to assign a level of confidence to our findings without needing to perform an explicit replication. The problem is that we have misapplied those methods. The Registered Reports approach, where a study is evaluated by reviewers and accepted or rejected by a journal on the basis of the introduction, methods and analysis plan before any data are gathered, gets rid of the biases due to p-hacking, HARKing and publication bias, making it possible to interpret statistics sensibly. It also leads to improved methods overall, because independent reviewers offer feedback at a point when it can be helpful. As far as I know, the Registered Reports model has not been adopted in biomedicine, but it could, I think, transform the field and make it more rigorous. 

Finally, I'm pleased to say that despite the initial rejection, I eventually managed to secure funding for that meeting on research fraud, which will be held in Oxford from 7th-9th April 2025 and will provide a great opportunity for discussing these issues. Registration is open for a few more days, so please consider attending (online or in person) if you'd like to take part. More details and registration form here: (turn off VPN if it does not load).