Sunday, 19 November 2023

Defence against the dark arts: a proposal for a new MSc course

 


Since I retired, an increasing amount of my time has been taken up with investigating scientific fraud. In recent months, I've become convinced of two things: first, fraud is a far more serious problem than most scientists recognise, and second, we cannot continue to leave the task of tackling it to volunteer sleuths. 

If you ask a typical scientist about fraud, they will usually tell you it is extremely rare, and that it would be a mistake to damage confidence in science because of the activities of a few unprincipled individuals. Asked to name fraudsters they may, depending on their age and discipline, mention Paolo Macchiarini, John Darsee, Elizabeth Holmes or Diederik Stapel, all high-profile, successful individuals, who were brought down when unambiguous evidence of fraud was uncovered. Fraud has been around for years, as documented in an excellent book by Horace Judson (2004), and yet, we are reassured, science is self-correcting, and has prospered despite the activities of the occasional "bad apple". The problem with this argument is that, on the one hand, we only know about the fraudsters who get caught, and on the other hand, science is not prospering particularly well - numerous published papers produce results that fail to replicate and major discoveries are few and far between (Harris, 2017). We are swamped with scientific publications, but it is increasingly hard to distinguish the signal from the noise. In my view, it is getting to the point where in many fields it is impossible to build a cumulative science, because we lack a solid foundation of trustworthy findings. And it's getting worse and worse.

My gloomy prognosis is partly engendered by a consideration of a very different kind of fraud: the academic paper mill. In contrast to the lone fraudulent scientist who fakes data to achieve career advancement, the paper mill is an industrial-scale operation, where vast numbers of fraudulent papers are generated, and placed in peer-reviewed journals with authorship slots being sold to willing customers. This process is facilitated in some cases by publishers who encourage special issues, which are then taken over by "guest editors" who work for a paper mill. Some paper mill products are very hard to detect: they may be created from a convincing template with just a few details altered to make the article original. Others are incoherent nonsense, with spectacularly strange prose emerging when "tortured phrases" are inserted to evade plagiarism detectors.

You may wonder whether it matters if a proportion of the published literature is nonsense: surely any credible scientist will just ignore such material? Unfortunately, it's not so simple. First, it is likely that the paper mill products that are detected are just the tip of the iceberg - a clever fraudster will modify their methods to evade detection. Second, many fields of science attempt to synthesise findings using big data approaches, automatically combing the literature for studies with specific keywords and then creating databases, e.g. of genotypes and phenotypes. If these contain a large proportion of fictional findings, then attempts to use these databases to generate new knowledge will be frustrated. Similarly, in clinical areas, there is growing concern that systematic reviews that are supposed to synthesise evidence to get at the truth instead lead to confusion because a high proportion of studies are fraudulent. A third and more indirect negative consequence of the explosion in published fraud is that those who have committed fraud can rise to positions of influence and eminence on the back of their misdeeds. They may become editors, with the power to publish further fraudulent papers in return for money, and if promoted to professorships they will train a whole new generation of fraudsters, while being careful to sideline any honest young scientists who want to do things properly. I fear in some institutions this has already happened.

To date, the response of the scientific establishment has been wholly inadequate. There is little attempt to proactively check for fraud: science is still regarded as a gentlemanly pursuit where we should assume everyone has honourable intentions. Even when evidence of misconduct is strong, it can take months or years for a paper to be retracted. As whistleblower Raphaël Lévy asked on his blog: Is it somebody else's problem to correct the scientific literature? There is dawning awareness that our methods for hiring and promotion might encourage misconduct, but getting institutions to change is a very slow business, not least because those in positions of power succeeded in the current system, and so think it must be optimal.

The task of unmasking fraud is largely left to hobbyists and volunteers, a self-styled army of "data sleuths", who are mostly motivated by anger at seeing science corrupted and the bad guys getting away with it. They have developed expertise in spotting certain kinds of fraud, such as image manipulation and improbable patterns in data, and they have also uncovered webs of bad actors who have infiltrated many corners of science. One might imagine that the scientific establishment would be grateful that someone is doing this work, but the usual response to a sleuth who finds evidence of malpractice is to ignore them, brush the evidence under the carpet, or accuse them of vexatious behaviour. Publishers and academic institutions are both at fault in this regard.

If I'm right, this relaxed attitude to the fraud epidemic is a disaster-in-waiting. There are a number of things that need to be done urgently. One is to change research culture so that rewards go to those whose work is characterised by openness and integrity, rather than those who get large grants and flashy publications. Another is for publishers to act far more promptly to investigate complaints of malpractice and issue retractions where appropriate. Both of these things are beginning to happen, slowly. But there is a third measure that I think should be taken as soon as possible, and that is to train a generation of researchers in fraud busting. We owe a huge debt of gratitude to the data sleuths, but the scale of the problem is such that we need the equivalent of a police force rather than a volunteer band. Here are some of the topics that an MSc course could cover:

  • How to spot dodgy datasets
  • How to spot manipulated figures
  • Textual characteristics of fraudulent articles
  • Checking scientific credentials
  • Checking publisher credentials/identifying predatory publishers
  • How to raise a complaint when fraud is suspected
  • How to protect yourself from legal attacks
  • Cognitive processes that lead individuals to commit fraud
  • Institutional practices that create perverse incentives
  • The other side of the coin: "Merchants of doubt" whose goal is to discredit science

I'm sure there's much more that could be added and would be glad of suggestions. 
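
To give a flavour of what the first topic might involve, here is a minimal sketch of a GRIM-style check (granularity-related inconsistency of means), one of the tools used by data sleuths. It asks whether a reported mean is arithmetically possible given the sample size, assuming the underlying responses are whole numbers. The function name and its parameters are illustrative assumptions, not taken from any published package.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_UP


def grim_consistent(reported_mean: str, n: int) -> bool:
    """Return True if `reported_mean` could be the mean of `n` whole numbers.

    Hypothetical helper for illustration. The mean is passed as a string so
    that the number of reported decimal places (e.g. "3.50") is preserved.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    mean = Decimal(reported_mean)
    decimals = max(0, -mean.as_tuple().exponent)
    quantum = Decimal(1).scaleb(-decimals)  # e.g. 0.01 for two decimals

    # Only integer totals close to mean * n can possibly round to the mean.
    lowest = int((mean * n).to_integral_value(rounding=ROUND_FLOOR)) - 1
    for total in range(lowest, lowest + 4):
        exact = Decimal(total) / Decimal(n)
        # Accept either convention for rounding exact halves, to be lenient.
        for mode in (ROUND_HALF_UP, ROUND_HALF_DOWN):
            if exact.quantize(quantum, rounding=mode) == mean:
                return True
    return False


print(grim_consistent("5.18", 28))  # True: 145 / 28 = 5.1786, reported as 5.18
print(grim_consistent("5.19", 28))  # False: no whole-number total gives 5.19
```

A failed check of this kind is not proof of fraud: a typo, unreported exclusions or non-integer data could produce the same pattern, so it is best treated as a prompt to ask questions.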

Now, of course, the question is what could you do with such a qualification. If my predictions are right, then individuals with such expertise will increasingly be in demand in academic institutions and publishing houses, to help ensure the integrity of work they produce and publish. I also hope that there will be growing recognition of the need for more formal structures to be set up to investigate scientific fraud and take action when it is discovered: graduates of such a course would be exactly the kind of employees needed in such an organisation.

It might be argued that this is a hopeless endeavour. In Harry Potter and the Half-Blood Prince (Rowling, 2005) Professor Snape tells his pupils:

"The Dark Arts are many, varied, ever-changing, and eternal. Fighting them is like fighting a many-headed monster, which, each time a neck is severed, sprouts a head even fiercer and cleverer than before. You are fighting that which is unfixed, mutating, indestructible."

This is a pretty accurate description of what is involved in tackling scientific fraud. But Snape does not therefore conclude that action is pointless. On the contrary, he says: 

"Your defences must therefore be as flexible and inventive as the arts you seek to undo."

I would argue that any university that wants to be ahead of the field in this enterprise could show flexibility and inventiveness in starting up a postgraduate course to train the next generation of fraud-busting wizards. 

Bibliography

Bishop, D. V. M. (2023). Red flags for papermills need to go beyond the level of individual articles: A case study of Hindawi special issues. https://osf.io/preprints/psyarxiv/6mbgv
Boughton, S. L., Wilkinson, J., & Bero, L. (2021). When beauty is but skin deep: Dealing with problematic studies in systematic reviews. Cochrane Database of Systematic Reviews, 5. Retrieved 4 June 2021, from https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.ED000152/full
Byrne, J. A., & Christopher, J. (2020). Digital magic, or the dark arts of the 21st century—How can journals and peer reviewers detect manuscripts and publications from paper mills? FEBS Letters, 594(4), 583–589. https://doi.org/10.1002/1873-3468.13747
Cabanac, G., Labbé, C., & Magazinov, A. (2021). Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals (arXiv:2107.06751). arXiv. https://doi.org/10.48550/arXiv.2107.06751
Carreyrou, J. (2019). Bad Blood: Secrets and Lies in a Silicon Valley Startup. Pan Macmillan.
COPE & STM. (2022). Paper mills: Research report from COPE & STM. Committee on Publication Ethics and STM. https://doi.org/10.24318/jtbG8IHL 
Culliton, B. J. (1983). Coping with fraud: The Darsee Case. Science (New York, N.Y.), 220(4592), 31–35. https://doi.org/10.1126/science.6828878 
Grey, S., & Bolland, M. (2022, August 18). Guest Post—Who Cares About Publication Integrity? The Scholarly Kitchen. https://scholarlykitchen.sspnet.org/2022/08/18/guest-post-who-cares-about-publication-integrity/ 
Hanson, M., Gómez Barreiro, P., Crosetto, P., & Brockington, D. (2023). The strain on scientific publishing (arXiv:2309.15884). arXiv. https://arxiv.org/ftp/arxiv/papers/2309/2309.15884.pdf
Harris, R. (2017). Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions (1st edition). Basic Books.
Judson, H. F. (2004). The Great Betrayal. Orlando.
Lévy, R. (2022, December 15). Is it somebody else’s problem to correct the scientific literature? Rapha-z-Lab. https://raphazlab.wordpress.com/2022/12/15/is-it-somebody-elses-problem-to-correct-the-scientific-literature/
Moher, D., Bouter, L., Kleinert, S., Glasziou, P., Sham, M. H., Barbour, V., Coriat, A.-M., Foeger, N., & Dirnagl, U. (2020). The Hong Kong Principles for assessing researchers: Fostering research integrity. PLOS Biology, 18(7), e3000737. https://doi.org/10.1371/journal.pbio.3000737
Oreskes, N., & Conway, E. M. (2010). Merchants of Doubt: How a handful of scientists obscured the truth on issues from tobacco smoke to global warming. Bloomsbury Press.
Paterlini, M. (2023). Paolo Macchiarini: Disgraced surgeon is sentenced to 30 months in prison. BMJ, 381, p1442. https://doi.org/10.1136/bmj.p1442
Rowling, J. K. (2005). Harry Potter and the Half-Blood Prince. Bloomsbury, London. ISBN: 9780747581086
Smith, R. (2021, July 5). Time to assume that health research is fraudulent until proven otherwise? The BMJ. https://blogs.bmj.com/bmj/2021/07/05/time-to-assume-that-health-research-is-fraudulent-until-proved-otherwise/
Stapel, D. (2016). Faking science: A true story of academic fraud. Translated by Nicholas J. Brown. http://nick.brown.free.fr/stapel
Stroebe, W., Postmes, T., & Spears, R. (2012). Scientific misconduct and the myth of self-correction in science. Perspectives on Psychological Science, 7(6), 670–688. https://doi.org/10.1177/1745691612460687
 


9 comments:

  1. You rightly flag up big data approaches to synthesise findings, something that might be used to find new drug targets or whatever in cancer research for example. So... should your course also have at least a taster of machine learning (AI) awareness or something, so that students know what's going on?

    Replies
    1. This is Dorothy Bishop replying - having trouble signing in! I would anticipate that people interested in this course would come from a range of backgrounds, and would tend to be people with good data processing skills and include some people with expertise in AI. Yes, you can't really understand problems without knowing underlying methods and we know there are issues with reproducibility of ML research, for instance. But another concern is with the individual studies that get into big data datasets, which may use quite basic statistical methods. Another concern is use of AI language to obfuscate: type "gobbledegook sandwich" into the PubPeer search box and you will see what I mean.

  2. I'm charmed by this idea, and once upon a time I might have signed up for such a course. But I note that teachers of this art in the original series tended to meet rather unfortunate ends, and I worry that in this your metaphor is a little bit too apt. As you note, science as a whole doesn't currently have great incentives in place for ensuring that what is published is true. Cheaters get tenure at Harvard; fraud sleuths get sued; I don't think this is an accident. I don't think universities (and certainly not publishing houses) would have reason to change the arrangement until not doing so becomes prohibitively expensive. We'd need an adversary. A William Proxmire for the twenty-first century? It makes me squirm to think of rooting for that.

  3. I'm curious about the immediate, low-hanging techniques to assess research fraud. Serendipitously, I've been using plagiarism detectors, GRIM/SPRITE/DEBIT, statcheck, and related tools as part of a broader project to create a peer review guide.

    For your first three bullet points, I've found the latest LLMs (e.g. ChatGPT) to be quite helpful. Maybe there are ways to integrate them into peer review software.

    Replies
    1. Hi, I'm a creator of SciScore, an AI tool that works with peer review to flag mainly errors of omission for rigor and reproducibility items in manuscripts. I would love to see more tools developed and used in the manuscript review process. Although it was difficult to integrate our tools with EM and eJP, Scholar One has been impossible. For peer review tools to be functional we really need a way to get past these gatekeepers. Very few toolmakers that create cool tools are allowed to offer them to the journals.

  4. I like the idea! But if we just teach our students the mechanics of fraud detection we run the risk of making them better cheaters. I had this epiphany while teaching about p-hacking in my statistics class. We also need to instill in them what Richard Feynman referred to as "a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty—a kind of leaning over backwards." I know that this should go without saying, but I think it also needs to be said . . . repeatedly. We need to teach them the difference between "getting a result" in the short term and "getting it right" in the long run. My current idea is to have them read Feynman's "Cargo cult science" speech (the source of the above quote) and then discuss it in class. I'd be curious to hear others.

  5. At least every researcher disclosing misconduct related to his own results should try to do something, though this looks to be a full-time job... An example: http://cristal.org/Mesli-et-al.pdf

  6. Another factor in play here is what appears to be a decline in the rigour (and hence quality) of peer review. Without incentive to devote much time for reviewing submissions (better off to write up your own work) the tendency is for reviewing to become superficial and sometimes just a quick 'wave it through'. (This of course is not restricted to detecting fraud, which requires specialist skills or dogged devotion.)

    Geoff Hammond

  7. An excellent idea! Perhaps someone who has the clout (like a Prof) should organise an international conference on methods for the detection of scientific fraud. In clinical trials one reason for choosing a set of core outcomes might be to form a set within which the pattern of relationships of variables, i.e. the 'face' of such multivariate data, is hard to convincingly simulate or forge. The contributions and edited comments from such a conference could be published as a resource for such courses.
