Wednesday, 12 October 2022

What is going on in Hindawi special issues?

A guest blogpost by Nick Wise 

 http://www.eng.cam.ac.uk/profiles/nhw24


The Hindawi journal Wireless Communications and Mobile Computing is booming. Until a few years ago it published 100-200 papers a year, but it published 269 papers in 2019, 368 in 2020 and 1,212 in 2021. So far in 2022 it has published 2,429. This growth has been achieved primarily by the creation of special issues, which makes sense: it would be nearly impossible for a journal to increase its publication rate by an order of magnitude in 2 years without outsourcing the massive increase in workload to guest editors.

Recent special issues include ‘Machine Learning Enabled Signal Processing Techniques for Large Scale 5G and 5G Networks’ (182 articles), ‘Explorations in Pattern Recognition and Computer Vision for Industry 4.0’ (244) and ‘Fusion of Big Data Analytics, Machine Learning and Optimization Algorithms for Internet of Things’ (204). Each of these special issues contains as many papers as the journal published in a year until recently. They also contain many papers that are flagged on Pubpeer for irrelevant citations, tortured phrases and surprising choices of corresponding email addresses.

However, I am going to focus on one special issue that is still open for submissions, and so far contains a modest 62 papers: ‘AI-Driven Wireless Energy Harvesting in Massive IoT for 5G and Beyond’, edited by Hamurabi Gamboa Rosales, Danijela Milosevic and Dijana Capeska Bogatinoska. Given the title of the special issue, it is perhaps surprising that only two of the articles contain ‘wireless’ in the title and none contain ‘energy’. The authors of the other papers (or whoever submitted them) appear to have realised that as long as they included the buzzwords ‘AI’, ‘IoT’ (Internet of Things) or ‘5G’ in the title, the paper could be about anything at all. Hence, the special issue contains titles such as:

  • Analysis Model of the Guiding Role of National Sportsmanship on the Consumer Market of Table Tennis and Related IoT Applications 
  • Evaluation Method of the Metacognitive Ability of Chinese Reading Teaching for Junior Middle School Students Based on Dijkstra Algorithm and IoT Applications 
  • The Construction of Shared Wisdom Teaching Practice through IoT Based on the Perspective of Industry-Education Integration

Of the 62 papers, 60 give Hamurabi Gamboa Rosales as the academic editor and 2 give Danijela Milosevic. Why is the distribution of labour so lopsided? One can imagine an arrangement where the lead editor does the admin of waving through irrelevant papers and the other 2 guest editors get to say that they’ve guest-edited a special issue on their CV.

Of course, in addition to boosting publication numbers for the authors and providing CV points for the guest editors, every paper in the special issue has a references section. Each reference gives someone a citation, another academic brownie point on which careers can be built. An anonymous Pubpeer sleuth has trawled through the references section of every paper in this special issue and found that Malik Bader Alazzam of Amman Arab University in Jordan has been cited 139 times across the 62 papers. The chance that the authors of almost every article would independently decide to cite the same person seems small.

The most intriguing fact about the papers in the special issue, however, is that only 4 authors give corresponding email addresses that match their affiliation. These 4 include the only 3 papers with non-Chinese authors. Of the other 58, 1 uses an email address from Guangzhou University, 6 use email addresses from Changzhou University, and 51 use email addresses from Ma’anshan University. All of the Ma’anshan addresses are of the form 1940XXXX@masu.edu.cn and many are nearly sequential, suggesting that someone somewhere purchased a block of sequential email addresses (you do not need to be at Ma’anshan University to have an @masu email address). The screenshot below shows a sample (the full dataset is linked here).

A subset of the titles from the special issue with their corresponding email addresses, all of the form 1940XXXX@masu.edu.cn
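For anyone who wants to screen a set of corresponding addresses for this kind of clustering, here is a minimal sketch in Python. It is my own illustration, not the analysis behind this post: the function name, the gap threshold and the sample addresses are all made up, though the format follows the one reported above.

```python
# Minimal sketch: group numeric email prefixes on one domain into
# near-sequential runs, a crude signal of a purchased block of addresses.
import re
from typing import List

def sequential_clusters(emails: List[str], domain: str = "masu.edu.cn",
                        max_gap: int = 5) -> List[List[int]]:
    """Return runs of numeric local parts for `domain` whose neighbours
    differ by at most `max_gap`."""
    ids = sorted(int(m.group(1)) for e in emails
                 if (m := re.fullmatch(r"(\d+)@" + re.escape(domain),
                                       e.strip().lower())))
    clusters, current = [], []
    for n in ids:
        if current and n - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(n)
    if current:
        clusters.append(current)
    return clusters

# Made-up addresses in the reported 1940XXXX@masu.edu.cn format:
sample = ["19401234@masu.edu.cn", "19401236@masu.edu.cn",
          "19401237@masu.edu.cn", "19407777@masu.edu.cn"]
print(sequential_clusters(sample))
# [[19401234, 19401236, 19401237], [19407777]]
```

Run over the real list of corresponding addresses, a handful of long runs like the first cluster would be exactly the pattern described above.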

The use and form of the email addresses suggest that all of these papers are the work of a paper mill. It is hard to imagine otherwise how 51 different authors could submit papers to the same special issue using the same institutional email domain and format. Indeed, before 2022 only 2 papers had ever used @masu.edu.cn as a corresponding address according to Dimensions. It is equally hard to imagine how Hamurabi Gamboa Rosales is unaware. How can you not notice that, of the 19 papers you receive for your special issue on the 12th of July, 18 use the same email domain that doesn’t match their affiliation? This may also explain why Hamurabi has dealt with almost all the papers himself. This special issue should be closed for submissions and an investigation begun.

Stepping back from this special issue, it is clear that this is not an isolated problem. There are at least 40 other papers published in Wireless Communications and Mobile Computing with corresponding emails from Ma’anshan, and Dimensions finds there are 46 in Computational Intelligence and Neuroscience, 38 in Computational and Mathematical Methods in Medicine and 30 in Mobile Information Systems, all published in 2022 and all in Hindawi journals. What are the chances that 18404032@masu.edu.cn is used in a special issue in Computational Intelligence and Neuroscience, 18404038@masu.edu.cn in Disease Markers and 18404041@masu.edu.cn in Wireless Communications and Mobile Computing?

Finally, masu.edu.cn is only one example of a commonly used email domain that doesn’t match the author’s affiliation. It is conceivable that the entire growth in publications of Wireless Communications and Mobile Computing, Computational Intelligence and Neuroscience (163 articles in 2020, 3,079 in 2022) and Computational and Mathematical Methods in Medicine (225 in 2020, 1,488 in 2022) is from paper mills publishing in corrupted special issues.


Nick Wise


*All numbers accurate as of the 12th October 2022.

Tuesday, 4 October 2022

A desire for clickbait can harm an academic journal's reputation

 


On 28th September, I woke up, looked at Twitter, and found Pete Etchells fulminating about a piece in the Guardian.  

It was particularly galling for him to read a piece that implied research studies had shown that voice-responsive devices were harming children’s development when he and Amy Orben had provided comments to the Science Media Centre that were available to the journalist. They both noted that: 

a) This was a Viewpoint piece, not new research 

b) Most of the evidence it provided consisted of anecdotes from newspaper articles

I agreed with Pete’s criticism of the Guardian, but having read the original Viewpoint in the Archives of Disease in Childhood, I had another question, namely, why on earth was a reputable paediatrics journal doing a press release on a flimsy opinion piece written by two junior medics with no track record in the area? 

So I wrote to the Editor with my concerns, as follows: 

Dear Dr Brown 

Viewpoint: Effects of smart voice control devices on children: current challenges and future perspectives. DOI: 10.1136/archdischild-2022-323888. Journal: Archives of Disease in Childhood  

I am writing to enquire why this Viewpoint was sent out to the media under embargo as if it was a substantial piece of new research. I can understand that you might want to publish less formal opinion pieces from time to time, but what I cannot understand is the way this was done to attract maximum publicity by the media. 

The two people who commented about it for the Science Media Centre both noted this was an opinion piece with no new evidence, relying mainly on media reports. 

https://www.sciencemediacentre.org/expert-reaction-to-an-opinion-piece-on-voice-controlled-devices-and-child-development/ 

Unfortunately, despite this warning, it has been picked up by the mainstream media, where it is presented as ‘new research’, which will no doubt give parents of young children something new to worry about. 

I checked out the authors, and found these details: 

https://orcid.org/0000-0003-4881-8293 

https://www.researchgate.net/profile/Ananya-Arora-3 

These confirm that neither has a strong research track record, or any evidence of expertise in the topic of the Viewpoint. I can only assume that ADC is desperate for publicity at any cost, regardless of scientific evidence or impact on the public. 

As an Honorary Fellow of the Royal College of Paediatrics and Child Health, and someone who has previously published in ADC, I am very disappointed to see the journal sink so low. 

Yesterday I got a reply that did nothing to address my concerns. Here’s what the editor, Nick Brown*, said (in italic), with my reactions added: 

Thank you for making contact . My response reflects the thoughts of both the BMJ media and publication departments  

Given my reactions, below, this is more worrying than reassuring. It would be preferable to have heard that there had been some debate as to the wisdom of promoting this article to the press. 

It is a key role of a scientific journal to raise awareness of, and stimulate debate on, live and emerging issues. Voice control devices are becoming increasingly common and their impact on children's development is a legitimate topic of discussion.  

I have no quarrel with the idea that impact of voice control devices on children is a legitimate topic for the journal. But I wonder about how far its role is ‘raising awareness of, and stimulating debate’ when the topic is one on which we have very little evidence. A scientific journal might be expected to provide a balanced account of evidence, whereas the Viewpoint presented one side of the ‘debate’, mainly using anecdotes. I doubt it would have been published if it had concluded that there was no negative impact of voice control devices.  

Opinion pieces are part of a very wide range of content that is selected for press release from among BMJ's portfolio of journals. They are subject to internal review in line with BMJ journals' overall editorial policy: the process (intentionally) doesn't discriminate against authors who don't have a strong research track record in a particular field  

I’ve been checking up on how frequently ADC promotes an article for press release. This information can be obtained here. This year, they have published 219 papers, of which only three other articles have merited a press release: an analysis of survey data on weight loss (July), a research definition of Long Covid in children (February) and a data-based analysis of promotional claims about baby food (February). Many of the papers that were not press-released are highly topical and of general interest – a quick scan found papers on vaping, monkeypox, transgender adolescents, unaccompanied minors as asylum seekers, as well as many papers relating to Covid. It’s frankly baffling that a weakly evidenced Viewpoint was singled out for the special treatment of a press release. 

As for the press release pathway itself, all potential pieces are sent out under embargo, irrespective of article type. This maximises the chances of balanced coverage: an embargo period enables journalists to contact the authors with any queries and to contact other relevant parties for comment. 

My wording may have been clumsy here and led to misunderstanding. My concern was more with the fact that the paper was press-released, which is, as established above, highly unusual, rather than with the embargo.  

The press release clearly stated (3 times) this article was a viewpoint and not new research, and that it hadn't been externally peer reviewed. We also always include a direct URL link to the article in question in our press releases so that journalists can read the content in full for themselves. 

I agree that the press release included these details, and indeed, had journalists consulted the Science Media Centre’s commentaries, the lack of peer review and data would have been evident. But nevertheless, it’s well-known that (a) journalists seldom read original sources, and (b) some of the less reputable newspapers are looking for clickbait, so why provide them with the opportunity for sensationalising journal content?

While we do all we can to ensure that journalists cover our content responsibly, we aren't responsible for the manner in which they choose to do so. 

I agree that part of the blame for the media coverage lies with journalists. But I think the journal must bear some responsibility for the media uptake of the article. It’s a reasonable assumption that if a reputable journal issues a press release, it’s because the article in question is important and provides novel information from recognised experts in the field. It is unfortunate that this assumption was not justified in this case. 

I just checked to see how far the media interest in the story had developed. The Guardian, confronted with criticism, changed the lede to say “Researchers suggest”, rather than “New research says”, but the genie was well out of the bottle by that time. The paper has an Altmetric ‘attention’ score of 1577, and has been picked up by 209 news outlets. There’s no indication that the article has “stimulated debate”. Rather, it has been interpreted as providing a warning about a new danger facing children. The headlines, which can be found here, are variants of: 

 “Alexa and Siri make children rude” 

“Siri, Alexa and Google Home could hinder children’s social and cognitive development” 

“Voice-control devices may have an impact on children’s social, emotional development: Study” 

“According to a study, voice-controlled electronic aides can impair children’s development” 

“Experts warn that AI assistants affect children’s social development” 

“Experts warn AI assistants affect social growth of children” 

“Why Alexa and Siri may damage kids’ social and emotional development” 

“Voice assistants harmful for your child’s development, claims study” 

“Alexa, Siri, and Other Voice Assistants could negatively rewire your child’s brain” 

“Experts warn using Alexa and Siri may be bad for children” 

“Parents issued stark warning over kids using Amazon’s Alexa” 

“Are Alexa and Siri making our children DUMB?” 

“Use of voice-controlled devices ‘might have long-term consequences for children’” 

And most alarmingly, from the Sun: 

“Urgent Amazon Alexa warning for ALL parents as new danger revealed” 

Maybe the journal’s press office regards that as a success. I think it’s a disaster for its reputation as a serious academic journal. 

 

*Not the sleuth Nick Brown. Another one. 

Friday, 30 September 2022

Reviewer-finding algorithms: the dangers for peer review

 


Last week many words were written for Peer Review Week, so you might wonder whether there is anything left to say. I may have missed something, but I think I do have a novel take on this, namely to point out that some recent developments in automation may be making journals vulnerable to fake peer review. 

Finding peer reviewers is notoriously difficult these days. Editors are confronted with a barrage of submissions, many outside their area. They can ask authors to recommend peer reviewers, but this raises concerns about malpractice if authors recommend friends, or even individuals tied up with paper mills who might write a positive review in return for payment.

One way forward is to harness the power of big data to identify researchers who have a track record of publishing in a given area. Many publishers now use such systems. This way a journal editor can select from a database of potential reviewers that is formed by identifying papers with some overlap with a given submission.
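To make the idea concrete, here is a minimal sketch of the kind of matching such a system might do. It is my own illustration in Python, with invented names and abstracts, and is not any particular publisher's algorithm: represent the submission and a corpus of indexed papers as TF-IDF vectors, rank the papers by cosine similarity, and surface their authors as candidate reviewers.

```python
# Toy reviewer-matching sketch: TF-IDF + cosine similarity over abstracts.
# Authors, abstracts and the resulting scores are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [  # hypothetical indexed papers: (author, abstract)
    ("Dr A", "wireless energy harvesting for low-power IoT sensor networks"),
    ("Dr B", "deep learning for medical image segmentation"),
    ("Dr C", "5G network slicing and energy-efficient resource allocation"),
]
submission = "AI-driven wireless energy harvesting in massive IoT networks"

vectoriser = TfidfVectorizer(stop_words="english")
matrix = vectoriser.fit_transform([submission] + [abstract for _, abstract in corpus])
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

# Rank candidate reviewers by topical overlap with the submission.
for (author, _), score in sorted(zip(corpus, scores), key=lambda pair: -pair[1]):
    print(f"{author}: {score:.2f}")
```

The obvious weakness, and the point of what follows, is that the only input is a publication record: an author whose record has been inflated by a paper mill looks exactly like a genuine expert to this kind of matching.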

I have become increasingly concerned, however, that use of algorithmically-based systems might leave a journal vulnerable to fraudulent peer reviewers who have accumulated publications by using paper mills. I became interested in this when submitting work to Wellcome Open Research and F1000, where open peer review is used, but it is the author rather than an editor who selects reviewers. Clearly, with such a system, one needs to be careful to avoid malpractice, and strict criteria are imposed. As explained here,  reviewers need to be:
  1. Qualified: typically hold a doctorate (PhD/MD/MBBS or equivalent). 
  2. Expert: have published at least three articles as lead author in a relevant topic, with at least one article having been published in the last five years. 
  3. Impartial: No competing interests and no co-authorship or institutional overlap with current authors. 
  4. Global: geographically diverse and from different institutions. 
  5. Diverse: in terms of gender, geographic location and career stage

Unfortunately, now that we have paper mills, which allow authors, for a fee, to generate and publish a large number of fake papers, these criteria are inadequate. Consider the case of Mohammed Sahab Uddin, who features in this account in Retraction Watch. As far as I am aware, he does not have a doctorate*, but I suspect people would be unlikely to query the qualifications of someone who had 137 publications and an H-index of 37. By the criteria above, he would be welcomed as a reviewer from an underrepresented location. And indeed, he was frequently used as a reviewer: Leslie McIntosh, who unmasked Uddin’s deception, noted that before he wiped his Publons profile, he had been listed as a reviewer on 300 papers. 
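To see how easily such a profile satisfies the letter of those criteria, here is a toy sketch of purely record-based screening. It is my own paraphrase of the checklist above, not the platform's actual code, and the numbers are hypothetical.

```python
# Toy screening check based on my paraphrase of the reviewer criteria above.
from dataclasses import dataclass

@dataclass
class Reviewer:
    has_doctorate: bool
    lead_author_papers_in_topic: int
    papers_in_last_five_years: int
    shares_institution_with_authors: bool
    has_competing_interest: bool

def passes_screening(r: Reviewer) -> bool:
    qualified = r.has_doctorate
    expert = r.lead_author_papers_in_topic >= 3 and r.papers_in_last_five_years >= 1
    impartial = not (r.shares_institution_with_authors or r.has_competing_interest)
    return qualified and expert and impartial

# A hypothetical profile built largely on milled papers: on paper it looks
# excellent, and nothing in these checks asks whether the papers are genuine.
milled = Reviewer(has_doctorate=True,
                  lead_author_papers_in_topic=40,
                  papers_in_last_five_years=40,
                  shares_institution_with_authors=False,
                  has_competing_interest=False)
print(passes_screening(milled))   # True
```

The checks screen the record, not the reality behind it.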

This is not an isolated case. We are only now beginning to get to grips with the scale of the problem of paper mills. There are undoubtedly many other cases of individuals who are treated as trusted reviewers on the back of fraudulent publications. Once in positions of influence, they can further distort the publication process. As I noted in last week's blogpost, open peer review offers a degree of defence against this kind of malpractice, as readers will at least be able to evaluate the peer review, but it is disturbing to consider how many dubious authors will have already found themselves promoted to positions of influence based on their apparently impressive track record of publishing, reviewing and even editing.

I started to think about how this might interact with other moves to embrace artificial intelligence. A recent piece in Times Higher Education stated: “Research England has commissioned a study of whether artificial intelligence could be used to predict the quality of research outputs based on analysis of journal abstracts, in a move that could potentially remove the need for peer review from the Research Excellence Framework (REF).” This seems to me to be the natural endpoint of the move away from trusting the human brain in the publication process. We could end up with a system where algorithms write the papers, which are attributed to fake authors, peer reviewed by fake peer reviewers, and ultimately evaluated in the Research Excellence Framework by machines. Such a system is likely to be far more successful than mere mortals, as it will be able to rapidly and flexibly adapt to changing evaluation criteria. At that point, we will have dispensed with the need for human academics altogether and have reached peak academia. 

 *Correction 30/9/22: Leslie McIntosh tells me he does have a doctorate and was working on a postdoc.

Sunday, 11 September 2022

So do we need editors?

It’s been an interesting week in world politics, and I’ve been distracting myself by pondering the role of academic editors. The week kicked off with a rejection of a preprint written with co-author Anna Abalkina, who is an expert sleuth who tracks down academic paper mills – organisations that will sell you a fake publication in an academic journal. Our paper describes a paper mill that had placed six papers in the Journal of Community Psychology, a journal which celebrated its 50th anniversary in 2021. We had expected rejection, as we submitted the paper to the Journal of Community Psychology, as a kind of stress test to see whether the editor, Michael B. Blank, actually reads papers that he accepts for the journal. I had started to wonder, because you can read his decision letters on Publons, and they are identical for every article he accepts. I suspected he may be an instance of Editoris Machina, or automaton, one who just delegates editorial work to an underling, waits until reviewer reports converge on a recommendation, and then accepts or rejects accordingly without actually reading the paper. I was wrong, though. He did read our paper, and rejected it with the comment that it was a superficial analysis of six papers. We immediately posted it as a preprint and plan to publish it elsewhere.

Although I was quite amused by all of this, it has a serious side. As we note in our preprint, when paper mills succeed in breaching the defences of a journal, this is not a victimless crime. First, it gives competitive advantage to the authors who paid the paper mill – they do this in order to have a respectable-looking publication that will help their career. I used to think this was a minor benefit, but when you consider that the paper mills can also ensure that the papers they place are heavily cited, you start to realise that authors can edge ahead on conventional indicators of academic prestige, while their more honest peers trail behind. The second set of victims are those who publish in the journal in good faith. Once its reputation is damaged by the evidence that there is no quality control, then all papers appearing in the journal are tainted by association. The third set of victims are busy academics who are trying to read and integrate the literature, who can get tangled up in the weeds as they try to navigate between useful and useless information. And finally, we need to be concerned about the cynicism induced in the general public when they realise that for some authors and editors, the whole business of academic publishing is a game, which is won not by doing good science, but by paying someone to pretend you have done so.

Earlier this week I shared my thoughts on the importance of ensuring that we have some kind of quality control over journal editors. They are, after all, the gatekeepers of science. When I wrote my taxonomy of journal editors back in 2010, I was already concerned by the number of times I had to deal with editors who were lazy or superficial in their responses to authors. I had not experienced ‘hands off’ editors in the early days of my research career, and I wondered how far this was a systematic change over time, or whether it was related to subject area. In the 1970s and 1980s, I mostly published in journals that dealt with psychology and/or language, and the editors were almost always heavily engaged with the paper, adding their own comments and suggesting how reviewer comments might be addressed. That’s how I understood the job when I myself was an editor. But when I moved to publishing work in journals that were more biological (genetics, neuroscience), things seemed different, and it was not uncommon to find editors who really did nothing more than collate peer reviews.

The next change I experienced was when, as a Wellcome-funded researcher, I started to publish in Wellcome Open Research (WOR), which adopts a very different publication model, based on that initiated by F1000. In this model, there is no academic editor. Instead, the journal employs staff who check that the paper complies with rigorous criteria: the proposed peer reviewers must have a track record of publishing and be free of conflicts of interest. Data and other materials must be openly available so that the work can be reproduced. And the peer review is published. The work is listed on PubMed if and when peer reviewers agree that it meets a quality threshold; otherwise the work remains visible, but with its status shown as not approved by peer review. 

The F1000/WOR model shows that editors are not needed, but I generally prefer to publish in journals that do have academic editors – provided the editor is engaged and does their job properly. My papers have benefitted from input from a wise and experienced editor on many occasions. In a specialist journal, such an editor will also know who are the best reviewers – those who have the expertise to give a detailed and fair appraisal of the work. However, in the absence of an engaged editor, I prefer the F1000/WOR model, where at least everything is transparent. The worst of all possible worlds is when you have an editor who doesn’t do more than collate peer reviews, but where everything is hidden: the outside world cannot know who the editor was, how decisions were made, who did the reviews, and what they said. Sadly, this latter situation seems to be pretty common, especially in the more biological realms of science. To test my intuitions, I ran a little Twitter poll for different disciplines, asking, for instance: 

 
 
 Results are below

% respondents stating Not Read, Read Superficially, or Read in Depth



 

Such polls of course have to be taken with a pinch of salt, as the respondents are self-selected, and the poll allows only very brief questions with no nuance. It is clear that within any one discipline, there is wide variability in editorial engagement. Nevertheless, I find it a matter of concern that in all areas, some respondents had experienced a journal editor who did not appear to have read the paper they had accepted, and in areas of biomedicine, neuroscience, and genetics, and also in mega journals, this was as high as around 25-33%.

So my conclusion is that it is not surprising that we are seeing phenomena like paper mills, because the gatekeepers of the publication process are not doing their job. The solution would be either to change the culture for editors, or, where that is not feasible, to accept that we can do without editors. But if we go down that route, we should move to a model such as F1000 with much greater quality control over reviewers and COI, and much more openness and transparency.

 As usual comments are welcome: if you have trouble getting past comment moderation, please email me.

Tuesday, 6 September 2022

We need to talk about editors


Editoris spivia

The role of journal editor is powerful: you decide what is accepted or rejected for publication. Given that publications count as an academic currency – indeed in some institutions they are literally fungible – a key requirement for editors is that they are people of the utmost integrity. Unfortunately, there are few mechanisms in place to ensure editors are honest – and indeed there is mounting evidence that many are not. I argue here that we can no longer take editorial honesty for granted, and systems need to change to weed out dodgy editors if academic publishing is to survive as a useful way of advancing science. In particular, the phenomenon of paper mills has shone a spotlight on editorial malpractice.

Questionable editorial practices

Back in 2010, I described a taxonomy of journal editors based on my own experience as an author over the years. Some were negligent, others were lordly, and others were paragons – the kind of editor we all want, who is motivated solely by a desire for academic excellence, who uses fair criteria to select which papers are published, who aims to help an author improve their work, and provides feedback in a timely and considerate fashion. My categorisation omitted another variety of editor that I have sadly become acquainted with in the intervening years: the spiv. The spiv has limited interest in academic excellence: he or she sees the role of editor as an opportunity for self-advancement. This usually involves promoting the careers of self or friends by facilitating publication of their papers, often with minimal reviewing, and in some cases may go as far as working hand in glove with paper mills to receive financial rewards for placing fraudulent papers.

When I first discovered a publication ring that involved journal editors scratching one another’s backs, in the form of rapid publication of each other’s papers, I assumed this was a rare phenomenon. After I blogged about this, one of the central editors was replaced, but others remained in post. 

I subsequently found journals where the editor-in-chief authored an unusually high percentage of the articles published in the journal. I drew these to the attention of integrity advisors of the publishers that were involved, but did not get the impression that they regarded this as particularly problematic or were going to take any action about it. Interestingly, there was one editor, George Marcoulides, who featured twice in a list of editors who authored at least 15 articles in their own journal over a five year period. Further evidence that he equates his editorial role with omnipotence came when his name cropped up in connection with a scandal where a reviewer, Fiona Fidler, complained after she found her positive report on a paper had been modified by the editor to justify rejecting the paper: see this Twitter thread for details. It appears that the publishers regard this as acceptable: Marcoulides is still editor-in-chief at the Sage journal Educational and Psychological Measurement, and at Taylor and Francis’ Structural Equation Modeling, though his rate of publishing in both journals has declined since 2019; maybe someone had a word with him to explain that publishing most of your papers in a journal you edit is not a good look.

Scanff et al (2021) did a much bigger investigation of what they termed “self-promotion journals” - those that seemed to be treated as the personal fiefdom of editors, who would use the journal as an outlet for their own work. This followed on from a study by Locher et al (2021), which found editors who were ready to accept papers by a favoured group of colleagues with relatively little scrutiny. This had serious consequences when low-quality studies relating to the Covid-19 pandemic appeared in the literature and subsequently influenced clinical decisions. Editorial laxness appears in this case to have done real harm to public health.

So, it's doubtful that all editors are paragons. And this is hardly surprising: doing a good job as editor is hard and often thankless work. On the positive side, an editor may obtain kudos for being granted an influential academic role, but often there is little or no financial reimbursement for the many hours that must be dedicated to reading and evaluating papers, assigning reviewers, and dealing with fallout from authors who react poorly to having their papers rejected. Even if an editor starts off well, they may over time start to think “What’s in this for me?” and decide to exploit the opportunities for self-advancement offered by the position. The problem is that there seems little pressure to keep them on the straight and narrow; it's like when a police chief is corrupt. Nobody is there to hold them to account. 

Paper mills

Many people are shocked when they read about the phenomenon of academic paper mills – defined in a recent report by the Committee on Publication Ethics (COPE) and the Association of Scientific, Technical and Medical Publishers (STM) as “the process by which manufactured manuscripts are submitted to a journal for a fee on behalf of researchers with the purpose of providing an easy publication for them, or to offer authorship for sale.” The report stated that “the submission of suspected fake research papers, also often associated with fake authorship, is growing and threatens to overwhelm the editorial processes of a significant number of journals.” It concluded with a raft of recommendations to tackle the problem from different fronts: changing the incentives adopted by institutions, investment in tools to detect paper mill publications, education of editors and reviewers to make them aware of paper mills, introduction of protocols to impede paper mills from succeeding, and speeding up the process of retraction by publishers.

However, no attention was given to the possibility that journal editors may contribute to the problem: there is talk of “educating” them to be more aware of paper mills, but this is not going to be effective if the editor is complicit with the paper mill, or so disengaged from editing as to not care about them. 

It’s important to realise that not all paper mill papers are the same. Many generate outputs that look plausible. As Byrne and Labbé (2017) noted, in biomedical genetic studies, fake papers are generated from a template that is based on a legitimate paper, and just vary in terms of the specific genetic sequence and/or phenotype that is studied. There are so many genetic sequences and phenotypes that the number of possible combinations is immense. In such cases, a diligent editor may get tricked into accepting a fake paper, because the signs of fakery are not obvious and aren’t detected by reviewers. But at the other extreme, some products of paper mills are clearly fabricated. The most striking examples are those that contain what Guillaume Cabanac and colleagues term “tortured phrases”. These appear to be generated by taking segments of genuine articles and running them through an AI app that will use a thesaurus to alter words, with the goal of evading plagiarism detection software. In other cases, the starting point appears to be text from an essay mill. The results are often bizarre and so incomprehensible that one need only read a few sentences to know that something is very wrong. Here’s an example from Elsevier’s International Journal of Biological Macromolecules, which those without access can pay $31.50 for (see analysis on Pubpeer, here).

"Wound recuperating camwood a chance to be postponed due to the antibacterial reliance of microorganisms concerning illustration an outcome about the infection, wounds are unable to mend appropriately, furthermore, take off disfiguring scares [150]. Chitin and its derivatives go about as simulated skin matrixes that are skilled should push a fast dermal redesign after constantly utilized for blaze treatments, chitosan may be wanton toward endogenous enzymes this may be a fundamental preference as evacuating those wound dressing camwood foundation trauma of the wounds and harm [151]. Chitin and its derivatives would make a perfect gas dressing. Likewise, they dampen the wound interface, are penetrability will oxygen, furthermore, permit vaporous exchange, go about as a boundary with microorganisms, and are fit about eliminating abundance secretions"

And here’s the start of an Abstract from a Springer Nature collection called Modern Approaches in Machine Learning and Cognitive Science (see here for some of the tortured phrases that led to detection of this article). The article can be yours for £19.95:

“Asthma disease are the scatters, gives that influence the lungs, the organs that let us to inhale and it’s the principal visit disease overall particularly in India. During this work, the matter of lung maladies simply like the trouble experienced while arranging the sickness in radiography are frequently illuminated. There are various procedures found in writing for recognition of asthma infection identification. A few agents have contributed their realities for Asthma illness expectation. The need for distinguishing asthma illness at a beginning period is very fundamental and is an exuberant research territory inside the field of clinical picture preparing. For this, we’ve survey numerous relapse models, k-implies bunching, various leveled calculation, characterizations and profound learning methods to search out best classifier for lung illness identification. These papers generally settlement about winning carcinoma discovery methods that are reachable inside the writing.”

These examples are so peculiar that even a layperson could detect the problem. In more technical fields, the fake paper may look superficially normal, but it is easy to spot by anyone who knows the area, who recognises that “flag to commotion” is a mangled version of “signal to noise”, or that while there is such a thing as a “Swiss albino mouse” there is no such thing as a “Swiss pale-skinned person mouse”. These errors are not explicable as failures of translation by someone who does not speak good English. They would be detected by any reviewer with expertise in the field. Another characteristic of paper mill outputs, featured in this recent blogpost, is fake papers that combine tables and figures from different publications in nonsensical contexts.
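As a very rough illustration of the simplest kind of automated screen one could build (my own toy phrase list; real screeners such as the Problematic Paper Screener use much larger curated fingerprint sets), one can simply scan a text for known mangled versions of standard terms:

```python
# Toy tortured-phrase scan: a short list of my own, mapping mangled phrases
# (all quoted elsewhere in these posts) back to the expected terms.
TORTURED = {
    "flag to commotion": "signal to noise",
    "bosom peril": "breast cancer",
    "irregular esteem": "random value",
}

def flag_tortured_phrases(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected term) pairs found in the text."""
    lowered = text.lower()
    return [(bad, good) for bad, good in TORTURED.items() if bad in lowered]

print(flag_tortured_phrases(
    "The flag to commotion ratio gave an irregular esteem."
))
# [('flag to commotion', 'signal to noise'), ('irregular esteem', 'random value')]
```

The real screens are of course far more sophisticated, but the principle is the same: known fingerprints give the game away.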

Sleuths who are interested in unmasking paper mills have developed automated methods for identifying such papers, and the number is both depressing and astounding. As we have seen, though some of these outputs appear in obscure sources, many crop up in journals or edited collections that are handled by the big scientific publishing houses, such as Springer Nature, Elsevier and Wiley. When sleuths find these cases, they report the problems on the website PubPeer, and this typically prompts an incredulous response: how on earth did this material get published? It’s a very good question, and the answer has to be that somehow an editor let this material through. As explained in the COPE&STM report, sometimes a nefarious individual from a paper mill persuades a journal to publish a “special issue” and the unwitting journal is then hijacked and turned into a vehicle for publishing fraudulent work. If the special issue editor poses as a reputable scientist, using a fake email address that looks similar to the real thing, this can be hard to spot.

But in other cases, we see clearcut instances of paper mill outputs that have apparently been approved by a regular journal editor. In a recent preprint, Anna Abalkina and I describe finding putative paper mill outputs in a well-established Wiley journal, the Journal of Community Psychology. Anna identified six papers in the journal in the course of a much larger investigation of papers that came from faked email addresses. For five of them the peer review and editorial correspondence was available on Publons. The papers,  from addresses in Russia or Kazakhstan, were of very low quality and frequently opaque. I had to read and re-read to work out what the paper was about, and still ended up uncertain. The reviewers, however, suggested only minor corrections. They used remarkably similar language to one another, giving the impression that the peer review process had been compromised. Yet the Editor-in-Chief, Michael B. Blank, accepted the papers after minor revisions, with a letter concluding: “Thank you for your fine contribution”. 

There are two hypotheses to consider when a journal publishes incomprehensible or trivial material: either the editor was not doing their job of scrutinising material in the journal, or they were in cahoots with a paper mill. I wondered whether the editor was what I have previously termed an automaton – one who just delegates all the work to a secretary. After all, authors are asked to recommend reviewers, so all that is needed is for someone to send out automated requests to review, and then keep going until there are sufficient recommendations to either accept or reject. If that were the case, then maybe the journal would accept a paper by us. Accordingly, we submitted our manuscript about paper mills to the Journal of Community Psychology. But it was desk rejected by the Editor in Chief with a terse comment: “This a weak paper based on a cursory review of six publications”. So we can reject hypothesis 1 – that the editor is an automaton. But that leaves hypothesis 2 – that the editor does read papers submitted to his journal, and had accepted the previous paper mill outputs in full knowledge of their content. This raises more questions than it answers. In particular, why would he risk his personal reputation and that of his journal by behaving that way? But perhaps rather than dwelling on that question, we should think positively about how journals might protect themselves in future from attacks by paper mills.

A call for action

My proposal is that, in addition to the useful suggestions from the COPE&STM report, we need additional steps to ensure that those with editorial responsibility are legitimate and are doing their job. Here are some preliminary suggestions:

  1. Appointment to the post of editor should be made in open competition among academics who meet specified criteria.
  2. It should be transparent who is responsible for final sign-off for each article that is published in the journal.
  3. Journals where a single editor makes the bulk of editorial decisions should be discouraged. (N.B. I looked at the 20 most recent papers in Journal of Community Psychology that featured on Publons and all had been processed by Michael B. Blank).
  4. There should be an editorial board consisting of reputable people from a wide range of institutional backgrounds, who share the editorial load, and meet regularly to consider how the journal is progressing and to discuss journal business.
  5.  Editors should be warned about the dangers of special issues and should not delegate responsibility for signing off on any papers appearing in a special issue.
  6. Editors should be required to follow COPE guidelines about publishing in their own journal, and publishers should scrutinise the journal annually to check whether the recommended procedures were followed.
  7. Any editor who allows gibberish to be published in their journal should be relieved of their editorial position immediately.

Many journals run by academic societies already adopt procedures similar to these. Particular problems arise when publishers start up new journals to fill a perceived gap in the market, and there is no oversight by academics with expertise in the area. The COPE&STM report has illustrated how dangerous that can be – both for scientific progress and for the reputation of publishers.

Of course, when one looks at this list of requirements, one may start to wonder why anyone would want to be an editor. Typically there is little financial remuneration, and the work is often done in a person’s “spare time”. So maybe we need to rethink how that works, so that paragons with a genuine ability and interest in editing are rewarded more adequately for the important work they do.

P.S. Comment moderation is enabled for this blog to prevent it being overwhelmed by spam, but I welcome comments, and will check for these in the weeks following the post, and admit those that are on topic. 

 

Comment by Jennifer Byrne, 9th Sept 2022 

(this comment by email, as Blogger seems to eat comments by Jennifer for some reason, while letting through weird spammy things!).

This is a fantastic list of suggestions to improve the important contributions of journal editors. I would add that journal editors should be appointed for defined time periods, and their contributions regularly reviewed. If for any reason it becomes apparent that an editor is not in a position to actively contribute to the journal, they should be asked to step aside. In my experience, editorial boards can include numerous inactive editors. These can provide the appearance of a large, active and diverse editorial board, when in practice, the editorial work may be conducted by a much smaller group, or even one person. Journals cannot be run successfully without a strong editorial team, but such teams require time and resources to establish and maintain.

Tuesday, 9 August 2022

Can systematic reviews help clean up science?

 

The systematic review was not turning out as Lorna had expected

Why do people take the risk of publishing fraudulent papers, when it is easy to detect the fraud? One answer is that they don’t expect to be caught. A consequence of the growth in systematic reviews is that this assumption may no longer be safe. 

In June I participated in a symposium organised by the LMU Open Science Center in Munich entitled “How paper mills publish fake science industrial-style – is there really a problem and how does it work?” The presentations are available here. I focused on the weird phenomenon of papers containing “tortured phrases”, briefly reviewed here. For a fuller account see here. These are fakes that are easy to detect, because, in the course of trying to circumvent plagiarism detection software, they change words, with often unintentionally hilarious consequences. For instance, “breast cancer” becomes “bosom peril” and “random value” becomes “irregular esteem”. Most of these papers make no sense at all – they may include recycled figures from other papers. They are typically highly technical and so to someone without expertise in the area they may seem valid, but anyone familiar with the area will realise that someone who writes “flag to commotion” instead of “signal to noise” is a hoaxer. 

Speakers at the symposium drew attention to other kinds of paper mill whose output is less conspicuously weird. Jennifer Byrne documented industrial-scale research fraud in papers on single gene analyses that were created by templates, and which purported to provide data on under-studied genes in human cancer models. Even an expert in the field may be hoodwinked by these. I addressed the question of “does it matter?” For the nonsense papers generated using tortured phrases, it could be argued that it doesn’t, because nobody will try to build on that research. But there are still victims: authors of these fraudulent papers may outcompete other, honest scientists for jobs and promotion, journals and publishers will suffer reputational damage, and public trust in science is harmed. But what intrigued me was that the authors of these papers may also be regarded as victims, because they will have on public record a paper that is evidently fraudulent. It seems that either they are unaware of just how crazy the paper appears, or that they assume nobody will read it anyway. 

The latter assumption may have been true a couple of decades ago, but with the growth of systematic reviews, researchers are scrutinizing many papers that previously would have been ignored. I was chatting with John Loadsman, who in his role as editor of Anaesthesia and Intensive Care has uncovered numerous cases of fraud. He observed that many paper mill outputs never get read because, just on the basis of the title or abstract, they appear trivial or uninteresting. However, when you do a systematic review, you are supposed to read everything relevant to the research question, and evaluate it, so these odd papers may come to light. 

I’ve previously blogged about the importance of systematic reviews for avoiding cherrypicking of the literature. Of course, evaluation of papers is often done poorly or not at all, in which case the fraudulent papers just pollute the literature when added to a meta-analysis. But I’m intrigued at the idea that systematic reviews might also serve the purpose of putting the spotlight on dodgy science in general, and fraudsters in particular, by forcing us to read things thoroughly. I therefore asked Twitter for examples – I asked specifically about meta-analysis but the responses covered systematic reviews more broadly, and were wide-ranging both in the types of issue that were uncovered and the subject areas. 

Twitter did not disappoint: I received numerous examples – more than I can include here. Much of what was described did not sound like the work of paper mills, but did include fraudulent data manipulation, plagiarism, duplication of data in different papers, and analytic errors. Here are some examples: 

Paper mills and template papers

Jennifer Byrne noted how she became aware of paper mills when looking for studies of a particular gene she was interested in, which was generally under-researched. Two things raised her suspicions: a sudden spike in studies of the gene, plus a series of papers that had the same structure, as if constructed from a template. Subsequently, with Cyril Labbé, who developed an automated Seek & Blastn tool to assess nucleotide sequences, she found numerous errors in the reagents and specification of genetic sequences of these repetitive papers, and it became clear that they were fraudulent. 

An example of a systematic review that discovered a startling level of inadequate and possibly fraudulent research was focused on the effect of tranexamic acid on post-partum haemorrhage: out of 26 reports, eight had sections of identical or very similar text, despite apparently coming from different trials. This is similar to what has been described for papers from paper mills, which are constructed from a template. And, as might be expected for a paper mill output, there were also numerous statistical and methodological errors, and some cases without ethical approval. (Thanks to @jd_wilko for pointing me to this example). 

Plagiarism 

Back in 2006, Iain Chalmers, who is generally ahead of his time, noted that systematic reviews could root out cases of plagiarism, citing the example of Asim Kurjak, whose paper on epidural analgesia in labour was heavily plagiarised.

Data duplication 

Meta-analysis can throw up cases where the same study is reported in two or more papers, with no indication that this is the same data. Although this might seem like a minor problem compared with fraud, it can be serious, because if the duplication is missed in a meta-analysis, that study will be given more weight than it should have. Ioana Cristea noted that such ‘zombie papers’ have cropped up in a meta-analysis she is currently analysing. 

Tampering with peer review 

When a paper considered for a meta-analysis seems dubious, it raises the question of whether proper peer review procedures were followed. It helps if the journal adopts open peer review. Robin N. Kok reported a paper where the same person was listed as an author and a peer reviewer. This was eventually retracted.  

Data seem too good to be true 

This piece in Science tells the story of Qian Zhang, who published a series of studies on the impact of cartoon violence on children which, on the one hand, had remarkably large samples of children all at the same age and, on the other hand, had similar samples across apparently different studies. Because of their enormous size, Zhang’s papers distorted any meta-analysis they were included in. 

Aaron Charlton cited another case, where serious anomalies were picked up in a study on marketing in the course of a meta-analysis. The paper was ultimately retracted 3 years after the concerns were raised, after defensive responses from some of the authors, challenging the meta-analysts. 

This case, flagged by Neil O’Connell, is especially useful, as it documents a range of methods used to evaluate suspect research. The dodgy work was first flagged up in a meta-analysis of cognitive behaviour therapy for chronic pain. Three papers with the same lead author, M. Monticone, obtained results that were discrepant with the rest of the literature, with much bigger effect sizes. The meta-analysts then looked at other trials by the same team and found that there was a 6-fold difference between the lower confidence limit of the Monticone studies and the upper confidence limit of all others combined. The paper also reports email exchanges with Dr Monticone that may be of interest to readers. 

Poor methodology 

Fiona Ramage told me that in the course of doing a preclinical systematic review and meta-analysis in nutritional neuroscience, she encountered numerous errors of basic methodology and statistics: for example, dozens of papers where error bars were presented without indicating whether they showed the standard error (SE) or the standard deviation (SD), and studies claiming differences between groups without any direct statistical comparison. This is more likely to be due to ignorance or honest error than to malpractice, but it needs to be flagged up so that the literature is not polluted by erroneous data.
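For readers who wonder why the SE/SD distinction matters so much: the standard error of the mean shrinks with sample size (SE = SD/√n), so the same spread of data gives bars of very different width. A short illustration with made-up numbers:

```python
# Made-up numbers: the same data yield very different-looking error bars.
import statistics as st

data = [4.1, 5.3, 4.8, 5.9, 4.4, 5.1, 4.7, 5.6]   # hypothetical measurements
sd = st.stdev(data)                 # sample standard deviation
se = sd / len(data) ** 0.5          # standard error of the mean = SD / sqrt(n)
print(f"SD = {sd:.2f}, SE = {se:.2f}")   # SD = 0.61, SE = 0.21 here
```

If a reader cannot tell which of the two a bar represents, the figure is uninterpretable, which is exactly the kind of problem a careful systematic reviewer trips over.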

What are the consequences?

Of course, the potential of systematic reviews to detect bad science is only realised if the dodgy papers are indeed weeded out of the literature, and people who commit scientific fraud are fired. Journals and publishers have started to respond to paper mills, but, as Ivan Oransky has commented, this is a game of Whac-a-Mole, and "the process of retracting a paper remains comically clumsy, slow and opaque”. 

I was surprised that even when confronted with an obvious case of a paper that had both numerous tortured phrases and plagiarism, the response from the publisher was slow – e.g. this comically worded example is still not retracted, even though the publisher’s research integrity office acknowledged my email expressing concern over 2 months ago. But 2 months is nothing. Guillaume Cabanac recently tweeted about a “barn door” case of plagiarism that has just been retracted 20 years after it was first flagged up. When I discuss the slow responses to concerns with publishers, they invariably say that they are being kept very busy with a huge volume of material from paper mills. To which I answer, you are making immense profits, so perhaps some could be channeled into employing more people to tackle this problem. As I am fond of pointing out, I regard a publisher who leaves seriously problematic studies in the literature as analogous to a restaurateur who serves poisoned food to customers. 

Publishers may be responsible for correcting the scientific record, but it is institutional employers who need to deal with those who commit malpractice. Many institutions don’t seem to take fraud seriously. This point was made back in 2006 by Iain Chalmers, who described the lenient treatment of Asim Kurjak, and argued for public naming and shaming of those who are found guilty of scientific misconduct. Unfortunately, there’s not much evidence that his advice has been heeded. Consider this recent example of a director of a primate research lab who admitted fraud, but is still in post. (Here the fraud was highlighted by a whistleblower rather than a systematic review, but this illustrates the difficulty of tackling fraud when there are only minor consequences for fraudsters). 

Could a move towards “slow science” help? In the humanities, literary scholars pride themselves on “close reading” of texts. In science, we are often so focused on speed and concision that we tend to lose the ability to focus deeply on a text, especially if it is boring. The practice of doing a systematic review should in principle develop better skills in the evaluation of individual papers, and in so doing help cleanse the literature of papers that should never have been published in the first place. John Loadsman has suggested we should not only read papers carefully, but should recalibrate ourselves to have a very high “index of suspicion” rather than embracing the default assumption that everyone is honest. 

P.S

Many thanks to everyone who sent in examples. Sorry I could not include everything. Please feel free to add other examples or reactions in the Comments – these tend to get overwhelmed with adverts for penis enlargement or (ironically) essay mills, and so are moderated, but I do check them and relevant comments will eventually appear.

PPS. Florian Naudet sent a couple of relevant links that readers might enjoy: 

Fascinating article by Fanelli et al who looked at how inclusion of retracted papers affected meta-analyses: https://www.tandfonline.com/doi/full/10.1080/08989621.2021.1947810  

And this piece by Lawrence et al shows the dangers of meta-analyses when there is insufficient scrutiny of the papers that are included: https://www.nature.com/articles/s41591-021-01535-y  

Also, Joseph Lee tweeted about this paper about inclusion of papers from predatory publications in meta-analyses: https://jmla.pitt.edu/ojs/jmla/article/view/491 

PPPS. 11th August 2022

A couple of days after posting this, I received a copy of "Systematic Reviews in Health Research" edited by Egger, Higgins and Davey Smith. Needless to say, the first thing I did was to look up "fraud" in the index. Although there are only a couple of pages on this, the examples are striking. 

First, a review by Nowbar et al (2014) of 133 reports on bone marrow stem cells for heart disease found over 600 discrepancies, and the number of discrepancies increased with the reported effect size. There's a trail of comments on Pubpeer relating to some of the sources, e.g. https://pubpeer.com/publications/B346354468C121A468D30FDA0E295E.

Another example concerns the use of beta-blockers during surgery. A series of studies from one centre (the DECREASE trials) that appeared to provide good evidence of effectiveness was investigated and found to be inadequate, with missing data and failure to follow research protocols. When these studies were omitted from a meta-analysis, the conclusion was that, far from benefiting from beta-blockers, patients in the treatment group were more likely to die (Bouri et al, 2014). 

PPPPS. 18th August 2022

This comment by Jennifer Byrne was blocked by Blogger - possibly because it contained weblinks.

Anyhow, here is what she said:

I agree, reading both widely and deeply can help to identify problematic papers, and an ideal time for this to happen is when authors are writing either narrative or systematic reviews. Here's another two examples where Prof Carlo Galli and colleagues identified similar papers that may have been based on templates: https://www.mdpi.com/2304-6775/7/4/67, https://link.springer.com/article/10.1007/s11192-022-04434-2 

 




Wednesday, 3 August 2022

Contagion of the political system


 

Citizens of the UK have in recent weeks watched in amazement as the current candidates for leadership of the Conservative party debate their policies. Whoever wins will replace Boris Johnson as Prime Minister, with the decision made by a few thousand members of the Conservative Party. All options were bad, and we are now down to the last two: Liz Truss and Rishi Sunak.

 

For those of us who are not Conservatives, and for many who are, there was immense joy at the ousting of Boris Johnson. The man seemed like a limpet, impossible to dislodge. Every week brought a new scandal that would have been more than sufficient to lead to resignation 10 years ago, yet he hung on and on. Many people thought that, after a vote of no confidence in his leadership, he would step down so that a caretaker PM could run the country while the debate over his successor took place, but the limpet is still clinging on. He’s not doing much running of the country, but that’s normal, and perhaps for the best. He’s much better at running parties than leading the Conservative party.

 

I have to say I had not expected much from Truss and Sunak, but even my low expectations have not been met. The country is facing immense challenges, from climate change, from coronavirus, and from escalating energy prices. These are barely mentioned: instead the focus is on reducing taxes, with the candidates competing over just how much tax they can cut. As far as I can see, these policies will do nothing to help the poorest in society, whose benefits will shrink to pay for the tax cuts; since the richer you are, the more tax you pay, this is a rich person’s policy.

 

What has surprised me is just how ill-informed the two candidates are. The strategy seems to be to pick a niche topic of interest to Conservative voters, make up a new policy overnight and announce it the next day. So we have Rishi Sunak proposing that the solution to the crisis in the NHS is to charge people who miss doctor’s appointments. Has he thought this through? Think of the paperwork. Think of the debt collectors tasked with collecting £10 from a person with dementia. Think of the cost of all of this.  And on Education, his idea is to reintroduce selective (grammar) schools: presumably because he thinks that our regular schools are inadequate to educate intelligent children.

 

On Education, Liz Truss is even worse. Her idea is that all children who score top marks in their final year school examinations should get an interview to go to Oxford or Cambridge University. This is such a crazy idea that others have written at length to point out its flaws (e.g. this cogent analysis by Sam Freedman). Suffice it to say that it has a similar vibe to the Sunak grammar schools plan: it implies that only two British universities have any value. Conservatives do seem obsessed with creating divisions between haves and have-nots, but only if they can ensure their children are among the haves.

 

Another confused statement from Truss is that, as far as Scotland goes, she plans to ignore Nicola Sturgeon, the First Minister of Scotland and leader of the Scottish National Party. At a time when relationships between Scotland and England are particularly fraught, this insensitive statement is reminiscent of the gaffes of Boris Johnson.

 

Oh, and yesterday she also announced – and then quickly U-turned – an idea that would limit the pay of public sector workers in the North of England, because it was cheaper to live there.

 

What I find so odd about both Sunak and Truss is that they keep scoring own goals. Nobody requires them to keep coming up with new policies in niche areas.  Why don’t they just hold on to their original positions, and if asked about anything else, just agree to ‘look at’ it when in power? Johnson was always very good at promising to ‘look at’ things: when he’s not being a limpet, he’s a basilisk. The more you probe Sunak and Truss, the more their shallowness and lack of expertise show through. They’d do well to keep schtum. Or, better still, show some indication that they could, for instance, get a grip on the crisis in the NHS.

 

What all this demonstrates is how an incompetent and self-promoting leader causes damage far beyond their own term. Johnson’s cabinet was selected purely on one criterion: loyalty to him. The first requirement was to “believe in Brexit” – reminiscent of the historical wars between Protestants and Catholics, where the first thing you ask of a candidate is what their religion is. Among Conservative politicians, it seems that an accusation of not really being a Brexiteer is the worst thing you can say about a candidate. Indeed, that is exactly the charge that her opponents level against Truss, who made cogent arguments for remaining in the EU before the referendum. Like a Protestant required to recant their beliefs or face the flames, she is now reduced to defending Brexit in the strongest possible terms, saying that “predictions of doom have not come true”, as farmers, fishermen, and exporters go out of business, academics leave in droves, and holidaymakers sit in queues at Dover.

 

It's known that Johnson does not want to give up the top job. I’m starting to wonder if behind all of this is a cunning plan. The people he’s appointed to cabinet are so incompetent that maybe he hopes that, when confronted with a choice between them, the Conservative Party will decide that he looks better than either of them.

 

 

 

 

Wednesday, 29 June 2022

A proposal for data-sharing that discourages p-hacking

Open data is a great way of building confidence in the reproducibility of research findings. Although we are still a long way from adequate implementation of data-sharing in psychology journals (see, for example, this commentary by Kathy Rastle, editor of Journal of Memory and Language), things are moving in the right direction, with an increasing number of journals and funders requiring sharing of data and code. But there is a downside, and I've been thinking about it this week, as we've just published a big paper on language lateralisation, where all the code and data are available on Open Science Framework. 

One problem is p-hacking. If you put a large and complex dataset in the public domain, anyone can download it and then run multiple unconstrained analyses until they find something, which is then retrospectively fitted to a plausible-sounding hypothesis. The potential to generate non-replicable false positives by such a process is extremely high - far higher than many scientists recognise. I illustrated this with a fictitious example here
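
To see how easily this happens, here is a toy simulation in Python (a rough sketch of my own, not the fictitious example linked above): it generates pure noise for 20 variables and 200 participants and then tests every pairwise correlation. Several "significant" results appear even though there is nothing to find.

```python
# Toy illustration of unconstrained exploration producing false positives.
import itertools
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n_participants, n_vars = 200, 20
data = rng.normal(size=(n_participants, n_vars))  # no true effects anywhere

false_positives = []
for i, j in itertools.combinations(range(n_vars), 2):
    r, p = pearsonr(data[:, i], data[:, j])
    if p < 0.05:
        false_positives.append((i, j, round(r, 3), round(p, 4)))

print(f"{len(false_positives)} of {n_vars * (n_vars - 1) // 2} correlations "
      f"reach p < .05, despite the data being pure noise")
```

With 190 correlations tested, roughly 9 or 10 will pass the .05 threshold by chance alone; pick any one of them and a plausible-sounding story can usually be told about it.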

Another problem is self-imposed publication bias: the researcher runs a whole set of analyses to test promising theories, but forgets about them as soon as they turn up a null result. With both of these processes in operation, data sharing becomes a poisoned chalice: instead of increasing scientific progress by encouraging novel analyses of existing data, it just means more unreliable dross is deposited in the literature. So how can we prevent this? 

In this Commentary paper, I noted several solutions. One is to require anyone accessing the data to submit a protocol which specifies the hypotheses and the analyses that will be used to test them. In effect this amounts to preregistration of secondary data analysis. This is the method used for some big epidemiological and medical databases. But it is cumbersome and also costly - you need the funding to support additional infrastructure for gatekeeping and registration. For many psychology projects, this is not going to be feasible. 

A simpler solution would be to split the data into two halves - those doing secondary data analysis only have access to part A, which allows them to do exploratory analyses, after which they can see whether any findings hold up in part B. Statistical power will be reduced by this approach, but with large datasets it may be high enough to detect effects of interest. I wonder if it would be relatively easy to incorporate this option into Open Science Framework: i.e. someone who commits a preregistration of a secondary data analysis, on the basis of exploratory analysis of half a dataset, then receives a code that unlocks the second half of the data (the hold-out sample). A rough outline of how this might work is shown in Figure 1.

Figure 1: A possible flowchart for secondary data analysis on a platform such as OSF
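
As a rough illustration of how the split-and-unlock step might work (a hypothetical sketch only - this is not an existing OSF feature, and all the function names here are invented): the platform could release a random half of the rows for exploration, and reveal the hold-out half only once a preregistration has been deposited and matched to an unlock code.

```python
# Hypothetical sketch of the split-and-unlock idea; not an OSF API.
import hashlib
import numpy as np
import pandas as pd

def split_dataset(df: pd.DataFrame, seed: int = 1234):
    """Randomly assign rows to an exploration half and a hold-out half."""
    rng = np.random.default_rng(seed)
    explore_mask = rng.permutation(len(df)) < len(df) // 2
    return df[explore_mask], df[~explore_mask]

def release_code(prereg_text: str) -> str:
    """Issue an unlock code tied to the content of a preregistration."""
    return hashlib.sha256(prereg_text.encode()).hexdigest()[:12]

def unlock_holdout(holdout: pd.DataFrame, prereg_text: str, code: str):
    """Return the hold-out half only if the code matches the preregistration."""
    if release_code(prereg_text) != code:
        raise PermissionError("Preregistration does not match unlock code")
    return holdout

# Example: researcher explores half A, preregisters, then receives half B.
full_data = pd.DataFrame({"x": np.random.normal(size=100),
                          "y": np.random.normal(size=100)})
half_a, half_b = split_dataset(full_data)
prereg = "H1: x predicts y; analysis: linear regression of y on x."
code = release_code(prereg)      # issued when the preregistration is logged
half_b_released = unlock_holdout(half_b, prereg, code)
```

Because the unlock code is derived from the preregistration text itself, the confirmatory half cannot be obtained without committing to an analysis plan first.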

An alternative, discussed by MacCoun and Perlmutter, is blind analysis - "temporarily and judiciously removing data labels and altering data values to fight bias and error". The idea is that you can explore a dataset and run a planned analysis on it, but the exploration cannot bias your conclusions, because the data have been altered and you don't know what is correct. A variant of this approach would be to deposit multiple datasets with the data shuffled in all but one of them. The shuffling would be similar to what is done in permutation analysis - so there might be ten versions of the dataset deposited, with only one containing the original unshuffled data. Those downloading the data would not know whether or not they had the correct version; only after they had decided on an analysis plan would they be told which dataset it should be run on. 
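
Here is a similar hypothetical sketch of the shuffled-decoy variant (the column names and numbers are arbitrary): make ten copies of the dataset, shuffle the outcome variable in all but one randomly chosen copy, and record which copy is genuine so that it can be revealed once an analysis plan has been registered.

```python
# Hypothetical sketch of the shuffled-decoy idea described above.
import numpy as np
import pandas as pd

def make_decoys(df: pd.DataFrame, outcome: str, k: int = 10, seed: int = 99):
    """Return k versions of df; the outcome column is shuffled in all but one."""
    rng = np.random.default_rng(seed)
    real_index = rng.integers(k)  # kept secret until an analysis plan is registered
    versions = []
    for i in range(k):
        copy = df.copy()
        if i != real_index:
            copy[outcome] = rng.permutation(copy[outcome].values)
        versions.append(copy)
    return versions, real_index

# Example: ten versions are deposited; only one preserves the true x-y pairing.
data = pd.DataFrame({"x": np.random.normal(size=200),
                     "y": np.random.normal(size=200)})
decoys, real_index = make_decoys(data, outcome="y")
```

Any relationship involving the outcome variable is destroyed in the decoy copies, so exploratory results from a downloaded version cannot be trusted until the registry reveals which version was real.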

I don't know if these methods would work, but I think they have potential for keeping people honest in secondary data analysis, while minimising bureaucracy and cost. On a platform such as Open Science Framework it is already possible to create a time-stamped preregistration of an analysis plan, and I assume that within OSF there is already a log that indicates who has downloaded a dataset. So someone who wanted to do things right and download only one dataset (either a random half, or one of a set of shuffled datasets) would simply need a mechanism that allowed them to gain access to the full, correct data after they had preregistered an analysis, along the lines outlined above.

These methods are not foolproof. Two researchers could collude - or one researcher could adopt multiple personas - so that they get to see the correct data as person A and then start a new process as person B, where they can preregister an analysis whose results are already known. But my sense is that there are many honest researchers who would welcome this approach - precisely because it would keep them honest. Many of us enjoy exploring datasets, but it is all too easy to fool yourself into thinking that you've turned up something exciting when it is really just a fluke arising in the course of excessive data-mining. 

Like a lot of my blogposts, this is just a brain dump of an idea that is not fully thought through. I hope by sharing it, I will encourage people to come up with criticisms that I haven't thought of, or alternatives that might work better. Comments on the blog are moderated to prevent spam, but please do not be deterred - I will post any that are on topic. 

P.S. 5th July 2022 

Florian Naudet drew my attention to this very relevant paper: 

Baldwin, J. R., Pingault, J.-B., Schoeler, T., Sallis, H. M., & Munafò, M. R. (2022). Protecting against researcher bias in secondary data analysis: Challenges and potential solutions. European Journal of Epidemiology, 37(1), 1–10. https://doi.org/10.1007/s10654-021-00839-0

Saturday, 30 April 2022

Bishopblog catalogue (updated 30 April 2022)

Source: http://www.weblogcartoons.com/2008/11/23/ideas/

Those of you who follow this blog may have noticed a lack of thematic coherence. I write about whatever is exercising my mind at the time, which can range from technical aspects of statistics to the design of bathroom taps. I decided it might be helpful to introduce a bit of order into this chaotic melange, so here is a catalogue of posts by topic.

Language impairment, dyslexia and related disorders
The common childhood disorders that have been left out in the cold (1 Dec 2010) What's in a name? (18 Dec 2010) Neuroprognosis in dyslexia (22 Dec 2010) Where commercial and clinical interests collide: Auditory processing disorder (6 Mar 2011) Auditory processing disorder (30 Mar 2011) Special educational needs: will they be met by the Green paper proposals? (9 Apr 2011) Is poor parenting really to blame for children's school problems? (3 Jun 2011) Early intervention: what's not to like? (1 Sep 2011) Lies, damned lies and spin (15 Oct 2011) A message to the world (31 Oct 2011) Vitamins, genes and language (13 Nov 2011) Neuroscientific interventions for dyslexia: red flags (24 Feb 2012) Phonics screening: sense and sensibility (3 Apr 2012) What Chomsky doesn't get about child language (3 Sept 2012) Data from the phonics screen (1 Oct 2012) Auditory processing disorder: schisms and skirmishes (27 Oct 2012) High-impact journals (Action video games and dyslexia: critique) (10 Mar 2013) Overhyped genetic findings: the case of dyslexia (16 Jun 2013) The arcuate fasciculus and word learning (11 Aug 2013) Changing children's brains (17 Aug 2013) Raising awareness of language learning impairments (26 Sep 2013) Good and bad news on the phonics screen (5 Oct 2013) What is educational neuroscience? (25 Jan 2014) Parent talk and child language (17 Feb 2014) My thoughts on the dyslexia debate (20 Mar 2014) Labels for unexplained language difficulties in children (23 Aug 2014) International reading comparisons: Is England really do so poorly? (14 Sep 2014) Our early assessments of schoolchildren are misleading and damaging (4 May 2015) Opportunity cost: a new red flag for evaluating interventions (30 Aug 2015) The STEP Physical Literacy programme: have we been here before? (2 Jul 2017) Prisons, developmental language disorder, and base rates (3 Nov 2017) Reproducibility and phonics: necessary but not sufficient (27 Nov 2017) Developmental language disorder: the need for a clinically relevant definition (9 Jun 2018) Changing terminology for children's language disorders (23 Feb 2020) Developmental Language Disorder (DLD) in relaton to DSM5 (29 Feb 2020) Why I am not engaging with the Reading Wars (30 Jan 2022)

Autism
Autism diagnosis in cultural context (16 May 2011) Are our ‘gold standard’ autism diagnostic instruments fit for purpose? (30 May 2011) How common is autism? (7 Jun 2011) Autism and hypersystematising parents (21 Jun 2011) An open letter to Baroness Susan Greenfield (4 Aug 2011) Susan Greenfield and autistic spectrum disorder: was she misrepresented? (12 Aug 2011) Psychoanalytic treatment for autism: Interviews with French analysts (23 Jan 2012) The ‘autism epidemic’ and diagnostic substitution (4 Jun 2012) How wishful thinking is damaging Peta's cause (9 June 2014) NeuroPointDX's blood test for Autism Spectrum Disorder ( 12 Jan 2019)

Developmental disorders/paediatrics
The hidden cost of neglected tropical diseases (25 Nov 2010) The National Children's Study: a view from across the pond (25 Jun 2011) The kids are all right in daycare (14 Sep 2011) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Changing the landscape of psychiatric research (11 May 2014) The sinister side of French psychoanalysis revealed (15 Oct 2019)

Genetics
Where does the myth of a gene for things like intelligence come from? (9 Sep 2010) Genes for optimism, dyslexia and obesity and other mythical beasts (10 Sep 2010) The X and Y of sex differences (11 May 2011) Review of How Genes Influence Behaviour (5 Jun 2011) Getting genetic effect sizes in perspective (20 Apr 2012) Moderate drinking in pregnancy: toxic or benign? (21 Nov 2012) Genes, brains and lateralisation (22 Dec 2012) Genetic variation and neuroimaging (11 Jan 2013) Have we become slower and dumber? (15 May 2013) Overhyped genetic findings: the case of dyslexia (16 Jun 2013) Incomprehensibility of much neurogenetics research ( 1 Oct 2016) A common misunderstanding of natural selection (8 Jan 2017) Sample selection in genetic studies: impact of restricted range (23 Apr 2017) Pre-registration or replication: the need for new standards in neurogenetic studies (1 Oct 2017) Review of 'Innate' by Kevin Mitchell ( 15 Apr 2019) Why eugenics is wrong (18 Feb 2020)

Neuroscience
Neuroprognosis in dyslexia (22 Dec 2010) Brain scans show that… (11 Jun 2011)  Time for neuroimaging (and PNAS) to clean up its act (5 Mar 2012) Neuronal migration in language learning impairments (2 May 2012) Sharing of MRI datasets (6 May 2012) Genetic variation and neuroimaging (1 Jan 2013) The arcuate fasciculus and word learning (11 Aug 2013) Changing children's brains (17 Aug 2013) What is educational neuroscience? ( 25 Jan 2014) Changing the landscape of psychiatric research (11 May 2014) Incomprehensibility of much neurogenetics research ( 1 Oct 2016)

Reproducibility
Accentuate the negative (26 Oct 2011) Novelty, interest and replicability (19 Jan 2012) High-impact journals: where newsworthiness trumps methodology (10 Mar 2013) Who's afraid of open data? (15 Nov 2015) Blogging as post-publication peer review (21 Mar 2013) Research fraud: More scrutiny by administrators is not the answer (17 Jun 2013) Pressures against cumulative research (9 Jan 2014) Why does so much research go unpublished? (12 Jan 2014) Replication and reputation: Whose career matters? (29 Aug 2014) Open code: note just data and publications (6 Dec 2015) Why researchers need to understand poker ( 26 Jan 2016) Reproducibility crisis in psychology ( 5 Mar 2016) Further benefit of registered reports ( 22 Mar 2016) Would paying by results improve reproducibility? ( 7 May 2016) Serendipitous findings in psychology ( 29 May 2016) Thoughts on the Statcheck project ( 3 Sep 2016) When is a replication not a replication? (16 Dec 2016) Reproducible practices are the future for early career researchers (1 May 2017) Which neuroimaging measures are useful for individual differences research? (28 May 2017) Prospecting for kryptonite: the value of null results (17 Jun 2017) Pre-registration or replication: the need for new standards in neurogenetic studies (1 Oct 2017) Citing the research literature: the distorting lens of memory (17 Oct 2017) Reproducibility and phonics: necessary but not sufficient (27 Nov 2017) Improving reproducibility: the future is with the young (9 Feb 2018) Sowing seeds of doubt: how Gilbert et al's critique of the reproducibility project has played out (27 May 2018) Preprint publication as karaoke ( 26 Jun 2018) Standing on the shoulders of giants, or slithering around on jellyfish: Why reviews need to be systematic ( 20 Jul 2018) Matlab vs open source: costs and benefits to scientists and society ( 20 Aug 2018) Responding to the replication crisis: reflections on Metascience 2019 (15 Sep 2019) Manipulated images: hiding in plain sight (13 May 2020) Frogs or termites: gunshot or cumulative science? ( 6 Jun 2020) Open data: We know what's needed - now let's make it happen (27 Mar 2021)  

Statistics
Book review: biography of Richard Doll (5 Jun 2010) Book review: the Invisible Gorilla (30 Jun 2010) The difference between p < .05 and a screening test (23 Jul 2010) Three ways to improve cognitive test scores without intervention (14 Aug 2010) A short nerdy post about the use of percentiles (13 Apr 2011) The joys of inventing data (5 Oct 2011) Getting genetic effect sizes in perspective (20 Apr 2012) Causal models of developmental disorders: the perils of correlational data (24 Jun 2012) Data from the phonics screen (1 Oct 2012)Moderate drinking in pregnancy: toxic or benign? (1 Nov 2012) Flaky chocolate and the New England Journal of Medicine (13 Nov 2012) Interpreting unexpected significant results (7 June 2013) Data analysis: Ten tips I wish I'd known earlier (18 Apr 2014) Data sharing: exciting but scary (26 May 2014) Percentages, quasi-statistics and bad arguments (21 July 2014) Why I still use Excel ( 1 Sep 2016) Sample selection in genetic studies: impact of restricted range (23 Apr 2017) Prospecting for kryptonite: the value of null results (17 Jun 2017) Prisons, developmental language disorder, and base rates (3 Nov 2017) How Analysis of Variance Works (20 Nov 2017) ANOVA, t-tests and regression: different ways of showing the same thing (24 Nov 2017) Using simulations to understand the importance of sample size (21 Dec 2017) Using simulations to understand p-values (26 Dec 2017) One big study or two small studies? ( 12 Jul 2018) Time to ditch relative risk in media reports (23 Jan 2020)

Journalism/science communication
Orwellian prize for scientific misrepresentation (1 Jun 2010) Journalists and the 'scientific breakthrough' (13 Jun 2010) Science journal editors: a taxonomy (28 Sep 2010) Orwellian prize for journalistic misrepresentation: an update (29 Jan 2011) Academic publishing: why isn't psychology like physics? (26 Feb 2011) Scientific communication: the Comment option (25 May 2011)  Publishers, psychological tests and greed (30 Dec 2011) Time for academics to withdraw free labour (7 Jan 2012) 2011 Orwellian Prize for Journalistic Misrepresentation (29 Jan 2012) Time for neuroimaging (and PNAS) to clean up its act (5 Mar 2012) Communicating science in the age of the internet (13 Jul 2012) How to bury your academic writing (26 Aug 2012) High-impact journals: where newsworthiness trumps methodology (10 Mar 2013)  A short rant about numbered journal references (5 Apr 2013) Schizophrenia and child abuse in the media (26 May 2013) Why we need pre-registration (6 Jul 2013) On the need for responsible reporting of research (10 Oct 2013) A New Year's letter to academic publishers (4 Jan 2014) Journals without editors: What is going on? (1 Feb 2015) Editors behaving badly? (24 Feb 2015) Will Elsevier say sorry? (21 Mar 2015) How long does a scientific paper need to be? (20 Apr 2015) Will traditional science journals disappear? (17 May 2015) My collapse of confidence in Frontiers journals (7 Jun 2015) Publishing replication failures (11 Jul 2015) Psychology research: hopeless case or pioneering field? (28 Aug 2015) Desperate marketing from J. Neuroscience ( 18 Feb 2016) Editorial integrity: publishers on the front line ( 11 Jun 2016) When scientific communication is a one-way street (13 Dec 2016) Breaking the ice with buxom grapefruits: Pratiques de publication and predatory publishing (25 Jul 2017) Should editors edit reviewers? ( 26 Aug 2018) Corrigendum: a word you may hope never to encounter (3 Aug 2019) Percent by most prolific author score and editorial bias (12 Jul 2020) PEPIOPs – prolific editors who publish in their own publications (16 Aug 2020) Faux peer-reviewed journals: a threat to research integrity (6 Dec 2020) Time to ditch relative risk in media reports (23 Jan 2020) Time for publishers to consider the rights of readers as well as authors (13 Mar 2021) Universities vs Elsevier: who has the upper hand? (14 Nov 2021) Book Review. Fiona Fox: Beyond the Hype (12 Apr 2022)

Social Media
A gentle introduction to Twitter for the apprehensive academic (14 Jun 2011) Your Twitter Profile: The Importance of Not Being Earnest (19 Nov 2011) Will I still be tweeting in 2013? (2 Jan 2012) Blogging in the service of science (10 Mar 2012) Blogging as post-publication peer review (21 Mar 2013) The impact of blogging on reputation ( 27 Dec 2013) WeSpeechies: A meeting point on Twitter (12 Apr 2014) Email overload ( 12 Apr 2016) How to survive on Twitter - a simple rule to reduce stress (13 May 2018)

Academic life
An exciting day in the life of a scientist (24 Jun 2010) How our current reward structures have distorted and damaged science (6 Aug 2010) The challenge for science: speech by Colin Blakemore (14 Oct 2010) When ethics regulations have unethical consequences (14 Dec 2010) A day working from home (23 Dec 2010) Should we ration research grant applications? (8 Jan 2011) The one hour lecture (11 Mar 2011) The expansion of research regulators (20 Mar 2011) Should we ever fight lies with lies? (19 Jun 2011) How to survive in psychological research (13 Jul 2011) So you want to be a research assistant? (25 Aug 2011) NHS research ethics procedures: a modern-day Circumlocution Office (18 Dec 2011) The REF: a monster that sucks time and money from academic institutions (20 Mar 2012) The ultimate email auto-response (12 Apr 2012) Well, this should be easy…. (21 May 2012) Journal impact factors and REF2014 (19 Jan 2013)  An alternative to REF2014 (26 Jan 2013) Postgraduate education: time for a rethink (9 Feb 2013)  Ten things that can sink a grant proposal (19 Mar 2013)Blogging as post-publication peer review (21 Mar 2013) The academic backlog (9 May 2013)  Discussion meeting vs conference: in praise of slower science (21 Jun 2013) Why we need pre-registration (6 Jul 2013) Evaluate, evaluate, evaluate (12 Sep 2013) High time to revise the PhD thesis format (9 Oct 2013) The Matthew effect and REF2014 (15 Oct 2013) The University as big business: the case of King's College London (18 June 2014) Should vice-chancellors earn more than the prime minister? (12 July 2014)  Some thoughts on use of metrics in university research assessment (12 Oct 2014) Tuition fees must be high on the agenda before the next election (22 Oct 2014) Blaming universities for our nation's woes (24 Oct 2014) Staff satisfaction is as important as student satisfaction (13 Nov 2014) Metricophobia among academics (28 Nov 2014) Why evaluating scientists by grant income is stupid (8 Dec 2014) Dividing up the pie in relation to REF2014 (18 Dec 2014)  Shaky foundations of the TEF (7 Dec 2015) A lamentable performance by Jo Johnson (12 Dec 2015) More misrepresentation in the Green Paper (17 Dec 2015) The Green Paper’s level playing field risks becoming a morass (24 Dec 2015) NSS and teaching excellence: wrong measure, wrongly analysed (4 Jan 2016) Lack of clarity of purpose in REF and TEF ( 2 Mar 2016) Who wants the TEF? ( 24 May 2016) Cost benefit analysis of the TEF ( 17 Jul 2016)  Alternative providers and alternative medicine ( 6 Aug 2016) We know what's best for you: politicians vs. experts (17 Feb 2017) Advice for early career researchers re job applications: Work 'in preparation' (5 Mar 2017) Should research funding be allocated at random? (7 Apr 2018) Power, responsibility and role models in academia (3 May 2018) My response to the EPA's 'Strengthening Transparency in Regulatory Science' (9 May 2018) More haste less speed in calls for grant proposals ( 11 Aug 2018) Has the Society for Neuroscience lost its way? ( 24 Oct 2018) The Paper-in-a-Day Approach ( 9 Feb 2019) Benchmarking in the TEF: Something doesn't add up ( 3 Mar 2019) The Do It Yourself conference ( 26 May 2019) A call for funders to ban institutions that use grant capture targets (20 Jul 2019) Research funders need to embrace slow science (1 Jan 2020) Should I stay or should I go: When debate with opponents should be avoided (12 Jan 2020) Stemming the flood of illegal external examiners (9 Feb 2020) What can scientists do in an emergency shutdown? 
(11 Mar 2020) Stepping back a level: Stress management for academics in the pandemic (2 May 2020)
TEF in the time of pandemic (27 Jul 2020) University staff cuts under the cover of a pandemic: the cases of Liverpool and Leicester (3 Mar 2021) Some quick thoughts on academic boycotts of Russia (6 Mar 2022)

Celebrity scientists/quackery
Three ways to improve cognitive test scores without intervention (14 Aug 2010) What does it take to become a Fellow of the RSM? (24 Jul 2011) An open letter to Baroness Susan Greenfield (4 Aug 2011) Susan Greenfield and autistic spectrum disorder: was she misrepresented? (12 Aug 2011) How to become a celebrity scientific expert (12 Sep 2011) The kids are all right in daycare (14 Sep 2011)  The weird world of US ethics regulation (25 Nov 2011) Pioneering treatment or quackery? How to decide (4 Dec 2011) Psychoanalytic treatment for autism: Interviews with French analysts (23 Jan 2012) Neuroscientific interventions for dyslexia: red flags (24 Feb 2012) Why most scientists don't take Susan Greenfield seriously (26 Sept 2014) NeuroPointDX's blood test for Autism Spectrum Disorder ( 12 Jan 2019)

Women
Academic mobbing in cyberspace (30 May 2010) What works for women: some useful links (12 Jan 2011) The burqua ban: what's a liberal response (21 Apr 2011) C'mon sisters! Speak out! (28 Mar 2012) Psychology: where are all the men? (5 Nov 2012) Should Rennard be reinstated? (1 June 2014) How the media spun the Tim Hunt story (24 Jun 2015)

Politics and Religion
Lies, damned lies and spin (15 Oct 2011) A letter to Nick Clegg from an ex liberal democrat (11 Mar 2012) BBC's 'extensive coverage' of the NHS bill (9 Apr 2012) Schoolgirls' health put at risk by Catholic view on vaccination (30 Jun 2012) A letter to Boris Johnson (30 Nov 2013) How the government spins a crisis (floods) (1 Jan 2014) The alt-right guide to fielding conference questions (18 Feb 2017) We know what's best for you: politicians vs. experts (17 Feb 2017) Barely a good word for Donald Trump in Houses of Parliament (23 Feb 2017) Do you really want another referendum? Be careful what you wish for (12 Jan 2018) My response to the EPA's 'Strengthening Transparency in Regulatory Science' (9 May 2018) What is driving Theresa May? ( 27 Mar 2019) A day out at 10 Downing St (10 Aug 2019) Voting in the EU referendum: Ignorance, deceit and folly ( 8 Sep 2019) Harry Potter and the Beast of Brexit (20 Oct 2019) Attempting to communicate with the BBC (8 May 2020) Boris bingo: strategies for (not) answering questions (29 May 2020) Linking responsibility for climate refugees to emissions (23 Nov 2021) Response to Philip Ball's critique of scientific advisors (16 Jan 2022) Boris Johnson leads the world ....in the number of false facts he can squeeze into a session of PMQs (20 Jan 2022) Some quick thoughts on academic boycotts of Russia (6 Mar 2022)

Humour and miscellaneous Orwellian prize for scientific misrepresentation (1 Jun 2010) An exciting day in the life of a scientist (24 Jun 2010) Science journal editors: a taxonomy (28 Sep 2010) Parasites, pangolins and peer review (26 Nov 2010) A day working from home (23 Dec 2010) The one hour lecture (11 Mar 2011) The expansion of research regulators (20 Mar 2011) Scientific communication: the Comment option (25 May 2011) How to survive in psychological research (13 Jul 2011) Your Twitter Profile: The Importance of Not Being Earnest (19 Nov 2011) 2011 Orwellian Prize for Journalistic Misrepresentation (29 Jan 2012) The ultimate email auto-response (12 Apr 2012) Well, this should be easy…. (21 May 2012) The bewildering bathroom challenge (19 Jul 2012) Are Starbucks hiding their profits on the planet Vulcan? (15 Nov 2012) Forget the Tower of Hanoi (11 Apr 2013) How do you communicate with a communications company? ( 30 Mar 2014) Noah: A film review from 32,000 ft (28 July 2014) The rationalist spa (11 Sep 2015) Talking about tax: weasel words ( 19 Apr 2016) Controversial statues: remove or revise? (22 Dec 2016) The alt-right guide to fielding conference questions (18 Feb 2017) My most popular posts of 2016 (2 Jan 2017) An index of neighbourhood advantage from English postcode data ( 15 Sep 2018) Working memories: A brief review of Alan Baddeley's memoir ( 13 Oct 2018)