Tuesday, 6 December 2022

Biomarkers to screen for autism (again)


Diagnosis of autism from biomarkers is a holy grail for biomedical researchers. The days when it was thought we would find “the autism gene” are long gone, and it’s clear that both the biology and the psychology of autism are highly complex and heterogeneous. One approach is to search for individual genes where mutations are more likely in those with autism. Another is to address the complexity head-on by looking for combinations of biomarkers that could predict who has autism. The latter approach is adopted in a paper by Bao et al (2022), who claimed that an ensemble of gene expression measures taken from blood samples could accurately predict which toddlers were autistic (ASD) and which were typically-developing (TD). An anonymous commenter on PubPeer queried whether the method was as robust as the authors claimed, arguing that there was evidence for “overfitting”. I was asked for my thoughts by a journalist, and they were complicated enough to merit a blogpost. The bottom line is that there are reasons to be cautious about the authors’ conclusion that they have developed “an innovative and accurate ASD gene expression classifier”.

 

Some of the points I raise here applied to a previous biomarker study that I blogged about in 2019. These are general issues about the mismatch between what is done in typical studies in this area and what is needed for a clinically useful screening test.

 

Base rates

Consider first how a screening test might be used. One possibility is that there might be a move towards universal screening, allowing early diagnosis that might help ensure intervention starts young. But for effective screening in that context, you need extremely high diagnostic accuracy, because the predictive value of a positive result depends on the frequency of autism in the population. I discussed this back in 2010. The levels of accurate classification reported by Bao et al would be of no use for population screening because there would be an extremely high rate of false positives, given that most children don’t have autism.
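To make the base-rate problem concrete, here is a minimal back-of-envelope sketch in Python. The sensitivity and specificity of .85 and the prevalences are purely illustrative assumptions of mine, not figures from Bao et al; the point is simply that the same test yields mostly false positives at population-level prevalence but becomes far more informative in a high-risk group.

```python
# Positive predictive value (PPV): of the children who screen positive,
# what proportion actually have autism? All figures are illustrative only.
def positive_predictive_value(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical screener with sensitivity = specificity = .85
for prevalence in (0.02, 0.20, 0.50):   # general population vs. increasingly high-risk groups
    ppv = positive_predictive_value(0.85, 0.85, prevalence)
    print(f"prevalence {prevalence:.0%}: {ppv:.0%} of positives are true positives")
```

With a prevalence of around 2%, roughly nine out of ten positives would be false alarms, whereas in a group where half the children turn out to be autistic the very same test looks impressive.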

 

Diagnostic specificity

But, you may say, we aren’t talking about universal screening.  The test might be particularly useful for those who either (a) already have an older child with autism, or (b) are concerned about their child’s development.  Here the probability of a positive autism diagnosis is higher than in the general population.  However, if that’s what we are interested in, then we need a different comparison group – not typically-developing toddlers, but unaffected siblings of children with autism, and/or children with other neurodevelopmental disorders.   

When I had a look at the code that the authors deposited for data analysis, it implied that they did have data on children with more general developmental delays, and sibs of those with autism, but these data are not reported in this paper.

 

The analyses done by the researchers are extremely complex and time-consuming, and it is understandable that they may prefer to start out with the clearest case of comparing autism with typically-developing children. But the acid test of the suitability of the classifier for clinical use would be a demonstration that it could distinguish children with autism from unaffected siblings, and from nonautistic children with intellectual disability.

 

Reliability of measures

If you run a diagnostic test, an obvious question is whether you’d get the same result on a second test run.  With biological and psychological measures the answer is almost always no, but the key issue for a screener is just how much change there is. Gene expression levels could vary from occasion to occasion depending on time of day or what you’d eaten – I have no idea how important this might be, but it's not possible to evaluate in this paper, where measures come from a single blood sample. My personal view is that the whole field of biomedical research needs to wake up to the importance of reliability of measurement so that researchers don’t waste time exploring the predictive power of measures that may be too unreliable to be useful.  Information about stability of measures over time is a basic requirement for any diagnostic measure.
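As a purely hypothetical illustration of why this matters (nothing here uses the Bao et al data), the sketch below simulates a biomarker that genuinely separates two groups and then degrades its test-retest reliability by adding occasion-to-occasion noise; classification accuracy (AUC-ROC) falls accordingly.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 200                                          # hypothetical toddlers, half in each group
group = np.repeat([0, 1], n // 2)
true_score = group * 1.0 + rng.normal(size=n)    # biomarker with a genuine group difference

for reliability in (1.0, 0.8, 0.5):
    # add enough occasion-specific noise that the test-retest correlation
    # of the observed measure is roughly `reliability`
    noise_sd = np.sqrt(true_score.var() * (1 - reliability) / reliability)
    observed = true_score + rng.normal(scale=noise_sd, size=n)
    print(f"reliability {reliability:.1f}: AUC = {roc_auc_score(group, observed):.2f}")
```

The unreliable version of exactly the same biomarker classifies noticeably worse, which is why test-retest data on repeated blood samples would be so informative.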

 

A related issue concerns comparability of procedures for autism and TD groups. Were blood samples collected by the same clinicians over the same period and processed in the same lab for these two groups? Were the blood analyses automated and/or done blind? It’s crucial to be confident that minor differences in clinical or lab procedures do not bias results in this kind of study.

 

Overfitting

Overfitting is really just a polite way of saying that the data may be noise. If you run enough analyses, something is bound to look significant, just by chance. In the first step of the analysis, the researchers ran 42,840 models on “training” data from 93 autistic and 82 TD children and found 1,822 of them performed better than .80 on a measure that reflects diagnostic accuracy (AUC-ROC – the probability that a randomly chosen autistic child gets a higher classifier score than a randomly chosen TD child, which for groups of similar size roughly tracks the proportion correctly classified: .50 is chance, and 1.00 is perfect classification). So we can see that just over 4% of the models (1822/42840) performed this well.
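For readers unfamiliar with AUC-ROC, this toy sketch (random numbers, unrelated to the study) shows the interpretation used above: the AUC reported by standard software equals the proportion of case/control pairs in which the case gets the higher score.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
asd_scores = rng.normal(loc=1.0, size=1000)   # hypothetical classifier scores for cases
td_scores = rng.normal(loc=0.0, size=1000)    # and for controls

labels = np.concatenate([np.ones(1000), np.zeros(1000)])
scores = np.concatenate([asd_scores, td_scores])
print(roc_auc_score(labels, scores))          # AUC-ROC from sklearn

# ...which matches the proportion of case/control pairs where the case scores higher
print((asd_scores[:, None] > td_scores[None, :]).mean())
```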

 

The researchers were aware of the possibility of overfitting, and they addressed it head-on, saying: “To test this, we permuted the sample labels (i.e., ASD and TD) for all subjects in our Training set and ran the pipeline to test all feature engineering and classification methods. Importantly, we tested all 42,840 candidate models and found the median AUC-ROC score was 0.5101 with the 95th CI (0.42–0.65) on the randomized samples. As expected, only rare chance instances of good 'classification' occurred.”  The distribution of scores is shown in Figure 2b. 

 

Figure 2b from Bao et al (2022)
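To show what a label-permutation null looks like in practice, here is a stripped-down sketch with a single toy classifier on random “gene expression” data. It is nothing like the authors’ 42,840-model pipeline (the sample sizes are borrowed from their training set, but the 50 genes and the logistic-regression model are arbitrary choices of mine); it simply illustrates how shuffling the diagnostic labels generates the chance-level distribution of AUC-ROC that their permuted analysis reports.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
n_asd, n_td, n_genes = 93, 82, 50                 # group sizes echo the training set; 50 genes arbitrary
X = rng.normal(size=(n_asd + n_td, n_genes))      # pure-noise "expression" data
y = np.concatenate([np.ones(n_asd), np.zeros(n_td)])

def cv_auc(X, y):
    """Cross-validated AUC-ROC for one simple logistic-regression classifier."""
    prob = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                             cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, prob)

observed = cv_auc(X, y)                           # AUC with the real labels

# permutation null: shuffle the ASD/TD labels many times and refit
null_aucs = [cv_auc(X, rng.permutation(y)) for _ in range(200)]
print(f"observed AUC {observed:.2f}; permuted median {np.median(null_aucs):.2f}")
```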

 

They then ran a further analysis on a “test set” of 34 autistic and 31 TD children who had been held out of the original analysis, and found that 742 of the 1822 models performed better than .80 in classification. That’s 40% of the tested models. Assuming I have understood the methods correctly, that does look meaningful and hard to explain just in terms of statistical noise. In effect, they have run a replication study and found that a substantial subset of the identified models do continue to separate autism and TD groups when new children are considered. The claim is that there is substantial overlap between the models that fall in the right-hand tails of the red and pink distributions.
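A skeletal version of that two-stage logic, as I understand it, looks like this. The data are simulated, the “model space” is just pairs of genes, and the training AUC is computed by simple resubstitution rather than the far more elaborate feature-engineering pipeline in the paper; the point is only the structure: screen many candidate models on the training children, keep those above .80, then count how many survive the same threshold in held-out children.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n_train, n_test, n_genes = 175, 65, 20            # sizes echo the paper; 20 genes is arbitrary

def make_data(n):
    """Toy expression data in which genes 0-3 carry a genuine group difference."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, n_genes))
    X[:, :4] += 0.9 * y[:, None]
    return X, y

X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(n_test)

def auc_for(features, X_eval, y_eval):
    """Fit on the training children, evaluate on whichever set is supplied."""
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:, features], y_tr)
    return roc_auc_score(y_eval, clf.predict_proba(X_eval[:, features])[:, 1])

candidates = list(combinations(range(n_genes), 2))                  # 190 candidate "models"
kept = [c for c in candidates if auc_for(c, X_tr, y_tr) > 0.80]     # stage 1: training screen
survive = [c for c in kept if auc_for(c, X_te, y_te) > 0.80]        # stage 2: held-out check
print(len(candidates), len(kept), len(survive))
```

If the signal were pure noise, very few of the models kept at stage 1 would clear the bar again at stage 2; a high survival rate is what makes the authors’ 40% figure look like more than overfitting.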

 

The PubPeer commenter seems concerned that results look too good to be true. In particular, Figure 2b suggests the models perform a bit better in the test set than in the training set. But the figure shows the distribution of scores for all the models (not just the selected models) and, given the small sample sizes, the differences between distributions do not seem large to me. I was more surprised by the relatively tight distribution of AUC-ROC values obtained in the permutation analysis, as I would have anticipated some models would have given high classification accuracy just by chance in a sample of this size.

The researchers went on to present data for the set of models that achieved .8 classification in both training and test sets. This seemed a reasonable approach to me. The PubPeer commenter is correct in arguing that there will be some bias caused by selecting models this way, and that one would expect somewhat poorer performance in a completely new sample, but the 2-stage selection of models would seem to ensure there is not "massive overfitting". I think there would be a problem if only 4% of the 1822 selected models had given accurate classification, but the good rate of agreement between the models selected in the training and test samples, coupled with the lack of good models in the permuted data, suggests there is a genuine effect here.

 

Conclusion

So, in sum, I think that the results can’t just be attributed to overfitting, but I nevertheless have reservations about whether they would be useful for screening for autism.  And one of the first things I’d check if I were the researchers would be the reliability of the diagnostic classification in repeated blood samples taken on different occasions, as that would need to be high for the test to be of clinical use.

 

Note: I'd welcome comments or corrections on this post. Please note, comments are moderated to avoid spam, and so may not appear immediately. If you post a comment and it has not appeared in 24 hr, please email me and I'll ensure it gets posted. 

 PS. See comment from original PubPeer poster attached. 

Also, 8th Dec 2022, I added a further PubPeer comment asking authors to comment on Figure 2B, which does seem odd. 

https://pubpeer.com/publications/B693366B2B51D143C713359F151F7B#4 

 

Wednesday, 12 October 2022

What is going on in Hindawi special issues?

A guest blogpost by Nick Wise 

 http://www.eng.cam.ac.uk/profiles/nhw24


The Hindawi journal Wireless Communications and Mobile Computing is booming. Until a few years ago they published 100-200 papers a year; however, they published 269 papers in 2019, 368 in 2020 and 1,212 in 2021. So far in 2022 they have published 2,429. This growth has been achieved primarily by the creation of special issues, which makes sense. It would be nearly impossible for a journal to increase its publication rate by an order of magnitude in 2 years without outsourcing the massive increase in workload to guest editors.

Recent special issues include ‘Machine Learning Enabled Signal Processing Techniques for Large Scale 5G and 5G Networks’ (182 articles), ‘Explorations in Pattern Recognition and Computer Vision for Industry 4.0’ (244) and ‘Fusion of Big Data Analytics, Machine Learning and Optimization Algorithms for Internet of Things’ (204). Each of these special issues contains as many papers as the journal published in a year until recently. They also contain many papers that are flagged on Pubpeer for irrelevant citations, tortured phrases and surprising choices of corresponding email addresses.

However, I am going to focus on one special issue that is still open for submissions, and so far contains a modest 62 papers: ‘AI-Driven Wireless Energy Harvesting in Massive IoT for 5G and Beyond’, edited by Hamurabi Gamboa Rosales, Danijela Milosevic and Dijana Capeska Bogatinoska. Given the title of the special issue, it is perhaps surprising that only two of the articles contain ‘wireless’ in the title and none contain ‘energy’. The authors of the other papers (or whoever submitted them) appear to have realised that as long as they included the buzzwords ‘AI’, ‘IoT’ (Internet of Things) or ‘5G’ in the title, the paper could be about anything at all. Hence, the special issue contains titles such as:

  • Analysis Model of the Guiding Role of National Sportsmanship on the Consumer Market of Table Tennis and Related IoT Applications 
  • Evaluation Method of the Metacognitive Ability of Chinese Reading Teaching for Junior Middle School Students Based on Dijkstra Algorithm and IoT Applications 
  • The Construction of Shared Wisdom Teaching Practice through IoT Based on the Perspective of Industry-Education Integration

Of the 62 papers, 60 give Hamurabi Gamboa Rosales as the academic editor and 2 give Danijela Milosevic. Why is the distribution of labour so lopsided? One can imagine an arrangement where the lead editor does the admin of waving through irrelevant papers and the other 2 guest editors get to say that they’ve guest-edited a special issue on their CV.

Of course, in addition to boosting publication numbers for the authors and providing CV points for the guest editors, every paper in the special issue has a references section. Each reference gives someone a citation, another academic brownie point on which careers can be built. An anonymous Pubpeer sleuth has trawled through the references section of every paper in this special issue and found that Malik Bader Alazzam of Amman Arab University in Jordan has been cited 139 times across the 62 papers. The chance that the authors of almost every article would independently decide to cite the same person seems small.

The most intriguing fact about the papers in the special issue, however, is that only 4 authors give corresponding email addresses that match their affiliation. These 4 include the only 3 papers with non-Chinese authors. Of the other 58, 1 uses an email address from Guangzhou University, 6 use email addresses from Changzhou University, and 51 use email addresses from Ma’anshan University. All of the Ma’anshan addresses are of the form 1940XXXX@masu.edu.cn and many are nearly sequential, suggesting that someone somewhere purchased a block of sequential email addresses (you do not need to be at Ma’anshan University to have an @masu email address). The screenshot below shows a sample (the full dataset is linked here).

A subset of the titles from the special issue with their corresponding email addresses, all of the form 1940XXXX@masu.edu.cn
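For what it’s worth, the “nearly sequential” claim is easy to check mechanically. The sketch below uses made-up placeholder addresses in the same 1940XXXX@masu.edu.cn format (the real ones are in the linked dataset); a run of small gaps between the sorted numeric IDs is what you would expect from a block of accounts registered together.

```python
import re

# Made-up placeholder addresses in the 1940XXXX@masu.edu.cn format --
# substitute the real list from the linked dataset.
emails = ["19401111@masu.edu.cn", "19401112@masu.edu.cn",
          "19401115@masu.edu.cn", "19401123@masu.edu.cn"]

ids = sorted(int(re.match(r"(\d+)@", e).group(1)) for e in emails)
gaps = [b - a for a, b in zip(ids, ids[1:])]
print(gaps)   # consistently small gaps suggest a near-sequential block of accounts
```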

The use and form of the email addresses suggests that all of these papers are the work of a paper mill. It is hard to imagine otherwise how 51 different authors could submit papers to the same special issue using the same institutional email domain and format. Indeed, before 2022 only 2 papers had ever used @masu.edu.cn as a corresponding address according to Dimensions. It is equally hard to imagine how Hamurabi Gamboa Rosales is unaware. How can you not notice that, of the 19 papers you receive for your special issue on the 12th of July, 18 use the same email domain that doesn’t match their affiliation? This may also explain why Hamurabi has dealt with almost all the papers himself. This special issue should be closed for submissions and an investigation begun.

Stepping back from this special issue, this is not an isolated problem. There are at least 40 other papers published in Wireless Communications and Mobile Computing with corresponding emails from Ma’anshan, and Dimensions finds there are 46 in Computational Intelligence and Neuroscience, 38 in Computational and Mathematical Methods in Medicine and 30 in Mobile Information Systems, all published in 2022 and all in Hindawi journals. What are the chances that 18404032@masu.edu.cn is used in a special issue in Computational Intelligence and Neuroscience, 18404038@masu.edu.cn in Disease Markers and 18404041@masu.edu.cn in Wireless Communications and Mobile Computing?

Finally, masu.edu.cn is only one example of a commonly used email domain that doesn’t match the author’s affiliation. It is conceivable that the entire growth in publications of Wireless Communications and Mobile Computing, Computational Intelligence and Neuroscience (163 articles in 2020, 3,079 in 2022) and Computational and Mathematical Methods in Medicine (225 in 2020, 1,488 in 2022) is from paper mills publishing in corrupted special issues.


Nick Wise


*All numbers accurate as of the 12th October 2022.

Tuesday, 4 October 2022

A desire for clickbait can hinder an academic journal's reputation

 


On 28th September, I woke up, looked at Twitter, and found Pete Etchells fulminating about a piece in the Guardian.

It was particularly galling for him to read a piece that implied research studies had shown that voice-responsive devices were harming children’s development when he and Amy Orben had provided comments to the Science Media Centre that were available to the journalist. They both noted that: 

a) This was a Viewpoint piece, not new research 

b) Most of the evidence it provided consisted of anecdotes from newspaper articles

I agreed with Pete’s criticism of the Guardian, but having read the original Viewpoint in the Archives of Disease in Childhood, I had another question, namely, why on earth was a reputable paediatrics journal doing a press release on a flimsy opinion piece written by two junior medics with no track record in the area? 

So I wrote to the Editor with my concerns, as follows: 

Dear Dr Brown 

Viewpoint: Effects of smart voice control devices on children: current challenges and future perspectives doi 10.1136/archdischild-2022-323888 Journal: Archives of Disease in Childhood  

I am writing to enquire why this Viewpoint was sent out to the media under embargo as if it was a substantial piece of new research. I can understand that you might want to publish less formal opinion pieces from time to time, but what I cannot understand is the way this was done to attract maximum publicity by the media. 

The two people who commented about it for the Science Media Centre both noted this was an opinion piece with no new evidence, relying mainly on media reports. 

https://www.sciencemediacentre.org/expert-reaction-to-an-opinion-piece-on-voice-controlled-devices-and-child-development/ 

Unfortunately, despite this warning, it has been picked up by the mainstream media, where it is presented as ‘new research’, which will no doubt give parents of young children something new to worry about. 

I checked out the authors, and found these details: 

https://orcid.org/0000-0003-4881-8293 

https://www.researchgate.net/profile/Ananya-Arora-3 

These confirm that neither has a strong research track record, or any evidence of expertise in the topic of the Viewpoint. I can only assume that ADC is desperate for publicity at any cost, regardless of scientific evidence or impact on the public. 

As an Honorary Fellow of the Royal College of Paediatrics and Child Health, and someone who has previously published in ADC, I am very disappointed to see the journal sink so low. 

Yesterday I got a reply that did nothing to address my concerns. Here’s what the editor, Nick Brown*, said (in italic), with my reactions added: 

Thank you for making contact . My response reflects the thoughts of both the BMJ media and publication departments  

Given my reactions, below, this is more worrying than reassuring. It would be preferable to have heard that there had been some debate as to the wisdom of promoting this article to the press. 

It is a key role of a scientific journal to raise awareness of, and stimulate debate on, live and emerging issues. Voice control devices are becoming increasingly common and their impact on children's development is a legitimate topic of discussion.  

I have no quarrel with the idea that the impact of voice control devices on children is a legitimate topic for the journal. But I wonder about how far its role is ‘raising awareness of, and stimulating debate’ when the topic is one on which we have very little evidence. A scientific journal might be expected to provide a balanced account of evidence, whereas the Viewpoint presented one side of the ‘debate’, mainly using anecdotes. I doubt it would have been published if it had concluded that there was no negative impact of voice control devices.

Opinion pieces are part of a very wide range of content that is selected for press release from among BMJ's portfolio of journals. They are subject to internal review in line with BMJ journals' overall editorial policy: the process (intentionally) doesn't discriminate against authors who don't have a strong research track record in a particular field

I’ve been checking up on how frequently ADC promotes an article for press release. This information can be obtained here. This year, they have published 219 papers, of which three other articles have merited a press release: an analysis of survey data on weight loss (July), a research definition of Long Covid in children (February) and a data-based analysis of promotional claims about baby food (February). Many of the papers that were not press-released are highly topical and of general interest – a quick scan found papers on vaping, monkeypox, transgender adolescents, unaccompanied minors as asylum seekers, as well as many papers relating to Covid. It’s frankly baffling why this weakly evidenced Viewpoint was singled out as meriting special treatment with a press release.

As for the press release pathway itself, all potential pieces are sent out under embargo, irrespective of article type. This maximises the chances of balanced coverage: an embargo period enables journalists to contact the authors with any queries and to contact other relevant parties for comment. 

My wording may have been clumsy here and led to misunderstanding. My concern was more with the fact that the paper was press-released, which is, as established above, highly unusual, rather than with the embargo.  

The press release clearly stated (3 times) this article was a viewpoint and not new research, and that it hadn't been externally peer reviewed. We also always include a direct URL link to the article in question in our press releases so that journalists can read the content in full for themselves. 

I agree that the press release included these details, and indeed, had journalists consulted the Science Media Centre’s commentaries, the lack of peer review and data would have been evident. But nevertheless, it’s well-known that (a) journalists seldom read original sources, and (b) some of the less reputable newspapers are looking for  clickbait, so why provide them with the opportunity for sensationalising journal content?

While we do all we can to ensure that journalists cover our content responsibly, we aren't responsible for the manner in which they choose to do so. 

I agree that part of the blame for the media coverage lies with journalists. But I think the journal must bear some responsibility for the media uptake of the article. It’s a reasonable assumption that if a reputable journal issues a press release, it’s because the article in question is important and provides novel information from recognised experts in the field. It is unfortunate that that assumption was not justified in that case. 

I just checked to see how far the media interest in the story had developed. The Guardian, confronted with criticism, changed the lede to say “Researchers suggest”, rather than “New research says”, but the genie was well out of the bottle by that time. The paper has an Altmetric ‘attention’ score of 1577, and has been picked up by 209 news outlets. There’s no indication that the article has “stimulated debate”. Rather, it has been interpreted as providing a warning about a new danger facing children. The headlines, which can be found here, are variants of:

 “Alexa and Siri make children rude” 

“Siri, Alexa and Google Home could hinder children’s social and cognitive development” 

“Voice-control devices may have an impact on children’s social, emotional development: Study” 

“According to a study, voice-controlled electronic aides can impair children’s development” 

“Experts warn that AI assistants affect children’s social development” 

“Experts warn AI assistants affect social growth of children” 

“Why Alexa and Siri may damage kids’ social and emotional development” 

“Voice assistants harmful for your child’s development, claims study” 

“Alexa, Siri, and Other Voice Assistants could negatively rewire your child’s brain” 

“Experts warn using Alexa and Siri may be bad for children” 

“Parents issued stark warning over kids using Amazon’s Alexa” 

“Are Alexa and Siri making our children DUMB?” 

“Use of voice-controlled devices ‘might have long-term consequences for children’” 

And most alarmingly, from the Sun: 

“Urgent Amazon Alexa warning for ALL parents as new danger revealed” 

Maybe the journal’s press office regards that as a success. I think it’s a disaster for the journal’s reputation as a serious academic journal. 

 

*Not the sleuth Nick Brown. Another one. 

Friday, 30 September 2022

Reviewer-finding algorithms: the dangers for peer review

 


Last week many words were written for Peer Review Week, so you might wonder whether there is anything left to say. I may have missed something, but I think I do have a novel take on this, namely to point out that some recent developments in automation may be making journals vulnerable to fake peer review. 

Finding peer reviewers is notoriously difficult these days. Editors are confronted with a barrage of submissions, many outside their area. They can ask authors to recommend peer reviewers, but this raises concerns of malpractice, if authors recommend friends, or even individuals tied up with paper mills, who might write a positive review in return for payment.

One way forward is to harness the power of big data to identify researchers who have a track record of publishing in a given area. Many publishers now use such systems. This way a journal editor can select from a database of potential reviewers that is formed by identifying papers with some overlap to a given submission.

I have become increasingly concerned, however, that use of algorithmically-based systems might leave a journal vulnerable to fraudulent peer reviewers who have accumulated publications by using paper mills. I became interested in this when submitting work to Wellcome Open Research and F1000, where open peer review is used, but it is the author rather than an editor who selects reviewers. Clearly, with such a system, one needs to be careful to avoid malpractice, and strict criteria are imposed. As explained here,  reviewers need to be:
  1. Qualified: typically hold a doctorate (PhD/MD/MBBS or equivalent). 
  2. Expert: have published at least three articles as lead author in a relevant topic, with at least one article having been published in the last five years. 
  3. Impartial: No competing interests and no co-authorship or institutional overlap with current authors. 
  4. Global: geographically diverse and from different institutions. 
  5. Diverse: in terms of gender, geographic location and career stage.

Unfortunately, now that we have paper mills, which allow authors, for a fee, to generate and publish a large number of fake papers, these criteria are inadequate. Consider the case of Mohammed Sahab Uddin, who features in this account in Retraction Watch. As far as I am aware, he does not have a doctorate*, but I suspect people would be unlikely to query the qualifications of someone who had 137 publications and an H-index of 37. By the criteria above, he would be welcomed as a reviewer from an underrepresented location. And indeed, he was frequently used as a reviewer: Leslie McIntosh, who unmasked Uddin’s deception, noted that before he wiped his Publons profile, he had been listed as a reviewer on 300 papers. 

This is not an isolated case. We are only now beginning to get to grips with the scale of the problem of paper mills. There are undoubtedly many other cases of individuals who are treated as trusted reviewers on the back of fraudulent publications. Once in positions of influence, they can further distort the publication process. As I noted in last week's blogpost, open peer review offers a degree of defence against this kind of malpractice, as readers will at least be able to evaluate the peer review, but it is disturbing to consider how many dubious authors will have already found themselves promoted to positions of influence based on their apparently impressive track record of publishing, reviewing and even editing.

I started to think about how this might interact with other moves to embrace artificial intelligence. A recent piece in Times Higher Education stated: “Research England has commissioned a study of whether artificial intelligence could be used to predict the quality of research outputs based on analysis of journal abstracts, in a move that could potentially remove the need for peer review from the Research Excellence Framework (REF).” This seems to me to be the natural endpoint of the move away from trusting the human brain in the publication process. We could end up with a system where algorithms write the papers, which are attributed to fake authors, peer reviewed by fake peer reviewers, and ultimately evaluated in the Research Excellence Framework by machines. Such a system is likely to be far more successful than mere mortals, as it will be able to rapidly and flexibly adapt to changing evaluation criteria. At that point, we will have dispensed with the need for human academics altogether and have reached peak academia.

 *Correction 30/9/22: Leslie McIntosh tells me he does have a doctorate and was working on a postdoc.

Sunday, 11 September 2022

So do we need editors?

It’s been an interesting week in world politics, and I’ve been distracting myself by pondering the role of academic editors. The week kicked off with a rejection of a preprint written with co-author Anna Abalkina, who is an expert sleuth who tracks down academic paper mills – organisations that will sell you a fake publication in an academic journal. Our paper describes a paper mill that had placed six papers in the Journal of Community Psychology, a journal which celebrated its 50th anniversary in 2021. We had expected rejection: we submitted the paper to the Journal of Community Psychology itself as a kind of stress test to see whether the editor, Michael B. Blank, actually reads papers that he accepts for the journal. I had started to wonder, because you can read his decision letters on Publons, and they are identical for every article he accepts. I suspected he might be an instance of Editoris Machina, or automaton, one who just delegates editorial work to an underling, waits until reviewer reports converge on a recommendation, and then accepts or rejects accordingly without actually reading the paper. I was wrong, though. He did read our paper, and rejected it with the comment that it was a superficial analysis of six papers. We immediately posted it as a preprint and plan to publish it elsewhere.

Although I was quite amused by all of this, it has a serious side. As we note in our preprint, when paper mills succeed in breaching the defences of a journal, this is not a victimless crime. First, it gives competitive advantage to the authors who paid the paper mill – they do this in order to have a respectable-looking publication that will help their career. I used to think this was a minor benefit, but when you consider that the paper mills can also ensure that the papers they place are heavily cited, you start to realise that authors can edge ahead on conventional indicators of academic prestige, while their more honest peers trail behind. The second set of victims are those who publish in the journal in good faith. Once its reputation is damaged by the evidence that there is no quality control, then all papers appearing in the journal are tainted by association. The third set of victims are busy academics who are trying to read and integrate the literature, who can get tangled up in the weeds as they try to navigate between useful and useless information. And finally, we need to be concerned about the cynicism induced in the general public when they realise that for some authors and editors, the whole business of academic publishing is a game, which is won not by doing good science, but by paying someone to pretend you have done so.

Earlier this week I shared my thoughts on the importance of ensuring that we have some kind of quality control over journal editors. They are, after all, the gatekeepers of science. When I wrote my taxonomy of journal editors back in 2010, I was already concerned at the times I had to deal with editors who were lazy or superficial in their responses to authors. I had not experienced ‘hands off’ editors in the early days of my research career, and I wondered how far this was a systematic change over time, or whether it was related to subject area. In the 1970s and 1980s, I mostly published in journals that dealt with psychology and/or language, and the editors were almost always heavily engaged with the paper, adding their own comments and suggesting how reviewer comments might be addressed. That’s how I understood the job when I myself was an editor. But when I moved to publishing work in journals that were more biological (genetics, neuroscience) things seemed different, and it was not uncommon to find editors who really did nothing more than collate peer reviews.

The next change I experienced was when, as a Wellcome-funded researcher, I started to publish in Wellcome Open Research (WOR), which adopts a very different publication model, based on that initiated by F1000. In this model, there is no academic editor. Instead, the journal employs staff who check that the paper complies with rigorous criteria: the proposed peer reviewers must have a track record of publishing and be free of conflicts of interest. Data and other materials must be openly available so that the work can be reproduced. And the peer review is published. The work is listed on PubMed if and when peer reviewers agree that it meets a quality threshold: otherwise the work remains visible but with status shown as not approved by peer review.

The F1000/WOR model shows that editors are not needed, but I generally prefer to publish in journals that do have academic editors – provided the editor is engaged and does their job properly. My papers have benefitted from input from a wise and experienced editor on many occasions. In a specialist journal, such an editor will also know who are the best reviewers – those who have the expertise to give a detailed and fair appraisal of the work. However, in the absence of an engaged editor, I prefer the F1000/WOR model, where at least everything is transparent. The worst of all possible worlds is when you have an editor who doesn’t do more than collate peer reviews, but where everything is hidden: the outside world cannot know who the editor was, how decisions were made, who did the reviews, and what they said. Sadly, this latter situation seems to be pretty common, especially in the more biological realms of science. To test my intuitions, I ran a little Twitter poll for different disciplines, asking, for instance: 

 
 
 Results are below

% respondents stating Not Read, Read Superficially, or Read in Depth



 

Such polls of course have to be taken with a pinch of salt, as the respondents are self-selected, and the poll allows only very brief questions with no nuance. It is clear that within any one discipline, there is wide variability in editorial engagement. Nevertheless, I find it a matter of concern that in all areas, some respondents had experienced a journal editor who did not appear to have read the paper they had accepted, and in areas of biomedicine, neuroscience, and genetics, and also in mega journals, this was as high as around 25-33%.

So my conclusion is that it is not surprising that we are seeing phenomena like paper mills, because the gatekeepers of the publication process are not doing their job. The solution would be either to change the culture for editors, or, where that is not feasible, to accept that we can do without editors. But if we go down that route, we should move to a model such as F1000 with much greater quality control over reviewers and COI, and much more openness and transparency.

 As usual comments are welcome: if you have trouble getting past comment moderation, please email me.

Tuesday, 6 September 2022

We need to talk about editors


Editoris spivia

The role of journal editor is powerful: you decide what is accepted or rejected for publication. Given that publications count as an academic currency – indeed in some institutions they are literally fungible – a key requirement for editors is that they are people of the utmost integrity. Unfortunately, there are few mechanisms in place to ensure editors are honest – and indeed there is mounting evidence that many are not. I argue here that we can no longer take editorial honesty for granted, and systems need to change to weed out dodgy editors if academic publishing is to survive as a useful way of advancing science. In particular, the phenomenon of paper mills has shone a spotlight on editorial malpractice.

Questionable editorial practices

Back in 2010, I described a taxonomy of journal editors based on my own experience as an author over the years. Some were negligent, others were lordly, and others were paragons – the kind of editor we all want, who is motivated solely by a desire for academic excellence, who uses fair criteria to select which papers are published, who aims to help an author improve their work, and provides feedback in a timely and considerate fashion. My categorisation omitted another variety of editor that I have sadly become acquainted with in the intervening years: the spiv. The spiv has limited interest in academic excellence: he or she sees the role of editor as an opportunity for self-advancement. This usually involves promoting the careers of self or friends by facilitating publication of their papers, often with minimal reviewing, and in some cases may go as far as working hand in glove with paper mills to receive financial rewards for placing fraudulent papers.

When I first discovered a publication ring that involved journal editors scratching one another’s backs, in the form of rapid publication of each other’s papers, I assumed this was a rare phenomenon. After I blogged about this, one of the central editors was replaced, but others remained in post. 

I subsequently found journals where the editor-in-chief authored an unusually high percentage of the articles published in the journal. I drew these to the attention of integrity advisors of the publishers that were involved, but did not get the impression that they regarded this as particularly problematic or were going to take any action about it. Interestingly, there was one editor, George Marcoulides, who featured twice in a list of editors who authored at least 15 articles in their own journal over a five year period. Further evidence that he equates his editorial role with omnipotence came when his name cropped up in connection with a scandal where a reviewer, Fiona Fidler, complained after she found her positive report on a paper had been modified by the editor to justify rejecting the paper: see this Twitter thread for details. It appears that the publishers regard this as acceptable: Marcoulides is still editor-in-chief at the Sage journal Educational and Psychological Measurement, and at Taylor and Francis’ Structural Equation Modeling, though his rate of publishing in both journals has declined since 2019; maybe someone had a word with him to explain that publishing most of your papers in a journal you edit is not a good look.

Scanff et al (2021) did a much bigger investigation of what they termed “self-promotion journals” - those that seemed to be treated as the personal fiefdom of editors, who would use the journal as an outlet for their own work. This followed on from a study by Locher et al (2021), which found editors who were ready to accept papers by a favoured group of colleagues with relatively little scrutiny. This had serious consequences when low-quality studies relating to the Covid-19 pandemic appeared in the literature and subsequently influenced clinical decisions. Editorial laxness appears in this case to have done real harm to public health.

So, it's doubtful that all editors are paragons. And this is hardly surprising: doing a good job as editor is hard and often thankless work. On the positive side, an editor may obtain kudos for being granted an influential academic role, but often there is little or no financial reimbursement for the many hours that must be dedicated to reading and evaluating papers, assigning reviewers, and dealing with fallout from authors who react poorly to having their papers rejected. Even if an editor starts off well, they may over time start to think “What’s in this for me?” and decide to exploit the opportunities for self-advancement offered by the position. The problem is that there seems little pressure to keep them on the straight and narrow; it's like when a police chief is corrupt. Nobody is there to hold them to account. 

Paper mills

Many people are shocked when they read about the phenomenon of academic paper mills – defined in a recent report by the Committee on Publication Ethics (COPE) and the Association of Scientific, Technical and Medical Publishers (STM) as “the process by which manufactured manuscripts are submitted to a journal for a fee on behalf of researchers with the purpose of providing an easy publication for them, or to offer authorship for sale.” The report stated that “the submission of suspected fake research papers, also often associated with fake authorship, is growing and threatens to overwhelm the editorial processes of a significant number of journals.” It concluded with a raft of recommendations to tackle the problem from different fronts: changing the incentives adopted by institutions, investment in tools to detect paper mill publications, education of editors and reviewers to make them aware of paper mills, introduction of protocols to impede paper mills succeeding, and speeding up the process of retraction by publishers.

However, no attention was given to the possibility that journal editors may contribute to the problem: there is talk of “educating” them to be more aware of paper mills, but this is not going to be effective if the editor is complicit with the paper mill, or so disengaged from editing as to not care about them. 

It’s important to realise that not all paper mill papers are the same. Many generate outputs that look plausible. As Byrne and Labbé (2017) noted, in biomedical genetic studies, fake papers are generated from a template that is based on a legitimate paper, and just vary in terms of the specific genetic sequence and/or phenotype that is studied. There are so many genetic sequences and phenotypes that the number of possible combinations of these is immense. In such cases, a diligent editor may get tricked into accepting a fake paper, because the signs of fakery are not obvious and aren’t detected by reviewers. But at the other extreme, some products of paper mills are clearly fabricated. The most striking examples are those that contain what Guillaume Cabanac and colleagues term “tortured phrases”. These appear to be generated by taking segments of genuine articles and running them through an AI app that will use a thesaurus to alter words, with the goal of evading plagiarism detection software. In other cases, the starting point appears to be text from an essay mill. The results are often bizarre and so incomprehensible that one only needs to read a few sentences to know that something is very wrong. Here’s an example from Elsevier’s International Journal of Biological Macromolecules, which those without access can pay $31.50 for (see analysis on Pubpeer, here).

"Wound recuperating camwood a chance to be postponed due to the antibacterial reliance of microorganisms concerning illustration an outcome about the infection, wounds are unable to mend appropriately, furthermore, take off disfiguring scares [150]. Chitin and its derivatives go about as simulated skin matrixes that are skilled should push a fast dermal redesign after constantly utilized for blaze treatments, chitosan may be wanton toward endogenous enzymes this may be a fundamental preference as evacuating those wound dressing camwood foundation trauma of the wounds and harm [151]. Chitin and its derivatives would make a perfect gas dressing. Likewise, they dampen the wound interface, are penetrability will oxygen, furthermore, permit vaporous exchange, go about as a boundary with microorganisms, and are fit about eliminating abundance secretions"

And here’s the start of an Abstract from a Springer Nature collection called Modern Approaches in Machine Learning and Cognitive Science (see here for some of the tortured phrases that led to detection of this article). The article can be yours for £19.95:

“Asthma disease are the scatters, gives that influence the lungs, the organs that let us to inhale and it’s the principal visit disease overall particularly in India. During this work, the matter of lung maladies simply like the trouble experienced while arranging the sickness in radiography are frequently illuminated. There are various procedures found in writing for recognition of asthma infection identification. A few agents have contributed their realities for Asthma illness expectation. The need for distinguishing asthma illness at a beginning period is very fundamental and is an exuberant research territory inside the field of clinical picture preparing. For this, we’ve survey numerous relapse models, k-implies bunching, various leveled calculation, characterizations and profound learning methods to search out best classifier for lung illness identification. These papers generally settlement about winning carcinoma discovery methods that are reachable inside the writing.”

These examples are so peculiar that even a layperson could detect the problem. In more technical fields, the fake paper may look superficially normal, but is easy to spot by anyone who knows the area, and who recognises that the term “signal to noise” does not mean “flag to commotion”, or that while there is such a thing as a “Swiss albino mouse” there is no such thing as a “Swiss pale-skinned person mouse”. These errors are not explicable as failures of translation by someone who does not speak good English. They would be detected by any reviewer with expertise in the field. Another characteristic of paper mill outputs, featured in this recent blogpost, is fake papers that combine tables and figures from different publications in nonsensical contexts.

Sleuths who are interested in unmasking paper mills have developed automated methods for identifying such papers, and the number is both depressing and astounding. As we have seen, though some of these outputs appear in obscure sources, many crop up in journals or edited collections that are handled by the big scientific publishing houses, such as Springer Nature, Elsevier and Wiley. When sleuths find these cases, they report the problems on the website PubPeer, and this typically raises an incredulous response as to how on earth this material got published. It’s a very good question, and the answer has to be that somehow an editor let this material through. As explained in the COPE&STM report, sometimes a nefarious individual from a paper mill persuades a journal to publish a “special issue” and the unwitting journal is then hijacked and turned into a vehicle for publishing fraudulent work. If the special issue editor poses as a reputable scientist, using a fake email address that looks similar to the real thing, this can be hard to spot.

But in other cases, we see clearcut instances of paper mill outputs that have apparently been approved by a regular journal editor. In a recent preprint, Anna Abalkina and I describe finding putative paper mill outputs in a well-established Wiley journal, the Journal of Community Psychology. Anna identified six papers in the journal in the course of a much larger investigation of papers that came from faked email addresses. For five of them the peer review and editorial correspondence was available on Publons. The papers,  from addresses in Russia or Kazakhstan, were of very low quality and frequently opaque. I had to read and re-read to work out what the paper was about, and still ended up uncertain. The reviewers, however, suggested only minor corrections. They used remarkably similar language to one another, giving the impression that the peer review process had been compromised. Yet the Editor-in-Chief, Michael B. Blank, accepted the papers after minor revisions, with a letter concluding: “Thank you for your fine contribution”. 

There are two hypotheses to consider when a journal publishes incomprehensible or trivial material: either the editor was not doing their job of scrutinising material in the journal, or they were in cahoots with a paper mill. I wondered whether the editor was what I have previously termed an automaton – one who just delegates all the work to a secretary. After all, authors are asked to recommend reviewers, so all that is needed is for someone to send out automated requests to review, and then keep going until there are sufficient recommendations to either accept or reject. If that were the case, then maybe the journal would accept a paper by us. Accordingly, we submitted our manuscript about paper mills to the Journal of Community Psychology. But it was desk rejected by the Editor in Chief with a terse comment: “This a weak paper based on a cursory review of six publications”. So we can reject hypothesis 1 – that the editor is an automaton. But that leaves hypothesis 2 – that the editor does read papers submitted to his journal, and had accepted the previous paper mill outputs in full knowledge of their content. This raises more questions than it answers. In particular, why would he risk his personal reputation and that of his journal by behaving that way? But perhaps rather than dwelling on that question, we should think positively about how journals might protect themselves in future from attacks by paper mills.

A call for action

My proposal is that, in addition to the useful suggestions from the COPE&STM report, we need additional steps to ensure that those with editorial responsibility are legitimate and are doing their job. Here are some preliminary suggestions:

  1. Appointment to the post of editor should be made in open competition among academics who meet specified criteria.
  2. It should be transparent who is responsible for final sign-off for each article that is published in the journal.
  3. Journals where a single editor makes the bulk of editorial decisions should be discouraged. (N.B. I looked at the 20 most recent papers in Journal of Community Psychology that featured on Publons and all had been processed by Michael B. Blank).
  4. There should be an editorial board consisting of reputable people from a wide range of institutional backgrounds, who share the editorial load, and meet regularly to consider how the journal is progressing and to discuss journal business.
  5.  Editors should be warned about the dangers of special issues and should not delegate responsibility for signing off on any papers appearing in a special issue.
  6. Editors should be required to follow COPE guidelines about publishing in their own journal, and publishers should scrutinise the journal annually to check whether the recommended procedures were followed.
  7. Any editor who allows gibberish to be published in their journal should be relieved of their editorial position immediately.

Many journals run by academic societies already adopt procedures similar to these. Particular problems arise when publishers start up new journals to fill a perceived gap in the market, and there is no oversight by academics with expertise in the area. The COPE&STM report has illustrated how dangerous that can be – both for scientific progress and for the reputation of publishers.

Of course, when one looks at this list of requirements, one may start to wonder why anyone would want to be an editor. Typically there is little financial remuneration, and the work is often done in a person’s “spare time”. So maybe we need to rethink how that works, so that paragons with a genuine ability and interest in editing are rewarded more adequately for the important work they do.

P.S. Comment moderation is enabled for this blog to prevent it being overwhelmed by spam, but I welcome comments, and will check for these in the weeks following the post, and admit those that are on topic. 

 

Comment by Jennifer Byrne, 9th Sept 2022 

(this comment by email, as Blogger seems to eat comments by Jennifer for some reason, while letting through weird spammy things!).

This is a fantastic list of suggestions to improve the important contributions of journal editors. I would add that journal editors should be appointed for defined time periods, and their contributions regularly reviewed. If for any reason it becomes apparent that an editor is not in a position to actively contribute to the journal, they should be asked to step aside. In my experience, editorial boards can include numerous inactive editors. These can provide the appearance of a large, active and diverse editorial board, when in practice, the editorial work may be conducted by a much smaller group, or even one person. Journals cannot be run successfully without a strong editorial team, but such teams require time and resources to establish and maintain.

Tuesday, 9 August 2022

Can systematic reviews help clean up science?

 

The systematic review was not turning out as Lorna had expected

Why do people take the risk of publishing fraudulent papers, when it is easy to detect the fraud? One answer is that they don’t expect to be caught. A consequence of the growth in systematic reviews is that this assumption may no longer be safe. 

In June I participated in a symposium organised by the LMU Open Science Center in Munich entitled “How paper mills publish fake science industrial-style – is there really a problem and how does it work?” The presentations are available here. I focused on the weird phenomenon of papers containing “tortured phrases”, briefly reviewed here. For a fuller account see here. These are fakes that are easy to detect, because, in the course of trying to circumvent plagiarism detection software, they change words, with often unintentionally hilarious consequences. For instance, “breast cancer” becomes “bosom peril” and “random value” becomes “irregular esteem”. Most of these papers make no sense at all – they may include recycled figures from other papers. They are typically highly technical and so to someone without expertise in the area they may seem valid, but anyone familiar with the area will realise that someone who writes “flag to commotion” instead of “signal to noise” is a hoaxer. 
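Detection can be surprisingly low-tech. As a toy illustration of the principle (this is my own sketch, not the automated tools built by Guillaume Cabanac and colleagues, which as I understand it work from a much larger curated list of such fingerprints), a simple scan for known substitutions is enough to flag this kind of text:

```python
# A few tortured phrases mentioned in this post and the previous ones,
# paired with the standard terms they mangle.
TORTURED = {
    "bosom peril": "breast cancer",
    "irregular esteem": "random value",
    "flag to commotion": "signal to noise",
}

def flag_tortured_phrases(text: str) -> list[str]:
    """Return any known tortured phrases that appear in the text."""
    lowered = text.lower()
    return [phrase for phrase in TORTURED if phrase in lowered]

print(flag_tortured_phrases(
    "The flag to commotion proportion for bosom peril detection was low."))
```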

Speakers at the symposium drew attention to other kinds of paper mill whose output is less conspicuously weird. Jennifer Byrne documented industrial-scale research fraud in papers on single gene analyses that were created by templates, and which purported to provide data on under-studied genes in human cancer models. Even an expert in the field may be hoodwinked by these. I addressed the question of “does it matter?” For the nonsense papers generated using tortured phrases, it could be argued that it doesn’t, because nobody will try to build on that research. But there are still victims: authors of these fraudulent papers may outcompete other, honest scientists for jobs and promotion, journals and publishers will suffer reputational damage, and public trust in science is harmed. But what intrigued me was that the authors of these papers may also be regarded as victims, because they will have on public record a paper that is evidently fraudulent. It seems that either they are unaware of just how crazy the paper appears, or that they assume nobody will read it anyway. 

The latter assumption may have been true a couple of decades ago, but with the growth of systematic reviews, researchers are scrutinizing many papers that previously would have been ignored. I was chatting with John Loadsman, who in his role as editor of Anaesthesia and Intensive Care has uncovered numerous cases of fraud. He observed that many paper mill outputs never get read because, just on the basis of the title or abstract, they appear trivial or uninteresting. However, when you do a systematic review, you are supposed to read everything relevant to the research question, and evaluate it, so these odd papers may come to light. 

I’ve previously blogged about the importance of systematic reviews for avoiding cherrypicking of the literature. Of course, evaluation of papers is often done poorly or not at all, in which case the fraudulent papers just pollute the literature when added to a meta-analysis. But I’m intrigued at the idea that systematic reviews might also serve the purpose of putting the spotlight on dodgy science in general, and fraudsters in particular, by forcing us to read things thoroughly. I therefore asked Twitter for examples – I asked specifically about meta-analysis but the responses covered systematic reviews more broadly, and were wide-ranging both in the types of issue that were uncovered and the subject areas. 

Twitter did not disappoint: I received numerous examples – more than I can include here. Much of what was described did not sound like the work of paper mills, but did include fraudulent data manipulation, plagiarism, duplication of data in different papers, and analytic errors. Here are some examples: 

Paper mills and template papers

Jennifer Byrne noted how she became aware of paper mills when looking for studies of a particular gene she was interested in, which was generally under-researched. Two things raised her suspicions: a sudden spike in studies of the gene, plus a series of papers that had the same structure, as if constructed from a template. Subsequently, with Cyril Labbé, who developed an automated Seek & Blastn tool to assess nucleotide sequences, she found numerous errors in the reagents and in the specification of genetic sequences in these repetitive papers, and it became clear that they were fraudulent.

One systematic review that uncovered a startling level of inadequate and possibly fraudulent research focused on the effect of tranexamic acid on post-partum haemorrhage: out of 26 reports, eight had sections of identical or very similar text, despite apparently coming from different trials. This is similar to what has been described for papers from paper mills, which are constructed from a template. And, as might be expected for paper mill output, there were also numerous statistical and methodological errors, and some cases without ethical approval. (Thanks to @jd_wilko for pointing me to this example).
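
For anyone curious how such textual overlap might be screened for, here is a minimal sketch, assuming only the Python standard library; the 0.8 threshold and the input names are invented for illustration, and this is not the procedure used in the review described above.

# Rough sketch: flag pairs of trial reports whose text is suspiciously similar.
# The similarity threshold is an arbitrary illustration, not a validated cut-off.
from difflib import SequenceMatcher
from itertools import combinations

def similar_pairs(reports, threshold=0.8):
    """Yield (report_a, report_b, similarity) for pairs above the threshold."""
    for (name_a, text_a), (name_b, text_b) in combinations(reports.items(), 2):
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            yield name_a, name_b, ratio

# Example with invented inputs:
# for a, b, r in similar_pairs({"trial_1": text_1, "trial_2": text_2}):
#     print(f"{a} and {b} are {r:.0%} similar")

Dedicated plagiarism detection software does this job far better, but even a crude comparison like this can flag reports that merit a closer look.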

Plagiarism 

Back in 2006, Iain Chalmers, who is generally ahead of his time, noted that systematic reviews could root out cases of plagiarism, citing the example of Asim Kurjak, whose paper on epidural analgesia in labour was heavily plagiarised.

Data duplication 

Meta-analysis can throw up cases where the same study is reported in two or more papers, with no indication that it is the same data. Although this might seem like a minor problem compared with fraud, it can be serious, because if the duplication is missed in a meta-analysis, that study will be given more weight than it should have. Ioana Cristea noted that such ‘zombie papers’ have cropped up in a meta-analysis she is currently working on.
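
To see why duplication matters numerically, here is a small worked example with made-up numbers, assuming a simple fixed-effect (inverse-variance weighted) pooled estimate: counting the same study twice doubles its weight and pulls the pooled result towards it.

# Worked example with invented numbers: duplicating a study in a fixed-effect
# (inverse-variance weighted) meta-analysis gives it twice the weight it deserves.
def pooled_estimate(effects, variances):
    """Fixed-effect pooled estimate using inverse-variance weights."""
    weights = [1 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

studies = [(0.10, 0.04), (0.50, 0.04)]      # (effect size, variance) per study
duplicated = studies + [studies[1]]         # the second study counted twice

print(pooled_estimate(*zip(*studies)))      # 0.30
print(pooled_estimate(*zip(*duplicated)))   # ~0.37: the duplicate drags the estimate up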

Tampering with peer review 

When a paper considered for a meta-analysis seems dubious, it raises the question of whether proper peer review procedures were followed. It helps if the journal adopts open peer review. Robin N. Kok reported a paper in which the same person was listed both as an author and as a peer reviewer; the paper was eventually retracted.

Data seem too good to be true 

This piece in Science tells the story of Qian Zhang, who published a series of studies on the impact of cartoon violence on children which, on the one hand, had remarkably large samples of children all of the same age, and, on the other, had similar samples across apparently different studies. Because of their enormous size, Zhang's papers distorted any meta-analysis they were included in.

Aaron Charlton cited another case, where serious anomalies were picked up in a study on marketing in the course of a meta-analysis. The paper was ultimately retracted three years after the concerns were raised, following defensive responses from some of the authors, who challenged the meta-analysts.

This case, flagged by Neil O'Connell, is especially useful, as it documents a range of methods used to evaluate suspect research. The dodgy work was first flagged up in a meta-analysis of cognitive behaviour therapy for chronic pain. Three papers with the same lead author, M. Monticone, obtained results that were discrepant with the rest of the literature, with much bigger effect sizes. The meta-analysts then looked at other trials by the same team and found a six-fold difference between the lower confidence limit of the Monticone studies and the upper confidence limit of all the others combined. The paper also reports email exchanges with Dr Monticone that may be of interest to readers.

Poor methodology 

Fiona Ramage told me that in the course of doing a preclinical systematic review and meta-analysis of nutritional neuroscience, she encountered numerous errors of basic methodology and statistics, e.g. dozens of papers where error bars were presented without indicating whether they show the SE or the SD, and studies claiming differences between groups without a direct statistical comparison. This is more likely to be due to ignorance or honest error than to malpractice, but it needs to be flagged up so that the literature is not polluted by erroneous data.

What are the consequences?

Of course, the potential of systematic reviews to detect bad science is only realised if the dodgy papers are indeed weeded out of the literature, and people who commit scientific fraud are fired. Journals and publishers have started to respond to paper mills, but, as Ivan Oransky has commented, this is a game of Whac-a-Mole, and "the process of retracting a paper remains comically clumsy, slow and opaque”. 

I was surprised that even when confronted with an obvious case of a paper that had both numerous tortured phrases and plagiarism, the response from the publisher was slow – e.g. this comically worded example is still not retracted, even though the publisher's research integrity office acknowledged my email expressing concern over 2 months ago. But 2 months is nothing. Guillaume Cabanac recently tweeted about a "barn door" case of plagiarism that has just been retracted 20 years after it was first flagged up. When I discuss the slow responses to concerns with publishers, they invariably say that they are being kept very busy with a huge volume of material from paper mills. To which I answer: you are making immense profits, so perhaps some of them could be channelled into employing more people to tackle this problem. As I am fond of pointing out, I regard a publisher who leaves seriously problematic studies in the literature as analogous to a restaurateur who serves poisoned food to customers.

Publishers may be responsible for correcting the scientific record, but it is institutional employers who need to deal with those who commit malpractice. Many institutions don't seem to take fraud seriously. This point was made back in 2006 by Iain Chalmers, who described the lenient treatment of Asim Kurjak, and argued for public naming and shaming of those who are found guilty of scientific misconduct. Unfortunately, there's not much evidence that his advice has been heeded. Consider this recent example of a director of a primate research lab who admitted fraud, but is still in post. (Here the fraud was highlighted by a whistleblower rather than a systematic review, but it illustrates the difficulty of tackling fraud when there are only minor consequences for fraudsters.)

Could a move towards "slow science" help? In the humanities, literary scholars pride themselves on “close reading” of texts. In science, we are often so focused on speed and concision that we tend to lose the ability to focus deeply on a text, especially if it is boring. The practice of doing a systematic review should in principle develop better skills in evaluating individual papers, and in so doing help cleanse the literature of papers that should never have been published in the first place. John Loadsman has suggested we should not only read papers carefully, but should recalibrate ourselves to have a very high “index of suspicion” rather than embracing the default assumption that everyone is honest.

P.S.

Many thanks to everyone who sent in examples. Sorry I could not include everything. Please feel free to add other examples or reactions in the Comments – these tend to get overwhelmed with adverts for penis enlargement or (ironically) essay mills, and so are moderated, but I do check them and relevant comments will eventually appear.

PPS. Florian Naudet sent a couple of relevant links that readers might enjoy: 

Fascinating article by Fanelli et al, who looked at how the inclusion of retracted papers affected meta-analyses: https://www.tandfonline.com/doi/full/10.1080/08989621.2021.1947810

And this piece by Lawrence et al shows the dangers of meta-analyses when there is insufficient scrutiny of the papers that are included: https://www.nature.com/articles/s41591-021-01535-y  

Also, Joseph Lee tweeted about this paper on the inclusion of papers from predatory publications in meta-analyses: https://jmla.pitt.edu/ojs/jmla/article/view/491

PPPS. 11th August 2022

A couple of days after posting this, I received a copy of "Systematic Reviews in Health Research" edited by Egger, Higgins and Davey Smith. Needless to say, the first thing I did was to look up "fraud" in the index. Although there are only a couple of pages on this, the examples are striking. 

First, a study by Nowbar et al (2014) on bone marrow stem cells for heart disease identified over 600 discrepancies in a review of 133 reports, and the number of discrepancies increased with the reported effect size. There's a trail of comments on PubPeer relating to some of the sources, e.g. https://pubpeer.com/publications/B346354468C121A468D30FDA0E295E.

Another example concerns the use of beta-blockers during surgery. A series of studies from one centre (the DECREASE trials) showing good evidence of effectiveness was investigated and found to be inadequate, with missing data and failure to follow research protocols. When these studies were omitted from a meta-analysis, the conclusion was that, far from receiving benefit from beta-blockers, patients in the treatment group were more likely to die (Bouri et al, 2014). 

PPPPS. 18th August 2022

This comment by Jennifer Byrne was blocked by Blogger - possibly because it contained weblinks.

Anyhow, here is what she said:

I agree, reading both widely and deeply can help to identify problematic papers, and an ideal time for this to happen is when authors are writing either narrative or systematic reviews. Here's another two examples where Prof Carlo Galli and colleagues identified similar papers that may have been based on templates: https://www.mdpi.com/2304-6775/7/4/67, https://link.springer.com/article/10.1007/s11192-022-04434-2 

Wednesday, 3 August 2022

Contagion of the political system

Citizens of the UK have in recent weeks watched in amazement as the current candidates for leadership of the Conservative party debate their policies. Whoever wins will replace Boris Johnson as Prime Minister, with the decision made by a few thousand members of the Conservative Party. All options were bad, and we are now down to the last two: Liz Truss and Rishi Sunak.

For those of us who are not Conservatives, and for many who are, there was immense joy at the ousting of Boris Johnson. The man seemed like a limpet, impossible to dislodge. Every week brought a new scandal that would have been more than sufficient to lead to resignation 10 years ago, yet he hung on and on. Many people thought that, after a vote of no confidence in his leadership, he would step down so that a caretaker PM could run the country while the debate over his successor took place, but the limpet is still clinging on. He’s not doing much running of the country, but that’s normal, and perhaps for the best. He’s much better at running parties than leading the Conservative party.

I have to say I had not expected much from Truss and Sunak, but even my low expectations have not been met. The country is facing immense challenges, from climate change, from coronavirus, and from escalating energy prices. These are barely mentioned: instead the focus is on reducing taxes, with the candidates now competing over just how much tax they can cut. As far as I can see, these policies will do nothing to help the poorest in society, whose benefits will shrink to pay for the tax cuts; the richer you are, the more tax you pay, so cutting taxes is a rich person's policy.

What has surprised me is just how ill-informed the two candidates are. The strategy seems to be to pick a niche topic of interest to Conservative voters, make up a new policy overnight and announce it the next day. So we have Rishi Sunak proposing that the solution to the crisis in the NHS is to charge people who miss doctor’s appointments. Has he thought this through? Think of the paperwork. Think of the debt collectors tasked with collecting £10 from a person with dementia. Think of the cost of all of this.  And on Education, his idea is to reintroduce selective (grammar) schools: presumably because he thinks that our regular schools are inadequate to educate intelligent children.

On Education, Liz Truss is even worse. Her idea is that all children who score top marks in their final year school examinations should get an interview to go to Oxford or Cambridge University. This is such a crazy idea that others have written at length to point out its flaws (e.g. this cogent analysis by Sam Freedman). Suffice it to say that it has a similar vibe to the Sunak grammar schools plan: it implies that only two British universities have any value. Conservatives do seem obsessed with creating divisions between haves and have-nots, but only if they can ensure their children are among the haves.

Another confused statement from Truss is that, as far as Scotland goes, she plans to ignore Nicola Sturgeon, the First Minister of Scotland and leader of the Scottish National Party. At a time when relationships between Scotland and England are particularly fraught, this insensitive statement is reminiscent of the gaffes of Boris Johnson.

Oh, and yesterday she also announced – and then quickly U-turned on – a plan to limit the pay of public sector workers in the North of England, on the grounds that it is cheaper to live there.

What I find so odd about both Sunak and Truss is that they keep scoring own goals. Nobody requires them to keep coming up with new policies in niche areas.  Why don’t they just hold on to their original positions, and if asked about anything else, just agree to ‘look at’ it when in power? Johnson was always very good at promising to ‘look at’ things: when he’s not being a limpet, he’s a basilisk. The more you probe Sunak and Truss, the more their shallowness and lack of expertise show through. They’d do well to keep schtum. Or, better still, show some indication that they could, for instance, get a grip on the crisis in the NHS.

What all this demonstrates is how an incompetent and self-promoting leader causes damage far beyond their own term. Johnson’s cabinet was selected purely on one criterion: loyalty to him. The first requirement was to “believe in Brexit” – reminiscent of the historical wars between Protestants and Catholics, where the first thing you ask of a candidate is what their religion is. Among Conservative politicians, it seems that an accusation of not really being a Brexiteer is the worst thing you can say about a candidate. Indeed, that is exactly the charge that her opponents level against Truss, who made cogent arguments for remaining in the EU before the referendum. Like a Protestant required to recant their beliefs or face the flames, she is now reduced to defending Brexit in the strongest possible terms, saying that “predictions of doom have not come true”, as farmers, fishermen, and exporters go out of business, academics leave in droves, and holidaymakers sit in queues at Dover.

It's known that Johnson does not want to give up the top job. I’m starting to wonder if behind all of this is a cunning plan. The people he’s appointed to cabinet are so incompetent that maybe he hopes that, when confronted with a choice between them, the Conservative Party will decide that he looks better than either of them.