Thursday, 21 March 2013
Blogging as post-publication peer review: reasonable or unfair?
In a previous blogpost, I criticised a recent paper claiming that playing action video games improved reading in dyslexics. In a series of comments below the blogpost, two of the authors, Andrea Facoetti and Simone Gori, have responded to my criticisms. I thank them for taking the trouble to spell out their views and giving readers the opportunity to see another point of view. I am, however, not persuaded by their arguments, which make two main points. First, that their study was not methodologically weak and so Current Biology was right to publish it, and second, that it is unfair, and indeed unethical, to criticise a scientific paper in a blog, rather than through the regular scientific channels.
Regarding the study methodology, as I noted in that post, the principal problem with the study by Franceschini et al. was that it was underpowered, with just 10 participants per group. The authors reply with an argumentum ad populum, i.e. many other studies have used equally small samples. This is undoubtedly true, but it doesn’t make it right. They dismiss the paper I cited by Christley (2010) on the grounds that it was published in a low-impact journal. But the serious drawbacks of underpowered studies have been known about for years, and written about in high- as well as low-impact journals (see references below).
The response by Facoetti and Gori illustrates the problem I had highlighted. In effect, they are saying that we should believe their result because it appeared in a high-impact journal, and now that it is published, the onus must be on other people to demonstrate that it is wrong. I can appreciate that it must be deeply irritating for them to have me expressing doubt about the replicability of their result, given that their paper passed peer review in a major journal and the results reach conventional levels of statistical significance. But in the field of clinical trials, the non-replicability of large initial effects from small trials has been demonstrated on numerous occasions, using empirical data - see in particular the work of Ioannidis, referenced below. The reasons for this ‘winner’s curse’ have been much discussed, but its reality is not in doubt. This is why I maintain that the paper would not have been published if it had been reviewed by scientists who had expertise in clinical trials methodology. They would have demanded more evidence than this.
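For readers who would like to see the ‘winner’s curse’ in action, here is a minimal simulation sketch. It is not a reanalysis of the Franceschini et al. data: the true effect of half a standard deviation, the normally distributed scores, the simple two-group t-test and the function names are all assumptions I have chosen purely for illustration. It runs many imaginary trials with 10 participants per group and looks only at those that reach significance in the predicted direction.

```python
import random
import statistics

# Two-tailed 5% critical value of t with 18 degrees of freedom (10 + 10 - 2).
T_CRIT_DF18 = 2.101


def simulate_trial(rng, n_per_group, true_effect):
    """Simulate one two-group trial; return (observed difference, t statistic).

    Scores are drawn from normal distributions with SD 1, so the true
    difference between groups equals true_effect (an illustrative assumption).
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    diff = statistics.mean(treated) - statistics.mean(control)
    # Pooled variance for two groups of equal size.
    pooled_var = (statistics.variance(treated) + statistics.variance(control)) / 2
    std_err = (pooled_var * 2 / n_per_group) ** 0.5
    return diff, diff / std_err


def winners_curse(n_trials=10000, n_per_group=10, true_effect=0.5, seed=1):
    """Return (proportion of significant trials, mean difference among them)."""
    rng = random.Random(seed)
    significant = []
    for _ in range(n_trials):
        diff, t = simulate_trial(rng, n_per_group, true_effect)
        if t > T_CRIT_DF18:  # significant, and in the predicted direction
            significant.append(diff)
    power = len(significant) / n_trials
    return power, statistics.mean(significant)


if __name__ == "__main__":
    power, mean_sig = winners_curse()
    print(f"Proportion of trials reaching significance: {power:.2f}")
    print(f"True difference: 0.50; mean difference in significant trials: {mean_sig:.2f}")
```

Under these assumptions only a minority of the simulated trials reach significance, and those that do overestimate the true effect, because with so few participants only an unusually large observed difference can clear the significance threshold. Increasing `n_per_group` shrinks the exaggeration.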
The response by the authors highlights another issue: now that the paper has been published, the expectation is that anyone who has doubts, such as me, should be responsible for checking the veracity of the findings. As we say in Britain, I should put up or shut up. Indeed, I could try to get a research grant to do a further study. However, I would probably not be allowed by my local ethics committee to do one on such a small sample and it might take a year or so to do, and would distract me from my other research. Given that I have reservations about the likelihood of a positive result, this is not an attractive option. My view is that journal editors should have recognised this as a pilot study and asked the authors to do a more extensive replication, rather than dashing into print on the basis of such slender evidence. In publishing this study, Current Biology has created a situation where other scientists must now spend time and resources to establish whether the results hold up.
To see just how damaging this can be, consider the case of the Fast ForWord intervention, developed on the basis of a small trial initially reported in Science in 1996. After the Science paper, the authors went directly into commercialization of the intervention, and reported only uncontrolled trials. It took until 2010 for there to be enough reasonably-sized independent randomized controlled trials to evaluate the intervention properly in a meta-analysis, at which point it was concluded that it had no beneficial effect. By this time, tens of thousands of children had been through the intervention, and hundreds of thousands of research dollars had been spent on studies evaluating Fast ForWord.
I appreciate that those reporting exciting findings from small trials are motivated by the best of intentions – to tell the world about something that seems to help children. But the reality is that, if the initial trial is not adequately powered, it can be detrimental both to science and to the children it is designed to help, by giving such an imprecise and uncertain estimate of the effectiveness of treatment.
Finally, a comment on whether it is fair to comment on a research article in a blog, rather than going through the usual procedure of submitting an article to a journal and having it peer-reviewed prior to publication. The authors’ reactions to my blogpost are reminiscent of Felicia Wolfe-Simon’s response to blog-based criticisms of a paper she published in Science: “The items you are presenting do not represent the proper way to engage in a scientific discourse”. Unlike Wolfe-Simon, who simply refused to engage with bloggers, Facoetti and Gori show willingness to discuss matters further and to present their side of the story, but it is nevertheless clear that they do not regard a blog as an appropriate place to debate scientific studies.
I could not disagree more. As was readily demonstrated in the Wolfe-Simon case, what has come to be known as ‘post-publication peer review’ via the blogosphere can allow for new research to be rapidly discussed and debated in a way that would be quite impossible via traditional journal publishing. In addition, it brings the debate to the attention of a much wider readership. Facoetti and Gori feel I have picked on them unfairly: in fact, I found out about their paper because I was asked for my opinion by practitioners who worked with dyslexic children. They felt the results from the Current Biology study sounded too good to be true, but they could not access the paper from behind its paywall, and in any case they felt unable to evaluate it properly. I don’t enjoy criticising colleagues, but I feel that it is entirely proper for me to put my opinion out in the public domain, so that this broader readership can hear a different perspective from those put out in the press releases. And the value of blogging is that it does allow for immediate reaction, both positive and negative. I don’t censor comments, provided they are polite and on-topic, so my readers have the opportunity to read the reaction of Facoetti and Gori.
I should emphasise that I do not have any axe to grind with the study's authors, whom I do not know personally. I’d be happy to revise my opinion if convincing arguments are put forward, but I think it is important that this discussion takes place in the public domain, because the issues it raises go well beyond this specific study.
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafo, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, advance online publication. doi: 10.1038/nrn3475
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124
Ioannidis, J. P. (2008). Why most discovered true associations are inflated. Epidemiology, 19(5), 640-648.
Ioannidis, J. P., Pereira, T. V., & Horwitz, R. I. (2013). Emergence of large treatment effects from small trials--reply. JAMA: The Journal of the American Medical Association, 309(8), 768-769. PMID: 23443435