In
a previous blogpost, I criticised a recent paper claiming that playing action
video games improved reading in dyslexics. In a series of comments below the
blogpost, two of the authors, Andrea Facoetti and Simone Gori, have responded
to my criticisms. I thank them for taking the trouble to spell out their views
and giving readers the opportunity to see another point of view. I am, however,
not persuaded by their arguments, which make two main points. First, that their
study was not methodologically weak and so Current Biology was right to publish
it, and second, that it is unfair, and indeed unethical, to criticise a
scientific paper in a blog, rather than through the regular scientific
channels.
Regarding the study methodology, as I argued in my original post, the principal problem with the study by Franceschini et al. was that it was underpowered, with just 10 participants per group. The authors reply with an argument ad populum, i.e. many other studies have used equally small samples. This is undoubtedly true, but it doesn’t make it right. They dismiss the paper I cited by Christley (2010) on the grounds that it was published in a low-impact journal. But the serious drawbacks of underpowered studies have been recognised for years, and written about in high- as well as low-impact journals (see references below).
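To put some numbers on this, here is a rough simulation sketch in Python (the effect size is my own illustrative assumption: a true benefit of half a standard deviation, which would be a perfectly respectable effect for a reading intervention). It shows how little chance a study with 10 children per group has of detecting a real effect of that size.

import numpy as np
from scipy import stats

# Rough power simulation for a two-group comparison with n = 10 per group.
# Illustrative assumptions: true effect d = 0.5, alpha = .05, two-tailed
# independent-samples t-test.
rng = np.random.default_rng(42)
n_per_group, true_d, n_sims = 10, 0.5, 20000

significant = 0
for _ in range(n_sims):
    control = rng.normal(0.0, 1.0, n_per_group)
    trained = rng.normal(true_d, 1.0, n_per_group)  # true benefit of 0.5 SD
    _, p = stats.ttest_ind(trained, control)
    significant += p < 0.05

print(f"Estimated power: {significant / n_sims:.2f}")  # roughly 0.18

In other words, even if the training really did work, a study of this size would fail to detect the effect more than four times out of five.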
The response by Facoetti
and Gori illustrates the problem I had highlighted. In effect, they are saying
that we should believe their result because it appeared in a high-impact
journal, and now that it is published, the onus must be on other people to
demonstrate that it is wrong. I can appreciate that it must be deeply
irritating for them to have me expressing doubt about the replicability of
their result, given that their paper passed peer review in a major journal and
the results reach conventional levels of statistical significance. But in the
field of clinical trials, the non-replicability of large initial effects from
small trials has been demonstrated on numerous occasions using empirical data: see in particular the work of Ioannidis, referenced below. The reasons for
this ‘winner’s curse’ have been much discussed, but its reality is not in
doubt. This is why I maintain that the paper would not have been published if
it had been reviewed by scientists who had expertise in clinical trials
methodology. They would have demanded more evidence than this.
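The logic of the winner’s curse is easy to demonstrate. Extending the sketch above (again with made-up but plausible numbers: a true effect of 0.5 SD and 10 children per group), one can look only at the simulated studies that happen to reach p < .05, i.e. the ones that would get written up as positive findings, and ask how big their effects appear to be.

import numpy as np
from scipy import stats

# Winner's curse sketch: among small studies that reach significance, the
# observed effect size systematically overestimates the true effect.
# Illustrative assumptions: true effect d = 0.5, n = 10 per group.
rng = np.random.default_rng(1)
n, true_d, n_sims = 10, 0.5, 20000
significant_effects = []

for _ in range(n_sims):
    control = rng.normal(0.0, 1.0, n)
    trained = rng.normal(true_d, 1.0, n)
    _, p = stats.ttest_ind(trained, control)
    if p < 0.05:
        pooled_sd = np.sqrt((control.var(ddof=1) + trained.var(ddof=1)) / 2)
        significant_effects.append((trained.mean() - control.mean()) / pooled_sd)

print(f"True effect: d = {true_d}")
print(f"Mean observed effect in the 'significant' studies: "
      f"d = {np.mean(significant_effects):.2f}")  # typically above 1.0

The studies that cross the significance threshold report, on average, an effect more than twice the size of the true one. That, in a nutshell, is why large effects from small trials so often shrink or vanish on replication.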
The response by the
authors highlights another issue: now that the paper has been published, the
expectation is that anyone who has doubts, such as me, should be responsible
for checking the veracity of the findings. As we say in Britain, I should put
up or shut up. Indeed, I could try to get a research grant to do a further
study. However, my local ethics committee would probably not allow me to run one on such a small sample; it would take a year or so to complete, and it would distract me from my other research. Given that I have reservations about the
likelihood of a positive result, this is not an attractive option. My view is
that journal editors should have recognised this as a pilot study and asked the
authors to do a more extensive replication, rather than dashing into print on
the basis of such slender evidence. In publishing this study, Current Biology
has created a situation where other scientists must now spend time and
resources to establish whether the results hold up.
To illustrate just how damaging this can be, consider the case of the Fast ForWord intervention, developed on the basis of a small trial initially reported in Science in 1996. After the Science paper, the authors went directly into commercialisation of the intervention, and reported only uncontrolled trials. It took until 2010 for there to be enough reasonably sized independent randomised controlled trials to evaluate the intervention properly in a meta-analysis, at which point it was concluded that it had no beneficial effect. By this time, tens of thousands of children had been through the intervention, and hundreds of thousands of research dollars had been spent on studies evaluating Fast ForWord.
I appreciate that those
reporting exciting findings from small trials are motivated by the best of
intentions – to tell the world about something that seems to help children. But
the reality is that, if the initial trial is not adequately powered, it can be
detrimental both to science and to the children it is designed to help, by
giving an imprecise and uncertain estimate of the effectiveness of the treatment.
Finally, a word on whether it is fair to criticise a research article in a blog, rather than going through the usual procedure of submitting an article to a journal and having it peer-reviewed prior to publication. The authors’ reactions to my blogpost are reminiscent of Felicia Wolfe-Simon’s response to blog-based criticisms of a paper she published in Science: “The items you are presenting do not represent the proper way to engage in a scientific discourse”. Unlike Wolfe-Simon, who simply refused to engage with bloggers, Facoetti and Gori show willingness to discuss matters further and present their side of the story, but it is nevertheless clear that they do not regard a blog as an appropriate place to debate scientific studies.
I could not disagree
more. As was readily demonstrated in the Wolfe-Simon case, what has come to be
known as ‘post-publication peer review’ via the blogosphere can allow for new
research to be rapidly discussed and debated in a way that would be quite
impossible via traditional journal publishing. In addition, it brings the
debate to the attention of a much wider readership. Facoetti and Gori feel I
have picked on them unfairly: in fact, I found out about their paper because I
was asked for my opinion by practitioners who worked with dyslexic children.
They felt the results from the Current Biology study sounded too good to be
true, but they could not access the paper from behind its paywall, and in any
case they felt unable to evaluate it properly. I don’t enjoy criticising
colleagues, but I feel that it is entirely proper for me to put my opinion out
in the public domain, so that this broader readership can hear a perspective different from the one put out in the press releases. And the value of blogging is that it allows for immediate reaction, both positive and negative. I don’t censor comments, provided they are polite and on-topic, so my readers have the opportunity to read the response of Facoetti and Gori.
I should emphasise that I
do not have any personal axe to grind with the study's authors, whom I do not know personally. I’d be happy to revise my opinion if convincing arguments are
put forward, but I think it is important that this discussion takes place in
the public domain, because the issues it raises go well beyond this specific
study.
References
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, advance online publication. doi: 10.1038/nrn3475

Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124

Ioannidis, J. P. (2008). Why most discovered true associations are inflated. Epidemiology, 19(5), 640-648.

Ioannidis, J. P., Pereira, T. V., & Horwitz, R. I. (2013). Emergence of large treatment effects from small trials: reply. JAMA, 309(8), 768-769. PMID: 23443435