Saturday, 11 July 2015

Publishing replication failures: some lessons from history

I recently travelled to Lismore, Ireland, to speak at the annual Robert Boyle summer school. I had been intrigued by the invitation, as it was clear this was not the usual kind of scientific meeting. The theme of Robert Boyle, who was born in Lismore Castle, was approached from very different angles, and those attending included historians of science, scientists and journalists, as well as interested members of the public. We were treated to reconstructions of some of Boyle's livelier experiments, heard wonderful Irish music, and celebrated the installation of a plaque at Lismore Castle to honour Katherine Jones, Boyle's remarkable sister, who was also a scientist.

My talk was on the future of scientific scholarly publication, a topic that the Royal Society had explored in a series of meetings to celebrate the 350th Anniversary of the publication of Philosophical Transactions. I'm particularly interested in the extent to which current publishing culture discourages good science, and I concluded by proposing the kind of model that I recently blogged about, where the traditional science journal is no longer relevant to communicating science.

What I hadn't anticipated was the relevance of some of Boyle's writing to such contemporary themes.

Boyle, of course, didn't have to grapple with issues such as the Journal Impact Factor or Open Access payments. But some of the topics he covered are remarkably contemporary. He would have been interested in the views of Jason Mitchell, John L. Loeb Associate Professor of the Social Sciences at Harvard, who created a stir last year by writing a piece entitled "On the emptiness of failed replications". I see that the essay has now been removed from the Harvard website, but the main points can be found here*. It was initially thought to be a parody, but it seems to have been a sincere attempt at defending the thesis that "unsuccessful experiments have no meaningful scientific value." Furthermore, according to Mitchell, "Whether they mean to or not, authors and editors of failed replications are publicly impugning the scientific integrity of their colleagues." I have taken issue with this standpoint in an earlier blogpost; my view is that we should not assume that a failure to replicate a result is due to fraud or malpractice, but rather should encourage replication attempts as a means of establishing which results are reproducible.

I am most grateful to Eoin Gill of Calmast for pointing me to Robert Boyle's writings on this topic, and for sending me transcripts of the most relevant bits. Boyle has two essays on "the Unsuccessfulness of Experiments" in a collection of papers entitled “Certain Physiological Essays and other Tracts”. In these he discusses (at inordinate length!) the problems that arise when an experimental result fails to replicate. He starts by noting that such unsuccessful experiments are not uncommon:
… in the serious and effectual prosecution of Experimental Philosophy, I must add one discouragement more, which will perhaps as much surprize you as dishearten you; and it is, That besides that you will find …… many of the Experiments publish'd by Authors, or related to you by the persons you converse with, false or unsuccessful, … you will meet with several Observations and Experiments, which though communicated for true by Candid Authors or undistrusted Eye-witnesses, or perhaps recommended to you by your own experience, may upon further tryal disappoint your expectation, either not at all succeeding constantly, or at least varying much from what you expected. (opening passage)
He is interested in exploring the reasons for such failure; his first explanation seems equivalent to one that those using statistical analyses are all too familiar with – a chance false positive result.
And that if you should have the luck to make an Experiment once, without being able to perform the same thing again, you might be apt to look upon such disappointments as the effects of an unfriendliness in Nature or Fortune to your particular attempts, as proceed but from a secret contingency incident to some experiments, by whomsoever they be tryed. (p. 44)
And he urges the reader not to be discouraged – replication failures happen to everyone!
…. though some of your Experiments should not always prove constant, you have divers Partners in that infelicity, who have not been discouraged by it. (p. 44)
He identifies various possible systematic reasons for such failure: a problem with skill of the experimenter, with purity of ingredients, or variation in the specific context in which the experiment is conducted. He even, implicitly, addresses statistical power, noting how one needs many observations to distinguish what is general from individual variation.
…the great variety in the number, magnitude, position, figure, &c. of the parts taken notice of by Anatomical Writers in their dissections of that one Subject the humane body, about which many errors would have been delivered by Anatomists, if the frequency of dissections had not enabled them to discern betwixt those things that are generally and uniformly found in dissected bodies, and those which are but rarely, and (if I may so speak) through some wantonness or other deviation of Nature, to be met with. (p. 94)
Because of such uncertainties, Boyle emphasises the need for replication, and the dangers of building complex theory on the basis of a single experiment:
….try those Experiments very carefully, and more than once, upon which you mean to build considerable Superstructures either theorical or practical, and to think it unsafe to rely too much upon single Experiments, especially when you have to deal in Minerals: for many to their ruine have found, that what they at first look'd upon as a happy Mineral Experiment has prov'd in the issue the most unfortunate they ever made. (p. 106)
I'm sure there are some modern scientists who must be thinking their lives would have been much easier if they had heeded this advice. But perhaps most relevant to the modern world, where there is such concern about the consequences of failure to replicate, are Boyle's comments on the reputational impact of publishing irreproducible results:
…if an Author that is wont to deliver things upon his own knowledge, and shews himself careful not to be deceived, and unwilling to deceive his Readers, shall deliver any thing as having try'd or seen it, which yet agrees not with our tryals of it; I think it but a piece of Equity, becoming both a Christian and a Philosopher, to think (unless we have some manifest reason to the contrary) that he set down his Experiment or Observation as he made it, though for some latent reason it does not constantly hold; and that therefore though his Experiment be not to be rely'd upon, yet his sincerity is not to be rejected. Nay, if the Author be such an one as has intentionally and really deserved well of Mankind, for my part I can be so grateful to him, as not only to forbear to distrust his Veracity, as if he had not done or seen what he says he did or saw, but to forbear to reject his Experiments, till I have tryed whether or no by some change of Circumstances they may not be brought to succeed. (p. 107)
The importance of fostering a 'no blame' culture was one theme that emerged in a recent meeting on Reproducibility and Reliability of Biomedical Research at the Academy of Medical Sciences. It seems that in this, as in so many other aspects of science, Boyle's views are well-suited to the 21st century.

For more on Robert Boyle, see here

12th July 2015: Thanks to Daniël Lakens who pointed me to the Wayback machine, where earlier versions of the article can be found:*/