Thursday, 19 January 2012

Novelty, interest and replicability


So at last, your paper is written. It represents the culmination of many years’ work. You think it is an important advance for the field. You write it up. You carefully format it for your favoured journal. You grapple with the journal’s portal, tracking down details of recommended reviewers and then sit back. You anticipate a delay of a few weeks before you get reviewer comments. But, no. What’s this? A decision letter within a week: “Unfortunately we receive many more papers than we can publish or indeed review and must make difficult decisions on the basis of novelty and general interest as well as technical correctness.” It’s the publishing equivalent of the grim reaper: a reject without review.

It happens increasingly often, especially if you send work to journals with high impact factors. I’ve been an editor and I know there are difficult decisions to make. It can be kinder to an author to reject immediately if you sense that the paper isn’t going to make it through the review process. One thing you learn as an author is that there’s no point protesting or moaning. You just try again with another journal. I’m confident our paper is important and will get published, and there’s no reason for me to single this journal out for complaint. But this experience has made me reflect more generally on factors affecting publication, and I do think there are things about the system that are problematic.

So, using this blog as my soapbox, there are two points I’d like to make: A little one and a big one. Let’s get the little one out of the way first. It’s simply this: if a journal commonly rejects papers without review, then it shouldn’t be fussy about the format in which a paper is submitted. It’s just silly for busy people to spend time getting the references correctly punctuated, or converting their figures to a specific format, if there’s a strong probability that their paper will be bounced. Let the formatting issues be addressed after the first round of review.
The second point concerns the criteria of “novelty and general interest”. My guess is that our paper was triaged on the novelty criterion because it involved replication. We reported a study that involved measuring electrical brain responses to sounds. We compared these responses in children with developmental language impairments and typically-developing children. The rationale is explained in a blogpost I wrote for the Wellcome Trust.
We’re not the first people to do this kind of research. There have been a few previous studies, but it’s a fair summary to say the literature is messy. I reviewed part of it a few years back and I was shocked at how bad things were. It was virtually impossible to draw any general conclusions from 26 studies. Now these studies are really hard to do. Just recruiting people is difficult and it can take months if not years to get an adequate sample. Then there is the data analysis which is not for the innumerate or faint-hearted. So a huge amount of time and money had gone into these studies, but we didn’t seem to be progressing very far. The reason was simple: you couldn’t generalise because nobody ever attempted to replicate previous research. The studies were focussed on the same big questions, but they differed in important ways. So if they got different results, you couldn’t tell why.
In response to this, part of my research strategy has been to take those studies that look the strongest and attempt to replicate them. So when we found strikingly similar results to a study by Shafer et al (2010) I was excited. The fact that two independent labs on different sides of the world had obtained virtually the same result gave me confidence in the findings. I was able to build on this result to do some novel analyses that helped establish direction of causal influences, and felt that at last we were getting somewhere. But my excitement was clearly not shared by the journal editor, who no doubt felt our findings were not sufficiently novel. I wasn’t particularly surprised by this decision, as this is the way things work. But is the focus on novelty good for science?
The problem is that unless novel findings are replicated, we don’t know which results are solid and reliable. We ought to know: we apply statistical methods with the sole goal of establishing this. But in practice, statistics are seldom used appropriately. People generate complex datasets and then explore different ways of analysing data to find statistically significant results. In electrophysiological studies, there are numerous alternative ways in which data can be analysed, by examining different peaks in a waveform, different methods of identifying peaks, different electrodes, different time windows, and so on. If you do this, it is all too easy for “false positives” to be mistaken for genuine effects (Simmons, Nelson, & Simonsohn, 2011). And the problem is compounded by the “file drawer problem” whereby people don’t publish null results. Such considerations led Ioannidis (2005) to conclude that most published research findings are false.
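To put rough numbers on how quickly false positives pile up when you try several analyses, here is a minimal sketch in Python. It is an illustration only and rests on a strong simplifying assumption: every analysis is treated as an independent test at the same threshold with no true effect present. Real electrophysiological analyses (neighbouring electrodes, overlapping time windows) are correlated, so the actual inflation will differ. The function names `familywise_error_rate` and `simulate_false_positive_rate` are made up for this example.

```python
import random


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent
    tests, each run at significance threshold alpha, when no effect exists."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_false_positive_rate(alpha: float, n_tests: int,
                                 n_studies: int = 20000, seed: int = 0) -> float:
    """Monte Carlo check of the formula above.

    Assumption: under the null hypothesis each p-value is uniform on [0, 1],
    and the n_tests analyses within a study are independent.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(n_studies):
        # A "study" counts as a false positive if any analysis reaches p < alpha.
        if any(rng.random() < alpha for _ in range(n_tests)):
            studies_with_a_hit += 1
    return studies_with_a_hit / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        analytic = familywise_error_rate(0.05, k)
        simulated = simulate_false_positive_rate(0.05, k)
        print(f"{k:2d} analyses: analytic {analytic:.3f}, simulated {simulated:.3f}")
```

Under these assumptions, ten independent analyses at the conventional 0.05 threshold give roughly a 40% chance of at least one "significant" result when nothing is going on, which is why an independent replication is so much more informative than one more novel finding.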
This is well-recognised in the field of genetics, where it became apparent that most early studies linking genetic variants to phenotypes were spurious (see Flint et al). The reaction, reflected in a recent editorial in Behavior Genetics, has been to insist that authors replicate findings of associations between genes and behaviour. So if you want to say something novel, you have to demonstrate the effect in two independent samples.
This is all well and good, but requiring that authors replicate their results is unrealistic in a field where a study takes several years to complete, or involves a rare disorder. You can, however, create an expectation that researchers include a replication of prior work when designing a study, and/or use existing research to generate a priori predictions about expected effects.
It wouldn’t be good for science if journals only published boring replications of things we already knew. Once a finding is established as reliable, then there’s no point in repeating the study. But something that has been demonstrated at least twice in independent samples (replicable) is far more important to science than something that has never been shown before (novel), because the latter is likely to be spurious. I see this as a massive challenge for psychology and neuroscience.
In short, my view is that top journals should reverse their priorities and treat replicability as more important than novelty.
Unfortunately, most scientists don’t bother to attempt replications because they know the work will be hard to publish. We will only reverse that perception if journal editors begin to put emphasis on replicability.
A few individuals are speaking out on this topic. I recommend a blogpost by Brian Knutson who argued, “Replication should be celebrated rather than denigrated.” He suggested that we need a replicability index to complement the H-index. If scientists were rewarded for doing studies that others can replicate, we might see a very different rank ordering of research stars.
I leave the last word to Kent Anderson: “Perhaps we’re measuring the wrong things … Perhaps we should measure how many results have been replicated. Without that, we are pursuing a cacophony of claims, not cultivating a world of harmonious truths.”


Simmons, J., Nelson, L., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359-1366. DOI: 10.1177/0956797611417632

16 comments:

  1. "If a journal commonly rejects papers without review, then it shouldn’t be fussy about the format in which a paper is submitted. It’s just silly for busy people to spend time getting the references correctly punctuated, or converting their figures to a specific format, if there’s a strong probability that their paper will be bounced. Let the formatting issues be addressed after the first round of review."

    Yes, yes, a thousand times yes! It is absolutely insane the amount of time that highly trained academics waste on petty formatting issues. I wrote about this a while ago and it still seems unbelievably stupid that when preparing my research output for the world, such a huge proportion of the effort is dedicated to inserting and removing commas.

    "But my excitement was clearly not shared by the journal editor, who no doubt felt our findings were not sufficiently novel. I wasn’t particularly surprised by this decision, as this is the way things work. But is the focus on novelty good for science?"

    Heck, no. Again, my field (palaeontology) is flooded with this kind of thing. Someone does something novel, then that technique is never used by anyone again because everyone's off chasing after the next new technique. *fume*

  2. Sometimes moaning works:
    https://twitter.com/#!/noahWG/status/155033215797698560
    Noah works for Nature...

    This whole reformatting silliness is just yet another (of many) reasons why we should wean ourselves from journals altogether and instead implement a communication system that benefits science and scientists instead of the shareholders of corporate publishers.

    It would indeed be formidable if reproducibility could be implemented in a metric - but it won't happen with our current reputation system.

  3. Yes, another loud "Hear hear" to the reformatting issue. We've just had something accepted by the 4th journal it went to, after 3 'reject without reviews' (though they took notably longer than a week each), and hence 4 (count 'em) fiddling-into-required-format parties. The first journal even made us edit the Ms extensively for length (which took some doing) BEFORE they rejected it without review.

    And of course the multi-panel figures had to be re-edited into preferred formats too.

    I'm relieved to say that it was a younger colleague of mine, as corresponding author, who had to field the brunt of all this... but the amount of their time that went into it, and the thought of what more useful thing(s) they could have been doing with said time, was really infuriating.

  4. Charlie Wilson (@crewilson), 19 January 2012 16:43

    Wouldn't it be nice if the journals adopted a few more policies that went in this direction, and actually helped to support an approach to science where the important thing is to be right and replicable, not flashy. A couple spring to mind:

    First, it would be great if journals had a simple policy that they would automatically publish (i.e. in the same journal) a replication of a novel result they have previously published, if the methodology was found to be correct in review. In other words each issue would have a "replications" section with shorter articles (very short intro/discussion) that confirmed previous results. Straight away it seems like a good thing for the PLoS system to be doing.

    This would benefit the science, highlighting replicable results, and might also, for example, become a good starting point for new students - learning to carry out experiments, manipulate data etc through replicating a result from their chosen field, and getting a publication out of it, before moving onto the hallowed ground of the novel.

    Second, science, and neuroscience in particular, is increasingly based on enormous and complex data sets. Publications tend to contain analysis of only part of the data. Examples would be EEG, fMRI, and other types of neurophysiology. In many cases approaches to analysis are yet to be standardized, and frankly in many cases they are somewhat subjective. An idea might be to make publication of the raw data set a condition of publication of the paper, with the idea that others would have the right to apply their own analysis approaches in an attempt to confirm that the published result is robust to a range of different analytic techniques.

    I suppose this may be a little idealistic, and in fact lead to people trying to take down their rivals' results. But I'm pretty sure that eventually, if only because the vast cost of acquiring such data sets will mean that funders will demand it, these data will be shared in a routine fashion. I know that's what we'll be doing, if we ever finish collecting the damn stuff...

  5. How many journals really insist on formatting guidelines being followed at first submission? None I have worked for, certainly. The more usual procedure is for in-house editors to ignore formatting when assessing the first submission, and if they invite revision they then point out formatting issues that should be fixed in the revised version. Even then, things like reference formats are usually sorted out by copyeditors rather than expecting authors to do it (except for journals that don't copyedit, but even they do basic reformatting to their style, I think).

    I would love to hear examples of journals that have insisted on perfect formatting at first submission. I would recommend that authors not worry about that until a much later stage.

    I agree on your main point, but it will take a big culture shift among both editors and researchers to change things. This post will move things in the right direction. @crewilson's suggestion of a replications section is a great one.

  6. "How many journals really insist on formatting guidelines being followed at first submission? None I have worked for, certainly."

    The Journal of Vertebrate Paleontology is notoriously picky about formatting of submissions, and routinely sends manuscripts back if they do not exactly adhere to the guidelines. I know people who have simply given up submitting to that journal now because it isn't worth the grief it causes.

    "Even then, things like reference formats are usually sorted out by copyeditors rather than expecting authors to do it."

    That is not true in my field (palaeontology, if you didn't guess). Probably varies between fields?

  7. Regarding Anna Sharman's point, it is rather meaningless to say "editors at XYZ journal don't demand perfect formatting" if the same journal's public Instructions to Authors, prominently displayed on their website, have the long and detailed instructions and the usual line like "submissions must...".

    If a journal actually is more flexible... well, it may be that people who regularly publish in the particular journal, or who have been an editor of the journal they mostly publish in, get to know this... but us poor mug punters don't.

    To give a real example, as I stated above, three different high-impact medical research journals we sent our paper to all had such statements, each requiring re-formatting. One journal then asked us to re-cast our paper as a rapid communication (entailing a major re-write to their editorial rules) and then went on to reject it at editorial level again without full peer review. Thanks a bunch, we thought.

    It is also a fact that these kinds of detailed formatting instructions have all grown more onerous, not less, over the quarter-century I have been in the business. If journals and publishers don't really insist on them, why have they all put them in place?

  8. One journal that we have published in a few times now asks for formatting exactly like the finished thing. I have always done this (in the - perhaps mistaken - belief that it sends a subconscious signal to the referees that the paper is ready for publication!). But I have reviewed loads of times for this journal and many other authors don't do it at all. So it's clearly not enforced. I guess this is the difference between the "Instructions to Authors" that Dr Aust is talking about and the experience of Anna Sharman. Some authors don't bother and at some journals it's not rigorously enforced.

  9. Anna: here's a couple of examples. First, J Neuroscience (which charges for *submissions*) - you send off paper and get a bounce saying:
    "In checking in your manuscript submitted to Journal of Neuroscience it has come to our attention that the following must be addressed before we can begin the peer review process." - then a list of trivia
    Or Psychophysiology, who were a bit more gracious but said:
    "Psychophysiology has recently adopted the Sixth Edition of the APA Style Manual for manuscripts submitted to the journal. Your manuscript was submitted in the format from the Fifth Edition of the APA manual. Since this is a recent change, we will allow your manuscript to go out for review in its current format"

  10. It must be annoying--really annoying--to have to re-format a submission that is then rejected without review, but that isn't the main point of the original post. The main point is the systematic distortion of the scientific literature by the passive file-drawer process compounded by the active unwillingness of editors to publish replication studies.

    Replication is the foundation of science. Without replication studies, the literature will become a sort of Ripley's Believe-It-Or-Not freak show of oddities like shrunken heads, two-headed animals, and tattooed ladies. We're not there yet, are we?

  11. While I appreciate some of the points you make, and I hate to be Pollyanna, I think that overall the system still works. Yes, many of the sexy journals favor "novelty" over replication. But in many fields, truly novel and controversial results are either borne out by replication or left to the garbage heap of science, and I think committees can often differentiate between the two. For example, the 1983 Science article by Baldwin and Schultz, which claimed that trees communicate via volatile signals, was originally met with not a small amount of ridicule, but has since been recognized as a paradigm shift, and has been cited in follow-up articles that did more than replicate: they added new data and refinements, and many of these articles were also in high-impact journals. The same can be said of prions, microRNA, and any number of novel studies which were subsequently replicated and built on. Replication does not preclude high-impact publication. And reject without review, while annoying and often seemingly subjective, does in the end usually save time.

  12. We've created www.PsychFileDrawer.org to address the file-drawer problem. And if people post, it'll become a partial solution. Let us know how we might improve the site to encourage posting. Would you be willing to post your replication of Shafer et al?

  13. Alex: thanks for the FileDrawer idea, but this is one of the best things I've done for a while, and I plan to publish it in a journal.
    Geoff: yes yes yes - and thanks for steering discussion back to important stuff.
    Danny: no no no. At least, not in my field. We get a lot of sexy stuff that is almost certainly spurious, and since nobody tries to replicate it, you don't know what to believe.

  14. Sorry for anonymity (I do not have an account).
    I am a graduate student with experience of publishing several papers. I completely agree with both your points. Science cannot make progress without replication. Given the variability of factors in behavioral studies, the chances that a result was obtained by chance are not zero at all. In addition, I have a gut feeling that people too frequently get the result they want to get...
    It's noteworthy that supporting replication research too much may result in everyone doing only replications. So that's not good either.
    As a practical step, had I been the journal editor, I would have thought about a very structured letter-to-the-editor form (not the stories that we have today). Consider, for example, PLoS Biology:
    http://www.plosbiology.org/static/guidelines.action
    You have the chance to motivate your study, even if it is a replication. Clearly, more steps can be taken in this direction.

  15. Alex/Anonymous.
    Just to clarify: my study was not "just" a replication and was very well motivated. I went way beyond what the original study had done and, to my mind, the paper makes a considerable theoretical advance, with further analyses clarifying direction of causation. That's why I sent it to a top journal. With luck it'll get published before too long and the world will be able to see it!

  16. I have a minor and a major complaint; I'll start with the small fry. If you cannot produce error-free prose (spelling and grammar are correct), then you should not be authoring documents for others to read. If you find this tedious, well, you should know the rest of us find that absurd. A spell/grammar checker and a quick line edit will get 99%. You'll find if you do just this basic stuff you'll likely be forgiven for any other oversights. If you find the whole topic intimidating, go read the Elements of Style and keep it by your desk when writing. An investment of a few hours of reading will make a big difference. As for the rest, it's an indication of one's attention to detail to many of us, so even if that's incorrect in your case, you'd be wise to consider how your writing style influences the context of what you are saying, always and everywhere.

    On to the larger prey. The 'system' you describe is one in which there is a huge overproduction of "publishable" work being cherry-picked by editors to drive their readership up. The unmentioned factor in the article is that readership appetites are probably big drivers of this; perhaps there are some feedbacks, maybe some positive ones, but we are left at the level of the editor, which ignores a serious look at the incentives and interests they are facing.

    I have a speculation/hunch (not a hypothesis) about what may be happening. I've become very interested in how inflation of various types moves through an economy. There are more than a few economists at places like GMU who opine on the effects of government subsidies in education, and this phenomenon struck me as a possible candidate for an adaptive reaction to the flood of money into academia over the past 25 years.

    One way to see it is that the stakes have been raised across the board. So there are more folks who are competing for a bigger prize and for a place in a bigger prize pool. Academia isn't like free markets - second place is very valuable, whereas in free markets second place often means zero, no sale. I'd love to see some analysis of the growth of grant dollars, fellowships, pay, etc. to accompany the behavior of the players.

    What many social scientists miss about free markets is that their voluntary nature makes them highly sensitive to payoffs in ways that bureaucracy can never achieve. With volume, behavior attenuates in ways that respond to even subtle changes in the institutional structures within which they operate. Massive changes like the advent of a 300%+ increase in revenue for higher education (in real terms over the past 30 years) will axiomatically affect the behavior of the players involved.

    For example, I wouldn't be surprised to find out that there is a positive feedback between readership of the novel and the publication of more novel articles. But the why is most important. Could it be that there are more journals too? And that they are facing more competition? Could this be caused to some degree by the wildfire of spending in the higher education system? How much has the premium for publication - by individuals and institutions - increased in the same time-frame?

    I think to wander into this topic like some wide-eyed lamb, decrying the baseness of the participants is a bit naive. This is what happens in a "bubble" economy of anything. The quality would predictably go down as in such a 'gold rush' kind of market people of questionable morals are evermore tempted. Just think about that autism quack in the U.K. (blanking, sorry). Think about the stakes involved. Are they higher than ever?

    Again, I'm speculating, but really, a deeper look is called for. When independent evaluations of the work-product of a given system, in this case, the academic research community in general, find that its results are invalid and unreliable, one should be given great pause. Don't kid yourselves, this is already ending badly and will only get worse.
