Friday 6 August 2010

How our current reward structures have distorted and damaged science

Two things almost everyone would agree with:

1. Good scientists do research and publish their results, which then have impact on other scientists and the broader community.

2. Science is a competitive business: there is not enough research funding for everyone, not enough academic jobs in science, and not enough space in scientific journals. We therefore need ways of ensuring that the limited resources go to the best people.

When I started in research in the 1970s, research evaluation focused on individuals. If you wrote a grant proposal, applied for a job, or submitted a paper to a journal, evaluation depended on peer review, a process that is known to be flawed and subject to whim and bias, but is nevertheless regarded by many as the best option we have.

What has changed in my lifetime is the increasing emphasis on evaluating institutions rather than individuals. The 1980s saw the introduction of the Research Assessment Exercise, used to evaluate universities in terms of their research excellence in order to give the national funding council (HEFCE in England) a more objective and rational basis for allocating central funds (quality-related research funding, or QR). The methods for evaluating institutions evolved over the next 20 years, and are still a matter of debate, but they have subtly influenced the whole process of evaluating individual academics, because of the need to use standard metrics.

This is inevitable, because the panel evaluating a subject area can't be expected to read all the research produced by staff at an institution, but they would be criticised for operating an 'old boy network', or favouring their own speciality, if they relied just on personal knowledge of who is doing good work – which was what tended to happen before the RAE. Therefore they are forced into using metrics. The two obvious things that can be counted are research income and number of publications. But number of publications was recognised early on as problematic, as it would mean that someone with three parochial reports in the journal of a national society would look better than someone with a major breakthrough published in a top journal. There has therefore been an attempt to move from quantity to quality, by taking into account the impact factor of the journals that papers are published in.
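Since journal impact factor plays such a large role in what follows, it is worth being clear about what the number actually is: the standard two-year impact factor is simply the citations a journal receives this year to what it published in the previous two years, divided by the number of citable items it published in those two years. A minimal sketch of the arithmetic, with an invented function name and made-up figures rather than anything from a real journal:

```python
def impact_factor(citations_to_recent_items, citable_items_published):
    """Two-year journal impact factor for year Y: citations received in Y
    to items the journal published in Y-1 and Y-2, divided by the number
    of citable items it published in Y-1 and Y-2."""
    return citations_to_recent_items / citable_items_published

# Hypothetical journal: 120 citable items over the previous two years,
# cited 540 times this year.
print(impact_factor(540, 120))  # 4.5
```

The point to notice is that this is a property of the journal as a whole, not of any individual paper published in it.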

Evaluation systems always change the behaviour of those being evaluated, as people attempt to maximise rewards. Recognising that institutional income depends on getting a good RAE score, vice-chancellors and department heads in many institutions now set overt quotas for their staff in terms of expected grant income and number of publications in high impact journals. The jobs market is also affected, as it becomes clear that employability depends on how good one looks on the RAE metrics.

The problem with all of this is that it means that the tail starts to wag the dog. Consider first how the process of grant funding has changed. The motivation to get a grant ought to be that one has an interesting idea and needs money to investigate it. Instead, it has turned into a way of funding the home institution and enhancing employability. Furthermore, the bigger the grant, the more the kudos, and so the pressure is on to do large-scale expensive studies. If individuals were assessed, not in terms of grant income, but in terms of research output relative to grant income, many would change status radically, as cheap, efficient research projects would rise up the scale. In psychology, there has been a trend to bolt on expensive but often unmotivated brain imaging to psychological studies, ensuring that the cost of each experiment is multiplied at least 10-fold. Junior staff are under pressure to obtain a minimum level of research funding, and consequently spend a great deal of time writing grant proposals, and the funding agencies are overwhelmed with applications. In my experience, applications that are written because someone tells you to write one are typically of poor quality, and just waste the time of both applicants and reviewers. The scientist who is successful in meeting their quota is likely to be managing several grants. This may be a good thing if they are really talented, or have superb staff, but in my experience research is done best if the principal investigator puts serious thought and time into the day-to-day running of the project, and that becomes impossible with multiple grants.

Regarding publications, I am old enough to have been publishing before the RAE, and I'm in the fortunate but unusual position of having had full-time research funding for my whole career. In the current system I am relatively safe, and I look good on an RAE return. But most people aren't so fortunate: they are trying to juggle doing research with teaching and administration, raising children and other distractions, yet feel under intense pressure to publish. The worry about the current system is that it will encourage people to cut corners, to favour research that is quick and easy. Sometimes, one is lucky, and a simple study leads to an interesting result that can be published quickly. But the best work typically requires a large investment of time and thought. The studies I am proudest of are ones which have taken years rather than months to complete: in some cases, the time is just on data collection, but in others, the time has involved reading, thinking, and working out ways of analysing and interpreting data. But this kind of paper is getting increasingly rare. As a reviewer, I frequently see piecemeal publication, so that if you suggest that a further analysis would strengthen the paper, you are told that it has been done, but is the subject of another paper. Scholarship and contact with prior literature have become extremely rare: prior research is cited without reading it – or not cited at all – and the notion of research building on prior work has been eroded to the point that I sometimes think we are all so busy writing papers that we have no time to read them. There are growing complaints about an 'avalanche' of low-quality publications.

As noted above, in response to this, there has been a move to focus on quality rather than quantity of publications, with researchers being told that their work will only count if it is published in a high-impact journal. Some departments will produce lists of acceptable journals and will discourage staff from publishing elsewhere. In effect, impact factor is being used as a proxy for likelihood that a paper will be cited in future, and I'm sure that is generally true. But just because a paper in a high impact journal is likely to be highly cited, it does not mean that all highly-cited papers appear in high impact journals. In general, my own most highly-cited papers appeared in middle-ranking journals in my field. Moreover, the highest impact journals have several limitations:

1. They only take very short papers. Yes, it is usually possible to put extra information in 'supplementary material', but what you can't do is to take up space putting the work in context or discussing alternative interpretations. When I started in the field, it was not uncommon to publish a short paper in Nature, followed up with a more detailed account in another, lowlier, journal. But that no longer happens. We only get the brief account.

2. Demand for page space outstrips supply. To handle a flood of submissions, these journals operate a triage system, where the editor determines whether the paper should go out for review. This can have the benefit that rejection is rapid, but it puts a lot of power in the hands of editors, who are unlikely to be specialists in the subject area of the paper, and in some cases are explicit about their preference for papers with a 'wow' factor. It also means that one gets no useful feedback from reviewers: viz. my recent experience with the New England Journal of Medicine, where I submitted a paper that I thought had all the features they'd find attractive – originality, clinical relevance and a link between genes and behaviour. It was bounced without review, and I emailed, not to appeal, but just to ask if I could have a bit more information about the criteria on which they based their rejection. I was told that they could not give me any feedback as they had not sent it out for review.


3. If the paper does go out for review, the subsequent review process can be very slow. There's an account of the trials and tribulations of dealing with Nature and Science which makes for depressing reading. Slow reviewing is clearly not a problem restricted to high impact journals. My experience is that lower-impact journals can be even worse. But the impression from the comments on FemaleScienceProfessor's blog is that reviewers can be unduly picky when the stakes are high.

So what can be done? I'd like to see us return to a time when the purpose of publishing was to communicate, and the purpose of research funding was to enable a scientist to pursue interesting ideas. The current methods of evaluation have encouraged an unstoppable tide of publications and grant proposals, many of which are poor quality. Many scientists are spending time on writing doomed proposals and papers when they would be better off engaging in research and scholarship in a less frenetic and more considered manner. But they won't do that so long as the pressures are on them to bring in grants and generate publications. I'll conclude with a few thoughts on how the system might be improved.

1. My first suggestions, regarding publications, are already adopted widely in the UK, but my impression is they may be less common elsewhere. Requiring applicants for jobs or fellowships to specify their five best publications rather than providing a full list rewards those who publish significant pieces of work, and punishes piecemeal publication. Use of the H-index as an evaluation metric rather than either number of publications or journal impact factor is another way to encourage people to focus on producing substantial papers rather than a flood of trivial pieces, as papers with low citations have no impact whatever on the H-index (a brief sketch of the calculation appears at the end of this list). There are downsides: we have the lag problem, which makes the H-index pretty useless for evaluating junior people, and in its current form the index does not take into account the contribution of authors, thereby encouraging multiple authorship, since anyone who can get their name on a highly-cited paper will boost their H-index, regardless of whether they are a main investigator or freeloader.

2. Younger researchers should be made aware that a sole focus on publishing in very high impact journals may be counter-productive. Rapid publication in an Open Access journal (many of which have perfectly respectable impact factors) may be more beneficial to one's career (http://openaccess.eprints.org/) because the work is widely accessible and so more likely to be cited. A further benefit of the PLOS journals, for instance, is that they don't impose strict length limits, so research can be properly described and put in context, rather than being restricted to the soundbite format required by very high impact journals.

3. Instead of using metrics based on grant income, those doing evaluations should use metrics based on efficiency, i.e. research output relative to grant income. Two problems here: the lag in output is considerable, and the best metric for measuring output is unclear. The lag means it would be necessary to rely on track record, which can be problematic for those starting out in the field. Nevertheless, a move in this direction would at least encourage applicants and funders to think more about value for money, rather than maximising the size of a grant – a trend that has been exacerbated by Full Economic Costing (don't get me started on that). And it might make grant-holders and their bosses see the value of putting less time and energy into writing multiple proposals and more into getting a project done well, so that it will generate good outputs on a reasonable time scale.

4. The most radical suggestion is that we abandon formal institutional rankings (i.e. the successor to the RAE, the REF). I've been asking colleagues who were around before the RAE what they think it achieved. The general view was that the first ever RAE was a useful exercise that exposed weaknesses in institutions and individuals and got everyone to sharpen up their act. But the costs of subsequent RAEs (especially in terms of time) have not been justified by any benefit. I remember a speech given by Prof Colin Blakemore at the British Association for the Advancement of Science some years ago where he made this point, arguing that rankings changed rather little after the first exercise, and certainly not enough to justify the mammoth bureaucratic task involved. When I talk to people who have not known life without an RAE, they find it hard to imagine such a thing, but nobody has put forward a good argument that has convinced me it should be retained. I'd be interested to see what others think.
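As a footnote to suggestion 1: the H-index is the largest number h such that h of a researcher's papers each have at least h citations. The sketch below is my own illustration, with an invented function name and invented citation counts, and simply shows why a flood of barely-cited papers adds nothing to the index:

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Invented citation counts for two hypothetical researchers:
focused = [50, 40, 12, 9, 6]                # five substantial papers
prolific = [50, 40, 12, 9, 6] + [1] * 100   # the same five, plus 100 trivial pieces
print(h_index(focused), h_index(prolific))  # 5 5 - the extra papers change nothing
```

The same arithmetic also exposes the two downsides mentioned above: citations take years to accumulate, so the index lags badly for junior researchers, and every author on a highly-cited paper gets the same boost regardless of their contribution.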


10 comments:

  1. Thanks for this excellent exposition of a very real problem, namely the corruption of science. In my view things like the RAE, and, even worse, some of the bone-headed bean counters within universities, have achieved the exact opposite of their declared intention. The pressure to publish runs the risk of changing science into a branch of spivery rather than the honest occupation that attracted me in my youth. There is a tendency to reward speed and shallowness that results in the promotion of skilful marketing rather than good science. Worse still, the operators so rewarded now appear on journal editorial boards and on the panels of funding agencies where they tend to favour more of their ilk.

    Before anyone suggests that these comments stem from sour grapes, I should say that I've been lucky. I had funding (from MRC and Wellcome) throughout my life and a reasonable share of publications in Nature (though admittedly some of the latter look pretty trivial in retrospect). I have no cause to grumble on my own account, but my heart bleeds for people at the beginning of their careers who are sometimes told by idiotic administrators that they must publish in particular journals if they want promotion. That sort of thing is an active encouragement to dishonesty, and of course research assessment and Full Economic Costing, sensible though they sound, have caused universities to behave dishonestly too. Another example of official encouragement of dishonesty is to be found in the absurd "impact agenda", in which you are asked to predict the eventual applications of discoveries which have not yet been made (see A modest revolt . . .).

    Some excellent suggestions for ways to reduce corruption have been made by Philip Moriarty (Reclaiming academia from post-academia) and Peter Lawrence (Real Lives and White Lies in the Funding of Scientific Research).

    The only answer that I can think of is to put science back into the hands of scientists. It is not the concern of the psychobabble experts in HR and it is not the concern of the perfidious generation of bibliometricians who do no science themselves but presume to judge those who do.

    (continued . . .)

  2. (continued)
    Perhaps the most important thing to do is to limit the number of publications. For years we asked job applicants to submit only their three or four best publications and we read them, and asked applicants about them (it was amazing, and revealing, to see how little many applicants seemed to know about 'their' best publications).

    I feel increasingly that the only way to break the tyranny of a handful of journals is to abandon journals altogether. Eventually perhaps, we could all simply self-publish on the web, and open the comments section. Of course there would be a great deal of low quality stuff but there is a great deal of that already. There is no paper so bad that it won't be accepted by some 'peer-reviewed' journal or other. Nothing could illustrate better the low value of 'peer review' than the fact that PubMed now lists many journals that deal with alternative medicine, in which the effects of magic beans are peer-reviewed by experts in magic beans. The only effect of the enormous amount of time spent on refereeing papers is to maintain a hierarchy of journals, but it does little or nothing to improve the standards of what gets published. We pay the journals to publish it, and pay them again to buy the journals. That is a wonderful gravy train for publishers but does nothing for science. Scientific publishers do more harm than good. They are no longer needed.

    It will take time to see the effects of all these pressures on young researchers. But it does seem odd that universities and the government should adopt policies which seem to be designed to fire every potential FRS and Nobel prize-winner before they get started.

  3. As a relatively junior researcher I can’t thank you both enough for highlighting the insanity of the current system. I grew up in Dorothy’s lab where the emphasis was on doing good work, publishing rapidly and in journals that targeted the people we most wanted to communicate with, in this case clinicians and educators. As a result, I finished my PhD with four first- or single-authored publications in respectable, though not top tier, journals. In fact, I didn’t even know what an impact factor was when I finished my PhD (I used to be a clinician and believe me impact factor means nothing to anyone outside of academia). However, this approach served me well: I have made a significant impact in my field, and this enabled me to get a post-doctoral Fellowship and then a permanent position in a very good Psychology department.

    But life is very different here. We have explicit grant income targets. To achieve these targets we are expected to apply for at least two grants a year, every year. There is not only an expectation that ‘good’ people have multiple grants on the go, there is huge incentive too. We are publicly ranked according to how much money we bring in, more money pretty much guarantees promotion, resources and lab space and our ‘workload’ model means that grant income also brings a reduction in teaching and admin load. The emphasis on money is also highlighted by the fact that I cannot get anyone in the department to review and give feedback on an application until I have had all of the full economic costs calculated. We are also told to which journals we should and should not submit our papers – we are expected to submit at least 2 papers a year to 3* and 4* journals.

    On top of all the other things I do, this is a tall order, but I have managed to hit the submission targets (despite having a baby as well). As far as I can tell though, there is a very small (and surely insignificant) correlation between quality of ideas and work and the ability to get funded or published in a 4* journal. Even when there is success (and one of my four grants did get funded), I attribute this largely to luck of the reviewer draw. Same with papers – most of my outputs (a book, two chapters in psychiatry textbooks, and four journal articles) are deemed unacceptable to the pre-REF committee. The one paper I was most excited about was sent to a ‘high-impact’ journal and has since spent 2 years in review. It is still not published and now only brings me sorrow.

    Sometimes I am completely demoralised by the whole process, but I do love what I do and have enough support and self-confidence to continue with a line of research that endlessly fascinates me. However, there are a number of things I worry about:
    a. Critical comments sink grant applications. People who know how to play the UK game know this and don’t say anything that might be perceived as negative in a review. People outside the UK have the reasonable expectation that constructive criticism improves science, that applicants will have a right to reply, and an opportunity to use constructive comments to improve the science and resubmit the grant. If your grant goes to the latter – unlucky! But surely this is the right approach. My two rejected grants both got high scores (8/10 for the last one) and generally glowing comments. Negative comments were often fair, but sometimes just wrong. In neither case could I address reviewer concerns, nor can I resubmit the grants. So although the consensus from reviewers was that both were fundable, both ideas are now dead. This is just a monumental waste of time for everyone involved.

  4. b. fEC. Don’t get anyone started on it, but surely it is reducing the number of grants that can be funded and thus damaging science.
    c. Genetics and neuroimaging attract more money (and thus more fEC) and higher impact publications. Departments are prioritising this line of research and those of us doing cognitive and behavioural research are facing pressure to use neuroimaging methods to ‘validate’ what we do. I think both are important and have their place, but just including a brain imaging technique in a study should not be equated with better science.
    d. Notion of ‘impact’. Now, not all scientific endeavours have immediate ‘impact’ and that should not matter in the least. But my research might influence the way people assess a child’s communication skills or how they approach intervention and I would consider this impact. But the best way to achieve this impact is to speak to clinicians directly rather than publishing in some high impact journal that they’ve never heard of, have no access to and therefore will never read. I spend a lot of time and energy doing this, but it “doesn’t count” (although it is one aspect of my job I really enjoy and am very proud of). I am contrasted with people who hit on a sexy topic which has no bearing on anything in the real world and is hotly contested – this attracts lots of ‘high-impact’ publications and lots of citations as different camps constantly cite each other complaining about the flaws in stimuli or duration of inter-stimulus interval. Is this really greater impact?
    e. Impact part 2. Sure there is rubbish printed in ‘low-impact’ journals, but there is a considerable amount of rubbish published in high-impact journals too. Anything that can be condensed into a media friendly sound bite is probably not all it is cracked up to be.
    f. We can’t do anything risky. It seems to me that the great scientists of our time are ones who spent lots of time working on a particular problem. Some of the things they tried didn’t work – they learned from these mistakes and got there in the end. We do not have the luxury of learning from mistakes – at the end of the grant we are assessed again and what is the metric – publications! And what if we get a null result and can’t get it published? Disaster! When I think about my paper that has been in review for two years I often think that in an ideal world, I would not have published it yet – it is novel, the numbers are small (it is bloody hard work eye-tracking 6-year-olds with autism) and as always I can think of ways to improve the design and analysis. Ideally I would view these as pilot data, replicate the findings on a different and larger sample, tweak the stimuli to see if we get the same results, take the speculative hypotheses currently in the discussion and test them. But it was a one-year grant – to do all of that takes time (and money) and without a publication at the end of the grant, we can’t get funding to follow it up. And so it goes. But this makes me very reluctant indeed to take on anything riskier – intervention studies for example. And this doesn’t seem a good way to move science forward.

    I’m not sure if there is a solution to these worries, particularly in the current climate. But it is encouraging to me that senior scientists are highlighting the insanity of the current system and thinking aloud about how to fix it. This sends a clear message to my colleagues, many of whom are really struggling to realise a future in science, that there is a different way of working.

  5. @Courtenay
    It seems you've suffered more than most from the bone-headed bean counters. I'd be interested to know more.

    You were either very clever or very lucky to get four papers during your PhD, but that sort of thing is very dependent on what area you are in. It isn't long ago that it took over three years to get a single-channel project finished to our satisfaction. Eventually it worked out quite well and became a Nature article (though it could have failed). That doesn't prevent idiotic bean counters from casting doubt on the publication rate of the quite brilliant postdoc who did much of the work. Some dummkopf casts his or her eye down a CV and jumps to a judgement on utterly inadequate evidence. That's the sort of thing that will kill good science if we don't come to our senses.

  6. I left science after several years as a postdoc, rather disillusioned by the whole show. I was probably too idealistic to play the system. I tried to have good ideas, frame them as hypotheses and test them, but at times I found myself encouraged to fish around and see if anything interesting happened; I thought that experiments should be done to answer specific questions or address specific problems but at times I heard things like "What experiment do we need to do to get this paper into this journal?"; I thought that scientists should have promising ideas and then seek funding for them but I saw people seeking sources of funding and then trying to cobble together some ideas to get the money. Perhaps the latter doesn't happen with the large grants which lab heads seek but I certainly saw this with small grants to support individual postdocs.

    At one point, a fellow postdoc said "If you want to continue in science, you should be looking for your own money". Admittedly, I was going to need money at some point, but at the time I was thinking that to continue in science I needed to be doing decent work on important questions.

    I was never more than moderately good at what I did so my departure is no great loss to science, but there must be very talented people out there who are turning away from science careers.

  7. I've just found a very useful document that gives the historical background to research funding in the UK:
    http://tiny.cc/itlc5

  8. I just found this post - and I think it's a great one! As someone who does both "sexy" and "unsexy" research, I am frequently struck by how much more attention the "sexy" research receives - e.g., brain scan/gene reveals differences between X disorder and Y disorder or between X and controls! - and how I can market it more frequently to higher tier journals. I consider my "unsexy" research line to often be far more theoretically sound, but by default, it seems to be relegated to specialty journals in my sub-field.

    Additionally, as you mentioned (and I wish you had gone into this in greater detail), the review process itself is quite whimsical. It's been my experience that each journal seems to have a small coterie who regularly publish in it and heavily police all other work that comes into it. This is something that has worked both in my favor and against me. My graduate school mentor is a well-known name in certain circles, and I am sure just having their name on some of my manuscripts helped publication.

    I have had the complete opposite experience as well, when I have tried for journals that we normally don't publish in. One recent experience really soured the whole review process for me. I had recently submitted a paper to what is widely considered the "best" journal in my discipline. I can easily say the reviews I got were the most biased I have ever received. For example, one of the reviewers stated that they did not understand some of our analyses and were not familiar with the software we had used, but still proceeded to say that our results were incorrect! Another review for the same ms was about 5-6 lines long and simply said that my study didn't make sense. Why, you ask? Because it contradicted this reviewer's theory and they *knew* their theory was right! Of course, no citations were provided to back up such assertions. Following this, I resubmitted the same manuscript to other journals, where this aforementioned reviewer was asked again to review the paper. This person then proceeded to contact me by email, outside the review process and while the paper was still under review (!), to insist their theory was correct and that I was wrong to even conduct my study!

    I have heard of similar ridiculous stories (see the humorous "Reviewer 2 Must Be Stopped" Facebook page - http://www.facebook.com/group.php?gid=71041660468 for some great examples). Naturally, we all have our biases in reviewing grants, papers, etc., but such blatant manipulation of what is considered fundable and publication-worthy is such a gross distortion of the whole process of science.

  9. Charlie Wilson (@crewilson) 1 March 2011 at 22:39

    I'm even earlier in my career Courtenay, and also very much appreciate everything above. I have only had one encounter with bean counting madness so far. It just made me laugh.

    What I really want is quite simple - time. I'm finishing a 2 year postdoc on one project, and am lucky to have a European fellowship for 2 years, albeit on a different project. After that, funding and fellowship options seem to be 3 or 5 years at the most.

    Yet none of my studies are ever likely to take less than 3 or 4 years, even if I do a rush job and throw something at a journal to try and beef up my impact/citations/H-index - my last paper was a 4 and a half year study. Frankly it is this, at the moment, that might push me to make the errors laid out above, more than anything else.

    As I see it, therefore, although I agree with all of your suggestions above, two more things would need to be in place to allow me not to be tempted to act just as you describe:

    1. More time to do my work, hence longer fellowships, grants etc. It's cost neutral, of course, because you give fewer. Getting the European thing was a lot of work, and in amongst the Euro-guff there's lots of aspirational talk of helping early career researchers. So why give me a period of time in which I can hope to do half a project at most?

    2. Core funding. As in, the sort of recurrent funding that smooths over the bumps and reduces the reliance on the mechanisms described above just a little. The Blakemore talk you mention discusses the value of that, and its erosion in the UK. We have a little of it here in France (not for much longer, it seems), and it really completely changes the ethos of the Institute and the interactions between researchers.

  10. Charlie - the encouraging news is that this seems to be the way funding is going and will be very nice indeed - if you can get one of these longer term grants. The (potential) problems are that pretty much all of them require you to be in a permanent post, and remaining competitive and/or having the time to carry out the work while carrying a full teaching load is going to be a bit tricky. And of course it will mean that fewer people get funded at all. At least you have fine wine to enjoy...
