BishopBlog: research methods

Sunday, 24 March 2024

Just make it stop! When will we say that further research isn't needed?

I have a lifelong interest in laterality, which is a passion that few people share. Accordingly, I am grateful to René Westerhausen, who runs the Oslo Virtual Laterality Colloquium, with monthly presentations on topics as diverse as chiral variation in snails and laterality of gesture production.

On Friday we had a great presentation from Lottie Anstee, who told us about her Masters project on handedness and musicality. There have been various studies on this topic over the years, some claiming that left-handers have superior musical skills, but samples have been small and results have been mixed. Lottie described a study with an impressive sample (nearly 3000 children aged 10-18 years), whose musical abilities were evaluated on a detailed music assessment battery that included self-report and perceptual evaluations. The result was convincingly null, with no handedness effect on musicality.

What happened next was what always happens in my experience when someone reports a null result. The audience made helpful suggestions for reasons why the result had not been positive and suggested modifications of the sampling, measures or analysis that might be worth trying. The measure of handedness was, as Lottie was the first to admit, very simple - perhaps a more nuanced measure would reveal an association? Should the focus be on skilled musicians rather than schoolchildren? Maybe it would be worth looking at nonlinear rather than linear associations? And even though the music assessment was pretty comprehensive, maybe it missed some key factor - amount of music instruction, or experience of specific instruments.

After a bit of to and fro, I asked the question that always bothers me. What evidence would we need to convince us that there is really no association between musicality and handedness? The earliest study that Lottie reviewed was from 1922, so we've had over 100 years to study this topic. Shouldn't there be some kind of stop rule? This led to an interesting discussion about the impossibility of proving a negative and whether we should be using Bayes Factors, and what would be the smallest effect size of interest.

My own view is that further investigation of this association would prove fruitless. In part, this is because I think the old literature (and to some extent the current literature!) on factors associated with handedness is at particular risk of bias, so even the messy results from a meta-analysis are likely to be over-optimistic. More than 30 years ago, I pointed out that laterality research is particularly susceptible to what we now call p-hacking - post hoc selection of cut-offs and criteria for forming subgroups, which dramatically increases the chances of finding something significant. In addition, I noted that measurement of handedness by questionnaire is simple enough to be included in a study as a "bonus factor", just in case something interesting emerges. This increases the likelihood that the literature will be affected by publication bias - the handedness data will be reported if a significant result is obtained, but otherwise can be disregarded at little cost. So I suspect that most of the exciting ideas about associations between handedness and cognitive or personality traits are built on shaky foundations, and would not replicate if tested in well-powered, preregistered studies. But somehow, the idea that there is some kind of association remains alive, even if we have a well-designed study that gives a null result.
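To make the point about post hoc cut-offs concrete, here is a minimal simulation sketch. It is not taken from any of the studies mentioned above, and everything in it is an assumption chosen for illustration: a hypothetical continuous handedness score and a musicality score are drawn independently (so the true association is zero), the handedness score is split into two groups at a cut-off, and the groups are compared with a large-sample z-test. The sketch contrasts a single pre-specified cut-off with picking whichever of several candidate cut-offs gives the smallest p-value. The function names, the normal distributions, the sample size and the cut-off values are all hypothetical.

```python
import random
from statistics import NormalDist


def mean_and_var(xs):
    """Return the sample mean and unbiased sample variance of a list."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v


def two_group_p(a, b):
    """Two-sided p-value for a difference in means.

    Uses a large-sample z approximation (an assumption made to keep
    the sketch dependency-free), not an exact t-test.
    """
    mean_a, var_a = mean_and_var(a)
    mean_b, var_b = mean_and_var(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    z = (mean_a - mean_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=1000, n=200,
             cutoffs=(-0.5, -0.25, 0.0, 0.25, 0.5),
             fixed=0.0, alpha=0.05, seed=1):
    """Estimate false-positive rates when there is no true effect.

    Returns (rate with one pre-specified cut-off,
             rate when the best of several cut-offs is reported).
    `fixed` must be one of `cutoffs`.
    """
    rng = random.Random(seed)
    fixed_hits = 0
    flexible_hits = 0
    for _ in range(n_sims):
        # Hypothetical scores, independent by construction (null is true).
        hand = [rng.gauss(0, 1) for _ in range(n)]
        music = [rng.gauss(0, 1) for _ in range(n)]
        pvals = {}
        for c in cutoffs:
            left = [m for h, m in zip(hand, music) if h < c]
            right = [m for h, m in zip(hand, music) if h >= c]
            pvals[c] = two_group_p(left, right)
        fixed_hits += int(pvals[fixed] < alpha)
        flexible_hits += int(min(pvals.values()) < alpha)
    return fixed_hits / n_sims, flexible_hits / n_sims


if __name__ == "__main__":
    fixed_rate, flexible_rate = simulate()
    print(f"pre-specified cut-off: {fixed_rate:.3f}")
    print(f"best of several cut-offs: {flexible_rate:.3f}")
```

Because the pre-specified cut-off is one of the candidates, the flexible rate can never be lower than the fixed one. Under these assumptions the fixed rate should sit close to the nominal 5%, and the flexible rate should typically come out higher, even though nothing is there to find.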

Laterality is not the only area where there is no apparent stop rule. I've complained of similar trends in studies of association between genetic variants and psychological traits, for instance, where instead of abandoning an idea after a null study, researchers slightly change the methods and try again. In 2019, Lisa Feldman Barrett wrote amusingly about zombie ideas in psychology, noting that some theories are so attractive that they seem impossible to kill. I hope that as preregistration becomes more normative, we may see more null results getting published, and learn to appreciate their value. But I wonder just what it takes to get people to conclude that a research seam has been mined to the point of exhaustion.

Friday, 9 February 2018

Improving reproducibility: the future is with the young

I've recently had the pleasure of reviewing the applications to a course on Advanced Methods for Reproducible Science that I'm running in April together with Marcus Munafo and Chris Chambers. We take a broad definition of 'Reproducibility' and cover not only ways to ensure that code and data are available for those who wish to reproduce experimental results, but also how to design, analyse and pre-register studies to give replicable and generalisable findings.

There is a strong sense of change in the air. Last year, most applicants were psychologists, even though we prioritised applications in biomedical sciences, as we are funded by the Biotechnology and Biological Sciences Research Council and European College of Neuropsychopharmacology. The sense was that issues of reproducibility were not so high on the radar of disciplines outside psychology. This year things are different. We again attracted a fair number of psychologists, but we also have applicants from fields as diverse as gene expression, immunology, stem cells, anthropology, pharmacology and bioinformatics.

One thing that came across loud and clear in the letters of application to the course was dissatisfaction with the status quo. I've argued before that we have a duty to sort out poor reproducibility because it leads to enormous waste of time and talent of those who try to build on a glitzy but non-replicable result. I've edited these quotes to avoid identifying the authors, but these comments - all from PhD students or postdocs in a range of disciplines - illustrate my point:

'I wanted to replicate the results of an influential intervention that has been widely adopted. Remarkably, no systematic evidence has ever been published that the approach actually works. So far, it has been extremely difficult to establish contact with initial investigators or find out how to get hold of the original data for re-analysis.'

'I attempted a replication of a widely-cited study, which failed. Although I first attributed it to a difference between experimental materials in the two studies, I am no longer sure this is the explanation.'

'I planned to use the methods of a widely cited study for a novel piece of research. The results of this previous study were strong, published in a high impact journal, and the methods apparently straightforward to implement, so this seemed like the perfect approach to test our predictions. Unfortunately, I was never able to capture the previously observed effect.'

'After working for several years in this area, I have come to the conclusion that much of the research may not be reproducible. Much of it is conducted with extremely small sample sizes, reporting implausibly large effect sizes.'

'My field is plagued by irreproducibility. Even at this early point in my career, I have been affected in my own work by this issue and I believe it would be difficult to find someone who has not themselves had some relation to the topic.'

'At the faculty I work in, I have witnessed that many people are still confused about or unaware of the very basics of reproducible research.'

Clearly, we can't generalise to all early-career researchers: those who have applied for the course are a self-selected bunch. Indeed, some of them are already trying to adopt reproducible practices, and to bring about change to the local scientific environment. I hope, though, that what we are seeing is just the beginning of a groundswell of dissatisfaction with the status quo. As Chris Chambers suggested in a podcast, I think that change will come more from the grassroots than from established scientists.

We anticipate that the greater diversity of subjects covered this year will make the course far more challenging for the tutors, but we expect it will also make it even more stimulating and fun than last year (if that is possible!). The course lasts several days and interactions between people are as important as the course content in making it work. I'm pretty sure that the problems and solutions from my own field have relevance for other types of data and methods, but I anticipate I will learn a lot from considering the challenges encountered in other disciplines.

Training early career researchers in reproducible methods does not just benefit them: those who attended the course last year have become enthusiastic advocates for reproducibility, with impacts extending beyond their local labs. We are optimistic that as the benefits of reproducible working become more widely known, the face of science will change so that fewer young people will find their careers stalled because they trusted non-replicable results.
