Sunday 29 May 2016

Ten serendipitous findings in psychology

The Thatcher Illusion (see below)
I'm a great fan of pre-registration of studies. It is, to my mind, the most effective safeguard against p-hacking and publication bias, the twin scourges that have led to the literature being awash with false positive findings. When combined with a more formal process, as in Registered Reports, it also allows researchers to benefit from reviewer expertise before they do the study, and to take control of the publication timeline.

But one salient objection to pre-registration comes up time and time again: if we pre-register our studies it will destroy the creative side of doing science, and turn it instead into a dull, robotic, cheerless process. We will have to anticipate what we might find, and close our eyes to what the data tell us.

Now this is both silly and untrue. For a start, there's nobody stopping anyone from doing fairly unstructured exploration, which may be the only sensible approach when entering a completely new area. The main thing in that case is to just be clear that this is what it is, and not to start applying statistical tests to the findings. If a finding has emerged from observing the data, testing it with p-values is statistically illiterate.

Nor is there any prohibition on reporting unexpected findings that emerge in the course of a study. Suppose you do a study with a pre-registered hypothesis and analysis plan, which you adhere to. Meanwhile, a most exciting, unanticipated phenomenon is observed in your experiment. If you are going down the kind of registered reports pathway used in Cortex, you report the planned experiment, and then describe the novel finding in a separate section. Hypothesis-testing and exploration are clearly delineated and no p-values are used for the latter.

In fact, with any new exciting observation, any reputable scientist would take steps to check its repeatability, to explore the conditions under which it emerges, and to attempt to develop a theory that can account for it. In effect, all that has happened is that the 'data have spoken' and suggested a new hypothesis, which could potentially be registered and evaluated in the usual way.

But would there be instances of important findings that would have been lost to history if we had started using pre-registration years ago? Because I wanted examples of serendipitous findings to test this point, I asked Twitter, and lo, Twitter delivered some cracking examples. All of these predate by many years the notion of pre-registration, but note that, in all cases, having made the initial unexpected observation – either from unstructured exploratory research, or in the course of investigating something else – the researchers went on to shore up the findings with further, hypothesis-driven experiments. What they did not do is to report just the initial observation, embellished with statistics, and then move on, as if the presence of a low p-value guaranteed the truth of the result.

Here are ten phenomena well-known to psychologists that show how the combination of chance and the prepared mind can lead to important discoveries*. Where I could find one, I cite a primary source, but readers should feel free to contribute further background information.

1. Classical conditioning, Pavlov, 1902. 
The conventional account of Pavlov's discovery goes like this: He was a physiologist interested in processes of digestion and was studying the tendency of dogs to salivate when presented with food. He noted that over time, the dogs would salivate when the lab assistant entered the room, even before the food was presented, thus discovering the 'conditioned response': a response that is learned by association. A recent account is here. I was not able to find any confirmation of the serendipitous event in either Pavlov's Nobel speech, or in his Royal Society obituary, so it would be interesting to know if this is described anywhere in his own writings or those of his contemporaries.

One thing that I did (serendipitously) discover from the latter source was this intriguing detail, which makes it clear that Pavlov would never have had any truck with p-values, even if they had been in use in 1902: "He never employed mathematics even in its elementary form. He frequently said that mathematics is all very well but it confuses clear thinking almost to the same extent as statistics."

Suggested by @speech_woman @smomara1 @AglobeAgog 

2. Psychotropic drugs, 1950s 
Chance appears to have played an important role in the discovery of many psychotropic drugs in the early days of psychopharmacology. For instance, tricyclics were initially used to treat tuberculosis, when it was noticed that there was an unanticipated beneficial effect on mood. Even more striking is Hofmann's first-hand account of discovering the psychotropic effects of LSD, which he had developed as a potential circulatory stimulant. After experiencing strange sensations during a laboratory session, Hofmann returned to test the substances he had been working with, including LSD. "Even the first minimum dose of one quarter of a milligram induced a state of intoxication with very severe psychic disturbances, and this persisted for about 12 hours… This first planned experiment with LSD was a particularly terrifying experience because at the time, I had no means of knowing if I should ever return to everyday reality and be restored to a normal state of consciousness. It was only when I became aware of the gradual reinstatement of the old familiar world of reality that I was able to enjoy this greatly enhanced visionary experience".

Suggested by @ollirobinson @kealyj @neuroraf 

3. Orientation-sensitive receptive fields in visual cortex, 1959 
In his Nobel speech, David Hubel recounts how he and Torsten Wiesel were trying to plot receptive fields of visual cortex neurons using dots of light projected onto a screen, with only scant success, when they observed a cell that gave a massive response as a slide was inserted, creating a faint but sharp shadow on the retina. As he memorably put it, "over the audiomonitor, the cell went off like a machine gun". This initial observation led to a rich vein of research, but, again to quote from Hubel "It took us months to convince ourselves that we weren’t at the mercy of some optical artefact".

Suggested by: @jpeelle @Anth_McGregor @J_Greenwood @theExtendedLuke @nikuss @sophiescott @robustgar

4. Right ear advantage in dichotic listening, 1961 
Doreen Kimura reported that when groups of digits were played to the two ears simultaneously, more were reported back from the right than the left ear (review here). This method was subsequently used for assessing cerebral lateralisation in neuropsychological patients, and a theory was developed that linked the right ear advantage to cerebral dominance for language. I have not been able to access a published account of the early work, but I recall being told during a visit to the Montreal Neurological Institute that it had taken time for the right ear advantage to be recognised as a real phenomenon and not a consequence of unbalanced headphones. The method of dichotic listening dated back to Broadbent or earlier, but it had originally been used to assess selective attention rather than cerebral lateralisation.

5. Phonological similarity effect in STM, 1964 
Conrad and Hull (1964) described what they termed 'acoustic confusions' when people were recalling short sequences of visually-presented letters, i.e. errors tended to involve letters that rhymed with the target letter, such as P, D, or G. In preparation for an article celebrating his 100th birthday, I recently listened to a recording of Conrad describing this early work, and explaining that when such errors were observed with auditory presentation, it was assumed they were due to mishearings. Only after further experiments did it become clear that the phenomenon arose in the course of phonological recoding in short-term memory. 

6. Hippocampal place cells, 1971 
In his 2014 Nobel lecture,  John O'Keefe describes a nice example of unconstrained exploratory research: "… we decided to record from electrodes … as the animal performed simple memory tasks and otherwise went about its daily business. I have to say that at this stage we were very catholic in our approach and expectations and were prepared to see that the cells fire to all types of situations and all types of memories. What we found instead was unexpected and very exciting. Over the course of several months of watching the animals behave while simultaneously listening to and monitoring hippocampal cell activity it became clear that there were two types of cells, the first similar to the one I had originally seen which had as its major correlate some non-specific higher-order aspect of movements, and the second a much more silent type which only sprang into activity at irregular intervals and whose correlate was much more difficult to identify. Looking back at the notes from this period it is clear that there were hints that the animal’s location was important but it was only on a particular day when we were recording from a very clear well isolated cell with a clear correlate that it dawned on me that these cells weren’t particularly interested in what the animal was doing or why it was doing it but rather they were interested in where it was in the environment at the time. The cells were coding for the animal’s location!" Needless to say, once the hypothesis of place cells had been formulated, O'Keefe and colleagues went on to test and develop it in a series of rigorous experiments.

7. McGurk effect, 1976 
In a famous paper, McGurk and MacDonald reported a dramatic illusion: when watching a talking head, in which repeated utterances of the syllable [ba] are dubbed on to lip movements for [ga], normal adults report hearing [da]. Those who recommended this example to me mentioned that the mismatching of lips and voices arose through a dubbing error, and there was even the idea that a technician was disciplined for mixing up the tapes, but I've not found a source for that story. I noted with interest that the Nature paper reporting the findings does not contain a single p-value.
Suggested by: @criener @neuroconscience @DrMattDavis 

8. Thatcher illusion, 1980 
Peter Thompson kindly sent me an account of his discovery of the Thatcher Illusion (downloadable from here, p. 921). His goal had been to illustrate how spatial frequency information is used in vision, entailing that viewing the same image close up and at a distance will give very different percepts if low spatial frequencies are manipulated. He decided to illustrate this with pictures of Margaret Thatcher, one of which he doctored to invert the eyes and mouth, creating an impressively hideous image. He went to get sellotape to fix the material in place, but noticed that when he returned, approaching the table from the other side, the doctored images were no longer hideous when inverted. Had he had sellotape to hand, we might never have discovered this wonderful illusion.

Suggested by @J_Greenwood 

9. Repetition blindness, 1987 
Repetition blindness, described here by Nancy Kanwisher, is the phenomenon whereby people have difficulty detecting repeated words that are presented using rapid serial visual presentation (RSVP) - even when the two occurrences are nonconsecutive and differ in case. I could not find a clear account of the history of the discovery, but it seems that researchers investigating a different problem thought that some stimuli were failing to appear, and then realised these were the repeated ones.

Suggested by @PaulEDux 

10. Mirror neurons, 1992 
Giacomo Rizzolatti and colleagues were recording from cells in the macaque premotor cortex that responded when the animal reached for food, or bit a peanut. To their surprise, they noticed, when testing the animals, that the same cell that responded when the monkey picked up a peanut also responded when the experimenter did so (see here for summary). Ultimately, they dubbed these cells 'mirror neurons' because they responded both to the animal's own actions and when the animal observed another performing a similar action. The story that mirror neurons were first identified when they started responding during a coffee break as Rizzolatti picked up his espresso appears to be apocryphal.

Suggested by: @brain_apps @neuroraf @ArranReader @seriousstats @jameskilner @RRocheNeuro 

 *I picked ones that I deemed the clearest and best-known examples. Many thanks to all the people who suggested others.

Tuesday 24 May 2016

Who wants the TEF?

I'll say this for the White Paper on Higher Education "Success as a Knowledge Economy": it's not as bad as the Green Paper that preceded it. The Green Paper had me abandoning my Christmas shopping for furious tirades against the errors and illogicality that were scattered among the exhausted clichés and management speak (see here, here, here, here and here). So appalled was I at the shoddy standards evident in the Green Paper that I actually went through all the sources quoted in the first section of the White Paper to contact the authors to ask if they were happy with how their work had been reported. I'm pleased to say that out of 12 responses I got, ten were entirely satisfied, and one had just a minor quibble. But what about the twelfth, you ask. What indeed?
When justifying the need for a Teaching Excellence Framework (TEF) last November, Jo Johnson used some extremely dodgy statistical analysis of the National Student Survey to support his case that teaching in some quarters was 'lamentable'. I was pleased to see that this reference was expunged from the White Paper. But that left a motheaten hole in the fabric of the argument: if students aren't dissatisfied, then do we really need a TEF?  One could imagine the civil servants rushing around desperate to find a suitably negative statistic. And so they did, citing the 2015 HEPI-HEA Student Academic Experience Survey as showing that "Many students are dissatisfied with the provision they receive, with over 60% of students feeling that all or some elements of their course are worse than expected and a third of these attributing this to concerns with teaching quality." (p 8, para 5).  The same report is subsequently cited as showing that: ".. applicants are currently poorly-informed about the content and teaching structure of courses, as well as the job prospects they can expect. This can lead to regret: the recent Higher Education Academy (HEA)–Higher Education Policy Institute (HEPI) Student Academic Experience Survey found that over one third of undergraduates in England believe their course represents very poor or poor value for money." The trouble is, both of these quotes again use spin and dodgy statistics.
Let's take the 60% dissatisfaction statistic first. The executive summary of the report stated: "Most students are satisfied with their course, with 87% saying that they are very or fairly satisfied, and only 12% feeling that their course is worse than they expected. However, for those students who feel that their course is worse than expected, or worse in some ways and better than others, the number one reason is not the number of contact hours, the size of classes or any problems with feedback but the lack of effort they themselves put in." So how do we get to 60% dissatisfied? This number is arrived at by adding together the 12% who said that their experience had been worse than expected and the 49% who said that it had been better in some ways and worse in others, giving 61%. So it is literally true that there is dissatisfaction with 'some or all elements', but the presentation of the data is clearly biased to accentuate the negative. One is reminded of Hugh in 'The Thick of It' saying "I did not knowingly not tell the truth".
But it gets worse: As pointed out on the Wonkhe blog, among 'key facts' in a briefing note accompanying the White Paper, the claim was reworded to say over 60% of students said they feel their course is worse than expected. The author of the blogpost referred to this as substantial misrepresentation of the survey. This is serious because it appears that in order to make a political point, the government is spreading falsehoods that could cause reputational damage to Universities.
Moving on to perceptions of 'value for money', there are two reasons for giving this a low rating: you are paying a reasonable amount for something of poor quality, or you are paying an unreasonable amount for something of good quality. Alex Buckley, one of the authors of the report, replied to my query to say that while the numeric data were presented accurately, crucial context was omitted. That context made it crystal clear that it was the money side of the equation that concerned students. He wrote:
"Figure 11 on page 17 of the 2015 HEPI-HEA survey report shows that students from England (paying £9k) and students from Scotland studying in Scotland (paying no fees) have very different perceptions of value for money. And Figure 12 shows that the perceptions of value for money of students from England plummeted at the time of the increase in fees. Half of 2nd year students from England in 2013 thought they were getting good or very good value for money. In 2014, when 2nd years were paying £9k, that figure was a third. (Other global perceptions of quality - satisfaction etc. - did not change). There is something troubling about the Government citing students' perceptions of value for money as a problem for the sector, when they appear to be substantially determined by Government policy, i.e. the level of fees. The survey suggests that an easy way to improve students' perceptions of the value for money of their degree would be to reduce the level of fees - presumably not the message that the Government is trying to get across."
So do students want the TEF? All the indicators say no. Chris Havergal wrote yesterday in the Times Higher about a report by David Greatbatch and Jane Holland in which students in focus groups gave decidedly lukewarm responses to questions about the usefulness of TEF. Insofar as anyone wants information about teaching quality, they want it at the level of courses rather than institutions, but, as an ONS interim review pointed out, the data is mostly too sparse to reliably differentiate among institutions at the subject level. Meanwhile, the NUS has recommended boycotting the National Student Survey, which forms a key part of the metrics to be used by TEF.
This is all rather rum, given that the government claims its reforms will put students at the heart of higher education. It seems that they have underestimated the intelligence of students, who can see through the weasel words and recognise that the main outcome of all the reforms will be further increases in fees.
It's widely anticipated that fees will rise because of the market competition that the White Paper lauds as a positive stimulus to the sector, and it was clear in the Green Paper that one goal of the reforms was to tie the TEF to a regulatory mechanism that would allow higher fees to be set by those with good TEF scores. Perhaps less widely appreciated is that the plan is for the new Office for Students to be funded largely by subscriptions paid by Higher Education Providers. They will have to find the money somewhere, and the obvious way to raise the cash will be by raising fees. So students will be at the heart of the reforms in the sense that, having already endured dramatic rises in fees and loss of the maintenance grant, they will now also be picking up the bill for a new regulatory apparatus whose main function is to satisfy a need for information that they do not want.

Saturday 7 May 2016

Would paying by results improve reproducibility?

Twitter and Facebook were up in arms last week. "Merck wants its money back if University research is wrong" was the headline to the article that set off the outrage.

Commentators variously described the idea as dangerous, preposterous and outrageous, and a 'worrying development', while at the same time accusing Merck of hypocrisy for its history of misleading claims about its vaccines and drugs.

In the comments beneath the article, similar points were made: 'No way any academic institutions will agree to this'; 'If you want absolute truth take up religion'; 'Merck wants risk-free profit'.

But if you follow the link to the article that the story was based on, it's clear that the headline in the Technology Review piece was misleading.

For a start, the author of the piece, Michael Rosenblatt, was clear that he was not representing an official position. Under 'competing interests' we are told: "M.R. is an employee of and owns stock and stock options in Merck & Co. The opinions expressed in this article are those of the author and do not necessarily correspond to the views of the author’s employer."

Rosenblatt's focus is irreproducible scientific research: a topic that is receiving increasing attention in biomedicine among other disciplines. He notes the enormous waste that occurs when a pharma company like Merck attempts to use basic biomedical research to develop new drug targets. Whether or not you approve of Merck's track record, there is no question but that they have legitimate concerns. He explains that the costs of building a translational project on shaky biomedical foundations are so great that pharma companies will now routinely try to replicate original findings before taking them forward. Except that, all too often, they find they cannot do so. Since it can take between two and six scientists one to two years to conduct a replication, the costs of irreproducible science are substantial.

Rosenblatt notes that there have been numerous suggestions for improving reproducibility of biomedical science; mostly these concern aspects of training, and altering promotion practices so that academic scientists will have an incentive to do reproducible research. His proposal is that pharmaceutical companies could also address incentives by making their financial deals with universities contingent on the reproducibility of the results.

In effect, Rosenblatt wants those who do the research in universities to ensure it is reproducible before they bring it forward for pharmaceutical companies to develop. His proposal does indeed include a 'money back' guarantee, so that if a result did not hold up, the pharmaceutical company would be compensated. But, Rosenblatt argues, this would be offset by additional funds made available to universities to enable them to adopt more reproducible practices, including, where necessary, replicating findings before taking them forward.

The Technology Review headline misses all this nuance and just implies that Merck is being naïve and unrealistic in assuming that scientists have some kind of pre-cognition about the outcomes of experiments. That is far from being the case: the emphasis is rather on the importance of ensuring that findings are as solid as possible before publishing them, and on suggesting that pharma companies could play a role in incentivising reproducible practices by tweaking their funding arrangements with Universities.

Obviously, the devil would be in the detail, and Rosenblatt is careful to avoid being too specific, suggesting instead that this is not intended as a panacea, but rather as an idea that would be worth piloting. I'm as cynical as the next person about the motives of pharmaceutical companies, but I have to say I think this is an interesting suggestion that could be in the interests of scientific progress, as well as benefiting both universities and pharmaceutical companies.