Friday 20 April 2012

Getting genetic effect sizes in perspective

My research focuses on neurodevelopmental disorders - specific language impairment, dyslexia, and autism in particular. For all of these there is evidence of genetic influence. But the research papers reporting relevant results are often incomprehensible to people who aren’t geneticists (and sometimes to those who are).  This leaves us ignorant of what has really been found, and subject to serious misunderstandings.
Just as preamble, evidence for genetic influences on behaviour comes in two kinds. The first approach, sometimes referred to as genetic epidemiology or behaviour genetics, allows us to infer how far genes are involved in causing individual differences by studying similarities between people who have different kinds of genetic relationship. The mainstay of this field is the twin study. The logic of twin studies is pretty simple, but the methods currently used to analyse twin data are complex. The twin method is far from perfect, but it has proved useful in helping us identify which conditions are worth investigating using the second approach, molecular genetics.
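The simple logic can be captured by Falconer's classic formulas: MZ twins share all their genes, DZ twins on average half, so heritability can be approximated as twice the difference between the MZ and DZ correlations. This is a deliberate back-of-envelope simplification of the structural equation models used in modern twin analyses, but it conveys the idea. A sketch in Python (the correlations below are hypothetical, not real data):

```python
# Falconer's formulas: a back-of-envelope ACE decomposition from twin
# correlations. Modern analyses use structural equation modelling
# (e.g. OpenMx), but the underlying logic is the same.

def falconer_ace(r_mz, r_dz):
    """Estimate additive genetic (a2), shared environment (c2) and
    nonshared environment (e2) variance components from MZ and DZ
    twin correlations."""
    a2 = 2 * (r_mz - r_dz)   # MZ share all genes, DZ on average half
    c2 = 2 * r_dz - r_mz     # shared environment inflates both correlations
    e2 = 1 - r_mz            # whatever even MZ twins do not share
    return a2, c2, e2

# Hypothetical twin correlations for a reading measure
a2, c2, e2 = falconer_ace(r_mz=0.80, r_dz=0.50)
print(f"a2 = {a2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")  # 0.60, 0.20, 0.20
```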
Molecular genetics involves finding segments of DNA that are correlated with a behavioural (or other phenotypic) measure. It involves laboratory work analysing biological samples of people who’ve been assessed on relevant measures. So if we’re interested in, say, dyslexia, we can either look for DNA variants that predict a person’s reading ability - a quantitative approach - or we can look for DNA variants that are more common in people who have dyslexia. There’s a range of methods that can be used, depending on whether the data come from families - in which case the relationship between individuals can be taken into account - or whether we just have a sample of unrelated people who vary on the behaviour of interest, in this case reading ability.
The really big problem comes from a tendency in molecular genetics to focus just on p-values when reporting findings. This is understandable: the field of molecular genetics has been plagued by chance findings. This is because there are vast amounts of DNA that can be analysed, and if you look at enough things, the odd result will pop up as showing a group difference just by chance. (See this blogpost for further explanation). The p-value indicates whether an association between a DNA variant and a behavioural measure is a solid finding that is likely to replicate in another sample.
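The scale of the problem is easy to simulate. When none of the variants tested is truly associated, p-values are uniformly distributed between 0 and 1, so testing m variants at the conventional .05 threshold produces roughly m × .05 false positives. This is why genome-wide studies demand thresholds near 5 × 10⁻⁸. A sketch with made-up numbers:

```python
# Simulate the multiple-testing problem: under the null hypothesis,
# p-values are uniform on (0, 1), so a fixed proportion of purely
# chance results falls below any threshold you pick.
import random

random.seed(1)
n_variants = 100_000                  # variants tested, none truly associated
p_values = [random.random() for _ in range(n_variants)]

hits_05 = sum(p < 0.05 for p in p_values)
hits_gw = sum(p < 5e-8 for p in p_values)  # genome-wide significance threshold
print(f"'significant' at .05 by chance alone: {hits_05}")  # expect ~5000
print(f"significant at 5e-8: {hits_gw}")                   # almost certainly 0
```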
But a p-value depends on two things: (a) the strength of association between DNA and behaviour (effect size) and (b) the sample size. Psychologists, many of whom are interested in genetic variants linked to behaviour, are mostly used to working with samples that number in the tens rather than hundreds or thousands. It’s easy, therefore, to fall into the trap of assuming that a very low p-value means we have a large effect size, because that’s usually the case in the kind of studies we’re used to. Misunderstanding can arise if effect sizes are not reported in a paper.
Suppose we have a genetic locus with two alleles, a and A, and a geneticist contrasts people with an aa genotype vs those with aA or AA (who are grouped together). We read a molecular genetics paper that reports an association between these genotypes and reading ability with p-value of .001. Furthermore, we see there are other studies in the literature reporting similar associations, so this seems a robust finding. You could be forgiven for concluding that the geneticists have found a “dyslexia gene”, or at least a strong association with a DNA variant that will be useful in screening and diagnosis. And, if you are a psychologist, you might be tempted to do further studies contrasting people with aa vs aA/AA genotypes on behavioural or neurobiological measures that are relevant for reading ability.
However, this enthusiasm is likely to evaporate if you consider effect sizes. There is a nice little function in R that allows you to compute effect size easily if you know a p-value and a sample size. The table below shows:
  •  effect sizes (Cohen’s d, which gives mean difference between groups in z-score units)
  •  the average reading score for each group, scaled so that the mean for the aA/AA group is 100 with an SD of 15
Results are shown for various sample sizes with equal numbers of aa vs aA/AA and either p = .001 or p = .0001. (See the reference manual for the R function for relevant formulae, which are also applicable in cases of unequal sample size). For those unfamiliar with this area, a child would not normally be flagged up as having reading problems unless a score on a test scaled this way was 85 or less (i.e., 1 SD below the mean).

Table 1: Effect sizes (Cohen’s d) and group means derived from p-value and sample size (N)

When you have the kind of sample size that experimental or clinical psychologists often work with, with 25 participants per group, a p of .001 is indicative of a big effect, with a mean difference between groups of almost one SD. However, you may be surprised at how small the effect size is when you have a large sample. If you have a sample of 3000 or so, then a difference of just 1-2 points (or .08 SD) will give you p < .001. Most molecular genetic studies have large sample sizes. Geneticists in this area have learned that they have to have large samples, because they are looking for small effects!
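The conversion underlying these numbers is straightforward: turn the two-tailed p-value into a critical value of the test statistic, then scale by the group sizes. The blog's calculations use an R function; the sketch below is a Python equivalent using the normal approximation to the t distribution, which is fine for the large samples typical of genetics but slightly understates d when groups are as small as 25.

```python
# Recover Cohen's d from a two-tailed p-value and equal group sizes,
# using the normal approximation to the t distribution (adequate for
# large samples; slightly understates d for small groups).
from statistics import NormalDist
from math import sqrt

def d_from_p(p, n_per_group):
    z = NormalDist().inv_cdf(1 - p / 2)  # critical value for two-tailed p
    return z * sqrt(2 / n_per_group)     # d = stat * sqrt(1/n1 + 1/n2)

for n in (25, 300, 3000):
    d = d_from_p(0.001, n)
    # Mean for the aa group if the aA/AA group is fixed at 100 (SD 15)
    print(f"n = {n:>5} per group: d = {d:.2f}, aa mean = {100 - 15 * d:.1f}")
```

With 3000 per group, p = .001 corresponds to d of about .08, i.e. a difference of just over one point on the scaled score, matching the pattern described above.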
It would be quite wrong to suggest that only large effect sizes are interesting. Small but replicable effects can be of great importance in helping us understand causes of disorders, because if we find relevant genes we can study their mode of action (Scerri & Schulte-Korne, 2010). But, as far as the illustrative data in Table 1 are concerned, few psychologists would regard the reading deficit associated at p of .001 or .0001 with genotype aa as of clinical significance, once the sample size exceeds 1000 per group.
Genotype aa may be a risk factor for dyslexia, but only in conjunction with other risks. On its own it doesn’t cause dyslexia.  And the notion, propagated by some commercial genetics testing companies, that you could use a single DNA variant with this magnitude of effect to predict a person’s risk of dyslexia, is highly misleading.

Further reading
Flint, J., Greenspan, R. J., & Kendler, K. S. (2010). How Genes Influence Behavior: Oxford University Press.
Scerri, T., & Schulte-Körne, G. (2010). Genetics of developmental dyslexia. European Child & Adolescent Psychiatry, 19(3), 179-197. DOI: 10.1007/s00787-009-0081-0

If you are interested in analysing twin data, you can see my blog on Twin Methods in OpenMx, which illustrates a structural equation modelling approach in R with simulated data.

Update 21/4/12: Thanks to Tom Scerri for pointing out my original wording talked of "two versions of an allele", which has now been corrected to "a genetic locus with two alleles"; as Tom noted: an allele is an allele, you can't have two versions of it.
Tom also noted that in the table, I had taken the aA/AA genotype as the reference group for standardisation to mean 100, SD 15. A more realistic simulation would take the whole population with all three genotypes as the reference group, in which case the effect size would result from the aA/AA group having a mean above 100, while the aa group would have mean below 100. This would entail that, relative to the grand population average, the averages for aa would be higher than shown here, so that the number with clinically significant deficits will be even smaller.
I hope in future to illustrate these points by computing effect sizes for published molecular genetic studies reporting links with cognitive phenotypes.

Thursday 12 April 2012

The ultimate email auto-response

[Image: Prague Humor. Photo credit: szeke]

Andy Field (who actually gave me a v. helpful response to a query via Twitter...) demonstrates how it should be done:

12 April 2012 06:34
This is an automatic reply.

I'm on study leave writing 'Discovering Statistics Using SPSS 4'. This essentially means that I'm locked in a mental Dungeon for the next 6 months in which intrusions from the outside world are like needles lancing my brain. They hurt, and hence I'm going to ignore them. If you really need to get hold of me then you should write a letter and insert it into the stale bread that they push through my cell door every morning. Or you can follow my demented ramblings (or 'progress' as some people call it) on Facebook and Twitter.

I will start to emerge back into reality (some might argue I was never in it) sometime in April 2012, at which point please resend your email if you still require a response.

Monday 9 April 2012

BBC's 'extensive coverage' of the NHS bill

Last month, there was a remarkable disconnect between what was being reported on BBC News outlets and what was concerning many members of the public on social media. The Health and Social Care Bill was passed by Parliament on 21st March, despite massive objections from many of those working in the NHS, and those members of the general public who were aware of the bill. Evidence of this concern was apparent from the fact that a petition with 486,000 signatures was presented to the Lords by Lord David Owen on 19th March, supporting his view that consideration of the Bill should be deferred until after the Risk Register had been published. There had also been a rally on 7th March attended by thousands of NHS workers. During the month of March, when there was still an opportunity of killing the bill if the Liberal Democrats had come out against it, there appeared to be very little coverage of it by the BBC. Only after the Bill had been passed did the BBC seem willing to run it as a news item.
There has been a fair bit of commentary on the lack of coverage, with some suggesting there may have been a deliberate conspiracy to keep quiet because of political pressures and/or vested interests of BBC executives in private health providers.
Like many people, I submitted a complaint to the BBC about their lack of coverage of this important topic. I received a prompt reply as follows:
Dear Dorothy 
Thank you for contacting us regarding BBC News coverage. We understand you believe BBC News did not sufficiently report on the opposition to the Health and Social Care Bill. BBC News has reported extensively on the opposition to the Health and Social Care Bill across our news programmes and bulletins since the Bill was originally proposed. We have reported on the health, political and business dimensions of the debate during our flagship news programmes and news bulletins and have heard from politicians, NHS workers, public sector workers and members of the public alike, as well as from supporters of the bill. There have been numerous protests and demonstrations held in opposition to the Government’s proposals. Such shows of opposition have been varied in size and were spread across the different stages of the bill’s formation. We believe we have accurately and fairly reflected the nature of this opposition in our news coverage. While you were unhappy about the level of coverage given to this, the political opposition to the Bill culminated in the House of Commons emergency debate on 20 March. Accordingly, the Commons debate featured heavily in our news coverage on the day and was the lead story during our main news bulletins. The Health and Social Care bill has been one of the biggest UK stories over the past few months and we believe we have afforded it the appropriate level coverage in a fair and impartial manner, allowing viewers and listeners to make up their own minds on the matter at hand
The phrase ‘extensive coverage’ did not reflect my impressions. I am not glued to the BBC, but I am a regular listener to the Radio 4 Today Programme and I was not aware of the NHS bill receiving any coverage at all. I therefore submitted a follow-up complaint asking if they could please give me details of specific programmes when the BBC had covered the NHS bill during the month of March. Again, they replied promptly and courteously. And here is what they said (with relevant sections from the websites in blue):
Dear Dr Bishop
We understand that you would like details of when opposition to the Health and Social Care Bill was covered by the BBC. Opposition to the bill has been covered on various programmes across the BBC, for example; Newsnight, The Daily Politics, Today and BBC News Online. Opposition was also covered on 'Newsnight' during a report on 9th March which looked at Liberal Democrat activist's plans to derail the bill and on the 13th March during a discussion on the future of the welfare system. A report on the 'The Daily Politics' broadcast on 13th March (at 14:09) highlighted opposition from Labour as well as the Royal College of GPs. Diane Abbott said health professionals were still opposed to the Health and Social Care Bill, which could be days away from becoming law. She said a future Labour government would overturn the act and "unpick the worst of the damage". Liberal Democrat spokesman Lord Clement-Jones said the bill was "going to get more acceptance"
More coverage was given on the 'Today' programme on the following dates:  
Sat 10th March, 0712-0714 Liberal Democrat activists will decide this morning whether Nick Clegg will face a vote on the health bill. The BBC's Robin Bryant explains why that could be bad news for the government. 
Weds 21st March, 0840-0853 The government's controversial plans to change the NHS have passed their final hurdle in Parliament after 14 months of opposition and changes in both houses. Professor Chris Ham, chief executive of the health think-tank The King's Fund, looks at what we left with now.
In addition examples of coverage on BBC News Online include: 

So, if I have understood this right, during March the Today Programme covered the story once, in an early two-minute slot, before the Bill was passed. Other items that morning included 4 minutes on a French theme park based on Napoleon, 6 minutes on international bagpipe day and 8 minutes on Jubilee celebrations.


Tuesday 3 April 2012

Phonics screening: sense and sensibility

There’s been a lot written about the new phonics test that is being introduced in UK schools in June. Michael Rosen cogently put the arguments against it on his blog this morning. A major concern is that the test involves asking children to read a list of items, and takes no account of whether they understand them. Indeed, the list includes nonwords (i.e. pronounceable letter strings, such as "doop" or "barg") as well as meaningful words. So children will be “barking at print” - a very different skill from reading for meaning.

I can absolutely see where Rosen is coming from, but he’s missing a key point. You can’t read for meaning if you can’t decode the words. It’s possible to learn some words by rote, even if you don’t know how letters and sounds go together, but in order to have a strategy for decoding novel words, you need the phonics skills. Sure, English is an irritatingly irregular language, so phonics doesn’t always give you the right answer, but without phonics, you have no strategy for approaching an unfamiliar word.
Back in 1990, Hoover and Gough wrote an influential paper called "The Simple View of Reading". This is clearly explained in this series of slides by Morag Stuart from the Institute of Education. It boils down to saying that in order to be an effective reader you need two things: the ability to decode words, and the ability to understand the language in a text. Some children can say the words but don't understand what they've read. These are the ones Michael Rosen is worried about. They won't be detected by a nonword reading test. They are all-too-often missed by teachers who don't realise they are having problems because when asked to read aloud, they do fine. There's a fair bit of research on these so-called "poor comprehenders", and how best to help them (some of which is reviewed here). But there are other children with the opposite pattern: good language understanding but difficulties in decoding: this corresponds to classic dyslexia. There are decades of research showing that one of the most effective ways of identifying these children is to assess their ability to read novel letter sequences that they haven't encountered before - nonwords. Nonword reading ability has also been shown to predict which children are at risk for later reading failure. It's useful precisely because it tests children's ability to attack unfamiliar material, rather than testing what they have already learned. It's a bit like a doctor giving someone a stress test on a treadmill. They may never encounter a treadmill in everyday life, but by observing how they cope with it, the doctor can tell whether they are at risk of cardiovascular problems.
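The Simple View can be summarised as: reading comprehension = decoding × language comprehension, so if either component is weak, reading for meaning suffers. The four resulting profiles can be sketched as below; the cut-off of 85 follows the scaled-score convention mentioned earlier, but the labels and thresholds here are purely illustrative, not diagnostic criteria:

```python
# Toy illustration of the Simple View of Reading (Hoover & Gough, 1990):
# effective reading requires BOTH decoding and language comprehension.
# Cut-off and labels are illustrative, not diagnostic criteria.

def reading_profile(decoding, comprehension, cutoff=85):
    """Classify a child's profile from two scaled scores (mean 100, SD 15)."""
    weak_decoding = decoding < cutoff
    weak_comprehension = comprehension < cutoff
    if weak_decoding and weak_comprehension:
        return "generally poor reader"
    if weak_decoding:
        return "dyslexic profile"      # understands language, cannot decode
    if weak_comprehension:
        return "poor comprehender"     # 'barks at print'
    return "typical reader"

print(reading_profile(decoding=75, comprehension=105))   # dyslexic profile
print(reading_profile(decoding=105, comprehension=75))   # poor comprehender
```

A nonword test picks up the second profile (weak decoding) but, as Rosen notes, says nothing about the third.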

Some children don’t need explicit teaching of phonics - they pick it up spontaneously through exposure to print. But others just don’t get it unless it is made explicit. I’m coming at this as someone who sees children who just don’t get past first base in learning to read, and who fall increasingly far behind if their difficulties aren’t identified. A nonword reading test around age 6 to 7 years will help identify those children who could benefit from extra support in the classroom.
So that’s the rationale, and it is well-grounded in a great deal of reading research. But is there a downside? Potentially, there are numerous risks. It would be catastrophic if teachers got the message from this exercise that reading instruction should involve training children to read lists of words, or worse still, nonwords. Unfortunately, testing in schools is increasingly conflated with evaluation of the school, and so teaching-to-the-test is routinely done. The language comprehension side of reading is hugely important, and shouldn't be neglected. Developing children’s oral language skills is an important component of making children literate. It is also important for children to be read to, and to learn that books are a source of pleasure.
Another concern is children being identified at an early age as failing. The cutoff that is used is crucial, and there are concerns that the bar may be set too high.  Children at real risk are those who bomb on nonword reading, not those who are just a bit below average.
The impact on children’s self-perception is also key. There is already evidence that some primary school children are unduly stressed by SATS. There’s nothing more likely to put a child off reading than being given a test that they don’t understand and being told they’ve failed it. When I was at school, we had the 11+ examination that divided children into those who went to grammar school and those who didn’t. I had friends whose parents promised them a bicycle if they passed - even though there was precious little practice that you could do for the 11+, which was designed to test skills that had not been explicitly taught. Schoolfriends who failed were left with a chip on their shoulder for years. I’d hope that this reading screen is introduced in a more sensitive manner, but the onus is on parents, teachers and the media to ensure this happens. This screening test should serve as a simple diagnostic that will allow teachers to identify those children whose weak letter-sound-knowledge means that they could benefit from extra support. It should not be used to evaluate schools, make children feel they are failures, worry their parents, or support a sterile phonics-only approach to reading.

Connor, M. J. (2003). Pupil stress and standard assessment tasks (SATs): An update. Emotional and Behavioural Difficulties, 8(2), 101-107. doi: 10.1080/13632750300507010
Hoover, W. A., & Gough, P. B. (1990). The simple view of reading. Reading and Writing, 2, 127-160.
Nation, K., & Angell, P. (2006). Learning to read and learning to comprehend. London Review of Education, 4(1), 77–87. doi: 10.1080/13603110600574538
Rack, J. P., Snowling, M. J., & Olson, R. K. (1992). The nonword reading deficit in developmental dyslexia. Reading Research Quarterly, 27, 29-53.
Snowling, M., & Hulme, C. (2012). Interventions for children's language and literacy difficulties. International Journal of Language & Communication Disorders, 47(1), 27-34. DOI: 10.1111/j.1460-6984.2011.00081.x
