Sunday, 11 August 2013

The arcuate fasciculus and word learning: a critique

The arcuate fasciculus is a white matter tract linking areas in the temporal lobe involved in interpreting speech with areas in the frontal lobe that control motor movements. Its role in language was established years ago when it was proposed that conduction aphasia, characterised by poor repetition despite good understanding and fluent spontaneous speech, was a disconnection syndrome resulting from lesions of the arcuate fasciculus.

Compared with apes and monkeys, humans have much stronger structural connections between temporal and frontal regions of the brain, suggesting that evolution of the arcuate fasciculus played a key role in language evolution.

Study of white matter tracts in the brain has advanced rapidly since the advent of diffusion tensor imaging (DTI). DTI makes it possible to measure parameters such as fractional anisotropy and radial diffusivity, indirect measures of myelination and/or axonal density within white matter.

Use of DTI has revealed an intriguing aspect of the arcuate fasciculus: it shows wide individual variation. In most people, the left arcuate fasciculus is larger than the right, but in some a more bilateral pattern is seen, and in others, a right arcuate fasciculus may not be visible on DTI. This immediately raises the question of whether this individual variation corresponds to functional differences in language ability. Two studies considered whether the degree of lateralisation of the arcuate fasciculus related to language level, but they obtained conflicting results. Lebel and Beaulieu (2009) found that laterality of the arcuate fasciculus, measured on diffusion tensor imaging, was modestly correlated (r = 0.32) with receptive vocabulary in 68 children, with the highest scores for those with strong left lateralization. However, a study of adults found no relation between left lateralization of the arcuate fasciculus and vocabulary; instead, higher verbal memory was found to be associated with weak lateralisation.

A couple of weeks ago, López-Barroso et al published a paper in the Proceedings of the National Academy of Sciences claiming that structural and functional  measures of the left arcuate fasciculus predicted word learning ability. The authors started with 27 young adults who had brain scans that yielded measures of structural and functional connectivity between temporal and frontal language areas of the brain. Twenty of these individuals also did a learning task while in the scanner. They heard a rapid sequence of novel words, each consisting of three syllables, and were asked to concentrate on them, as they would be asked to recognise them later. After this learning phase, they were presented with the same nonwords mixed in with other nonwords made from the same syllables in a different order, and were asked to make a left or right keypress to indicate if each item was familiar or not. Their responses were transformed into a measure called d-prime, which indicates how well the person discriminates between familiar and unfamiliar items.
Figure 1A from López-Barroso et al, showing the learning task 

From previous research, one might have expected to see an association between nonword learning and lateralisation of the arcuate fasciculus. This was not found, but accuracy in the nonword learning task was significantly correlated with structural and functional measures of strength of connectivity in the left hemisphere. The authors’ conclusion is given in the title of the paper: “Word learning is mediated by the left arcuate fasciculus”.

Given what we know about the arcuate fasciculus, this is a plausible finding, but how robust is the evidence? I think there are at least three problems with this study, which lead me to be cautious about accepting its claims.

First, there is the perennial problem of multiple comparisons. The authors considered three different DTI measures (number of streamlines, fractional anisotropy and radial diffusivity) for left and right sides of four tracts (arcuate long, arcuate anterior, arcuate posterior, and inferior fronto-occipital fasciculus). They used, however, a Bonferroni correction appropriate for 8 correlations (p = .0062) rather than for 24 correlations (.002).  None of the reported correlations is significant if the appropriate correction is used.

Second, the authors emphasised that the correlation between word learning and radial diffusivity was significant only for the direct arcuate tract in the left hemisphere. This, however, confuses difference in significance levels with significance of differences: as Nieuwenhuis et al (2011)  remarked: "when making a comparison between two effects, researchers should report the statistical significance of their difference rather than the difference between their significance levels". Table 1 shows the correlations of radial diffusivity with nonword learning for different regions, with 95% confidence intervals added, and it is clear that there is overlap between these. In other words, these correlations do not differ significantly from one another. See here for further discussion of these issues.
Table 1: Correlations (r) between nonword learning and radial diffusivity in different pathways, with 95% confidence intervals
In this study, the problem is compounded by the fact that different subsets of individuals are included in the correlations for different brain regions. It is not unusual to have to exclude participants from DTI studies because of measurement difficulties, but this does mean that when comparing one brain region with another one is not comparing like with like. And since statistical significance depends on sample size, if this varies from brain region to brain region, this further complicates interpretation. This is evident from Figures 2 and 3 of the López-Barroso et al paper; in both cases the absolute value of the correlation is .42, yet for radial diffusivity of the right posterior segment, this is dismissed as nonsignificant (with N = 19), whereas for the  fMRI analysis it is heralded as significant (with N = 25).

To establish what results would look like if the same subset of participants was used in all analyses, I requested the raw data for radial diffusivity from the first author, who kindly provided it. There were just 13 participants with DTI data for all brain regions: if analysis was restricted to them, then just one of the correlations with word learning was significant by the authors' criterion, that with the right posterior arcuate fasciculus (r = .73, p = .005). This analysis does not prove that this pathway is important: rather, it emphasises that a similar pattern of associations is seen in all pathways, and the study  is underpowered to detect reliable associations, particularly if the interest is in selective associations with one pathway and not another.

Perhaps of greatest concern, though, is the measure of ‘word learning’. For a start, this was not word learning in the usual sense, as the participants were not required to associate speech sounds with meanings. Instead, they had to recognise familiar strings of meaningless sounds. There is a serious oddity about the results. Measures of d-prime usually range from zero (no ability to discriminate familiar from unfamiliar items, i.e. chance performance) to 2 or 3 (highly significant ability to discriminate familiar from unfamiliar items). But in this study, five of the twenty participants obtained negative values of d-prime. A negative value means performance is below chance: i.e., the person was more likely to treat the unfamiliar items as familiar, and vice versa. This is frankly weird, and makes one wonder whether some participants simply got confused about which key corresponded to which response. The authors give a different explanation: “Negative values indicate discrimination is achieved but individuals segmented incorrectly, classifying nonwords as words of the artificial language.” I find this unconvincing, as it would only make sense if the distractor items were made by taking sequences from the original input that crossed word boundaries: this does not seem to have been the case. But even if it were the explanation, does it make sense to treat those who discriminate the nonwords, but segment them wrongly, as doing worse on word learning than those who don’t discriminate the nonwords at all?

Does this matter? I re-ran the correlations excluding four participants with a negative d-prime value of less than -0.42 (which as far as I can work out corresponds to below chance performance). The correlations no longer reached conventional levels of statistical significance, and the largest value was now for a right-sided pathway. This is pretty meaningless, however, because the sample size, already small, becomes so tiny that one cannot do an adequately powered test of the association. The best one can say is that ‘further data are needed’.

I hope the authors will look further at this issue, as the role of the arcuate fasciculus in language  learning is fascinating and potentially important. One possibility would be to look at the associations between vocabulary level and analogous connectivity measures in the sample of 50 adults reported by Catani et al (2007), where the same DTI methods were used.

After I had drafted this critique, I Googled to see if anyone else had blogged about this study. I didn’t find blogs, but I did find extensive media coverage. I was astonished to see that, in discussing implications of this study, one of the authors, Marco Catani, a respected expert in tractography, appeared to be channeling Susan Greenfield. He was quoted as claiming that children’s vocabularies will be restricted by their use of iPads. The newspapers have picked up on these quotes, coming out with headlines such as: “Experts say too much time is spent learning via tablets and computers. Children's vocabulary could be stunted because they listen to teachers and parents less.”  For further sensationalist and misleading accounts, see here and here.

Just to be clear, this was a study looking at structural and functional brain connectivity in relation to a task that involved extracting syllabic patterns from auditory input. It did not feature children, vocabulary learning or iPads.

It really does a disservice to families of children with language learning problems to come out with scaremongering claims about modern technology on the basis of no hard evidence. And, for the record, auditory input is not the only way to learn new words: reading provides an  increasingly important route for vocabulary learning as children grow older.

Reference López-Barroso D, Catani M, Ripollés P, Dell'acqua F, Rodríguez-Fornells A, & de Diego-Balaguer R (2013). Word learning is mediated by the left arcuate fasciculus. Proceedings of the National Academy of Sciences of the United States of America, 110 (32), 13168-73 PMID: 23884655

1 comment:

  1. I saw the newspaper reports of this paper and made a note to look it up this week, so I appreciate this 'primer'. I have nothing to say about the paper at this stage, but I do think that you raise a good point around our engagement with the media in work of this nature. Perhaps as a result of the impact agenda, "media coverage of one's research" is now a criterion for promotion up the professorial scale in my own institution, and I would be surprised if we were alone in this. In order to get our institutional name out there, we are encouraged to work with our media office to put out press releases of our findings, in the hope that the newspapers will bite. But anyone who has tried to work with a press office knows that unless you can make claims about e.g. this work leading to a "cure" for dyslexia, they aren't usually interested. Because of the incentive structure, you end up approving a press release that you are slightly uncomfortable with, and on the off chance the newspapers do pick it up, you can be sure that it will end up much more sensationalist then it originally was ...

    You talk a lot on this blog about perverse incentives these days, which harm the cause of doing good science. My point is that while outreach is definitely valuable (and I love your RALLI page, for example), I do think that some of the incentives around engagement with the media end up doing a great disservice to the general public, as you have shown in this case... and I do feel sorry for poor Marco Catani who must be deeply embarrassed about these headlines.