Proportions of children with different scores on the phonics screen in 2012 and 2013. Dotted lines show interpolated values.
Saturday 5 October 2013
Good and bad news on the phonics screen
Teaching children to read is a remarkably fraught topic. Last year the UK Government introduced a screening check to assess children’s ability to use phonics – i.e., to decode letters into sounds. Judging from the reaction in some quarters, they might as well have announced they were going to teach 6-year-olds calculus. The test, we were told, would confuse and upset children and would not tell teachers anything they did not already know. Some people implied that there was an agenda to teach children to read solely using meaningless materials. This, of course, is not the case. Nonwords are used in assessment precisely because you need to find out whether the child has the skills to attack an unfamiliar word by working out the sounds. Phonics has been ignored or rejected for many years by those who assumed that teaching phonics would doom the child to an educational approach built on boring drills with meaningless materials. That assumption is also wrong: for instance, Kevin Wheldall argues that teaching of phonics needs to be combined with training in vocabulary and comprehension, and that storybook reading with real texts should be a key component of reading instruction.
There is evidence for the effectiveness of phonics training from controlled trials, and I therefore regard it as a positive move that the government has endorsed the use of phonics in schools. However, they continue to meet resistance from many teachers, for a whole range of reasons. Some just don’t like phonics. Some don’t like testing children, especially when the outcome is a pass/fail classification. Many fear that the government will use results of a screening test to create league tables of schools, or to identify bad teachers. Others question the whole point of screening: this recent piece from the BBC website quotes Christine Blower, the head of the National Union of Teachers, as saying: “Children develop at different levels, the slow reader at five can easily be the good reader by the age of 11.” To anyone familiar with the literature on predictors of children’s reading, this shows startling levels of complacency and ignorance. We have known for years that you can predict with good accuracy which children are likely to be poor readers at 11 years from their reading ability at 6 (Butler et al., 1985).
When the results from last year's phonics screen came out I blogged about them, because they looked disturbingly dodgy, with a spike in the frequency distribution at the pass mark of 32. On Twitter, @SusanGodsland has pointed me to a report on the 2012 data where this spike was discussed. This noted that the spike in the distribution was not seen in a pilot study where the pass mark had not been known in advance. The spike was played down in this report, and attributed to “teachers accounting for potential misclassification in the check results, and using their teacher judgment to determine if children are indeed working at the expected standard.” It was further argued that the impact of the spike was small, and would lead to only around 4% misclassification.
However, a more detailed research report on the results was rather less mealy-mouthed about the spike and noted “the national distribution of scores suggests that pupils on the borderline may have been marked up to meet the expected standard.” The authors of that report did the best they could with the data and carried out two analyses to try to correct for the spike. In the first, they deleted points in the distribution where the linear pattern of increase in scores was disrupted, and instead interpolated the line. They concluded that this gave 54% rather than 58% of children passing the screen. The second approach, which they described as more statistically robust, was to take all the factors that they had measured that predicted scores on the phonics screen, ignoring cases with scores close to the spike, and then use these to predict the percentage passing the screen in the whole population. When this method was used, only 46% of children were estimated to have passed the screen when the spike was corrected for.
Well, this year’s results have just been published. The good news is that there is an impressive increase in the percentage of children passing from 2012 to 2013, up from 58% to 69%. This suggests that the emphasis on phonics is encouraging teachers to teach children about how letters and sounds go together.
But any positive reaction to this news is tinged with a sense of disappointment that once again we have a most peculiar distribution with a spike at the pass mark.
I applied the same correction as had been used for the 2012 data, i.e. interpolating the curve over the dodgy area. This suggested that the proportion of cases passing the screen was overestimated by about 6% for both 2012 and 2013. (The precise figure will depend on the exact way the interpolation is done.)
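For readers who want to try this kind of correction themselves, here is a minimal sketch of interpolating over a dip-and-spike and recomputing the pass rate. It is not the code behind the official report or the figure. The counts are made up, the score range of 0 to 40 and the choice of scores 29 to 33 as the "dodgy area" are assumptions for illustration, and the function names are hypothetical. Only the pass mark of 32 comes from the text.

```python
def interpolate_region(counts, first, last):
    """Return a copy of counts in which scores first..last are replaced by a
    straight line drawn between the neighbouring scores first-1 and last+1."""
    if first < 1 or last > len(counts) - 2 or first > last:
        raise ValueError("region must lie strictly inside the score range")
    left, right = counts[first - 1], counts[last + 1]
    span = last - first + 2  # number of steps from left anchor to right anchor
    corrected = list(counts)
    for score in range(first, last + 1):
        step = score - (first - 1)
        corrected[score] = left + (right - left) * step / span
    return corrected


def pass_rate(counts, pass_mark):
    """Proportion of children scoring at or above pass_mark."""
    return sum(counts[pass_mark:]) / sum(counts)


# Made-up counts for scores 0..40 (assumed range): a steady rise, then children
# moved by hand from scores 29-31 up to 32-33 to mimic a spike at the pass mark.
PASS_MARK = 32
smooth = [100 + 10 * s for s in range(41)]
observed = list(smooth)
for s, shift in [(29, 150), (30, 200), (31, 250)]:
    observed[s] -= shift
observed[32] += 400
observed[33] += 200

# Interpolate over the assumed dodgy area (scores 29 to 33) and compare.
corrected = interpolate_region(observed, 29, 33)
print(f"observed pass rate:  {pass_rate(observed, PASS_MARK):.3f}")
print(f"corrected pass rate: {pass_rate(corrected, PASS_MARK):.3f}")
```

With real data the answer shifts depending on which scores are treated as disrupted and on whether a straight line is a fair description of the distribution on either side of the pass mark, which is why the precise size of the correction should be treated with caution.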
Of course I recognise that any pass mark is arbitrary, and children’s performance may fluctuate and not always represent their true ability. The children who scored just below the pass mark may indeed not warrant extra help with reading, and one can see how a teacher may be tempted to nudge a score upward if that is their judgement. Nevertheless, teachers who do this are making it difficult to rely on the screen data and to detect whether there are any improvements year on year. And it undermines their professional status if they cannot be trusted to administer a simple reading test objectively.
It has been announced that the pass mark for the phonics screen won’t be disclosed in advance in 2014, which should reduce the tendency to nudge scores up. However, if the pass mark differs from previous years, then the tests won’t be comparable, so it seems likely that teachers will be able to guess it will remain at 32. Perhaps one solution would be to ask the teacher to make a rating of whether or not the test result agrees with their judgement of the child’s ability. If they have an opportunity to give their professional opinion, they may be less tempted to tweak test results. I await with interest the results from 2014!
Reference
Butler, Susan R., Marsh, Herbert W., Sheppard, Marlene J., & Sheppard, John L. (1985). Seven-year longitudinal study of the early prediction of reading achievement. Journal of Educational Psychology, 77, 349-361. DOI: 10.1037/0022-0663.77.3.349