First there was the word and the word became God...but what about the letters?

The word method of teaching reading dominated instruction for a century, with its word lists, flash cards and repeated vocabulary, culminating in the seemingly rational mutation into Goodman's (1971) whole language approach. It was only the glaring, incontrovertible fact that the millions of children taught by this method were unable to read that finally brought the edifice crashing down: 60% of Californian nine- and ten-year-olds taught by this method over seven years were unable to gain even a superficial understanding of their books (Turner and Burkard, 1996), and 20% of New Zealand six-year-olds exposed to the approach required additional intervention (Chapman, Tunmer and Prochnow, 2001). But why was the word method so resilient, and why does it still maintain traction today despite the batteries of research papers and meta-analyses that contradict its fundamental assumption: that the word, not the letter, is the fundamental unit of reading instruction?

The confusion hinges on experiments carried out in the late nineteenth century by James Cattell (1886), an American researcher working in Germany, using tachistoscopic techniques to measure eye fixation times on letters and words (Rayner, 1989). What Cattell discovered was that in ten milliseconds a reader could apprehend equally well three or four unrelated letters, two unrelated words, or a short sentence of four words - approximately twenty-four letters (Anderson and Dearborn, 1952). What is remarkable is that Cattell's measurements were extremely accurate: when Reicher (1969) retested them, the results were found to be statistically reliable.

The generalisation drawn from this outcome was that words are more memorable than letters, which led to the deduction that we do not read words by the serial decoding of individual letters: we must, therefore, read whole words. If that were true, it left phonics, with its teaching of serial phoneme decoding and blending to identify words, as the uninvited interloper; the side-salad that nobody had ordered.

The assumptions derived from Cattell’s investigations seem perfectly rational and the teaching of reading through word recognition a logical conclusion. There were, nonetheless, two critical flaws.

Firstly, by focusing on the whole word, the pedagogy ignored the fundamental constituents of words: the letters of the alphabet. Alphabetic writing encodes language by associating letters and combinations of letters with sounds, which the reader then decodes either orally or silently. Words are not visual templates, and they cannot be recognised by shape or logographic quality (Dehaene, 2014). If they could, argues Rayner (1989), then changes in typeface, font, handwriting or case would catastrophically undermine the recognition of a word, and text written in alternating case would be undecipherable, which is not the case (Smith, Lott and Cronnell, 1969). Rayner's (1989) postulation is further validated by McCandliss, Curran and Posner's (1993) work on artificial alphabets and their long-term efficacy in comparison with word-shape deciphering. In addition, the difficulty of learning languages heavily dependent on symbol recognition indicates the inefficiency of word-shape recognition: Chinese children are expected to recognise only 3,500 characters by the age of eleven (Leong, 1973), and it takes twelve years of study to learn the 2,000 logographs of Japanese kanji (Rasinski, 2010).

But if we do not read words by the serial processing of individual letters, and the brain lacks the capacity to remember sufficient words by shape, how can we read words faster than letters? Something more complex is clearly creating this word superiority effect (Reicher, 1969).

Automaticity of word recognition is achieved by efficient orthographic processing (Wagner and Barker, 1994). This processing is defined as the ability to form, store and access orthographic representations (Stanovich and West, 1989) allied to a reader’s knowledge about permissible letter patterns and word-specific knowledge of real word spellings (Ehri, 1980).

Thus, automatic word recognition relies not on word shape but on the recoding of letter patterns, which leads to the conclusion that orthographic processing is contingent on phonological processing (Ehri, 1997). Reading fluency is thus dependent on the acquisition of word-specific orthographic representations (Perfetti, 2007), linked to phonological, syntactic, semantic and morphological information. Once orthographic processing is viable for a reader (as a result of numerous fixations - practice, in other words), the word superiority effect (Reicher, 1969) becomes evident and the counter-intuitive phenomenon of the word being perceived before its component letters is manifest (Ehrlich and Rayner, 1989). This may seem absurd, but consider the misspelled word 'diffrence', which may be read as 'difference' before the misspelling is identified (Rayner, 1989).

But if the word is processed in advance of its individual letters, and syntax, semantic information and context play a role in that processing, then reading instruction should surely privilege those elements, and the whole language, balanced literacy and multi-cuing methods of teaching reading should be more efficacious.

This was the second critical flaw in the assumptions derived from Cattell's (1886) experiments. The word superiority effect is only evident once a reader has attained automaticity. Prior to this, an emergent reader can only decode using phonological processing (Stanovich, 1990). Any initial reading instruction that does not enable a reader to decode sounds in sequence, and does not encourage continued practice of decoding, will require the pupil either to decipher the code by themselves or to achieve the pseudo-fluency of logographic (word-shape) processing. Teaching automaticity before decoding mastery puts the cart very much before the horse.