Why English is just so darned difficult to decode - a short history...

The Ape
Mar 31, 2020
Updated: Oct 25, 2020

The eventual decline of the Roman empire resulted in a waning of the dominance of Latin across Europe as it retreated to Rome along with its armies, its legislators and its administrators. Local, tribal languages that had been submerged but not overwhelmed by the Roman tongue were free to re-emerge, expand and dilute the dominance of Latin (Crystal, 1995).

Their time in the sun was, nonetheless, short-lived. The retreat of a dominant, longstanding and stable military and administrative superpower created power vacuums across northern Europe. These vacuums were filled by new, aggressive, expansionist military powers, and with their successful military victories came their languages and a further diluting influence on regenerating local tongues.

Nowhere was this more evident than in Great Britain (Bragg, 2003).

The development of English, with its vast lexicon, its diverse cocktail of sounds derived from many languages, and hundreds of years of multifarious, random and unaccountable spellings, all shoehorned into an alphabetic code of just twenty-six symbols, has resulted in a multi-layered phonetic complexity that can appear chaotic and bewildering. Modern English has the most complex phonetic code of any alphabetic language (McGuinness, 1997). It takes a child, on average, three and a half years to learn to decode English compared to the four months it takes to learn to decode the almost entirely phonetic languages of Italian and Spanish (Dehaene, 2016). Italy reports considerably fewer cases of dyslexia, and Italians who struggle with reading are far advanced in decoding their language when compared to their English-speaking counterparts (Paulesu, 2001). This complexity is almost entirely a result of indiscriminate twists of historical fate.

So where did it all go wrong? If the Romans had remained in Britain, today we might well be speaking something close to the phonetically logical modern Italian. However, in the fifth century, with the northern tribes of the Scots and the Picts rampaging southwards, the Britons appealed to the Romans for protection. The Romans, who had deserted Britain to protect Rome from their own barbarian invasion, had no capacity or resources for assistance. The Celtic tribes then appealed to the nations of north-eastern Europe and received support from the Saxons, the Angles and the Jutes. These nations fulfilled their military brief, and then some. Liking what they found in fertile Britain, they decided to remain, and in the words of the Venerable Bede (2013, p46) they ‘…began to increase so much that they became terrible to the natives themselves that had invited them.’ The Celtic tribes once again found themselves dominated by a militarily superior and administratively adept interloper. The Britons fought back, but the imposition of Anglo-Saxon power was never in doubt as wave after wave of immigrants crossed the North Sea to join the fight. Over a period of a hundred years the Germanic tribes settled as far north as the highlands and either destroyed the Celtic tribes or pushed them back to the very edges of the island. The invaders now referred to the Celts as ‘foreigners’ (Crystal, 1995). The Celts referred to all of the invaders with the moniker ‘Saxons’. These Saxons brought with them a language that would subsume and dominate the reviving Celtic tongue. By the end of the fifth century the foundation was established for the emergence of the English language (Crystal, 1995).

The encoding of Anglo-Saxon had been achieved through a runic alphabet that the invaders brought with them from Europe (Crystal, 2012). This was perfectly serviceable until the arrival of Augustine and the Roman Christian missionaries, who required the recording of letters, administrative documents, prayers and homilies in this new language they encountered (Crystal, 2012). The runic alphabet, with its associations with mysticism, charms and pagan practices, was anathema to the purposes of these early Christian monks. They decided to use the Roman alphabet with its twenty-three letters, which had served the Old World more than adequately for many hundreds of years (Wolman, 2014). And here is where the problems begin.

Anglo-Saxon had at least thirty-seven phonemes. Encoding and decoding these sounds with twenty-three letters required either more letters, combinations of letters to represent single phonemes, or some letters representing more than one sound (Crystal, 2014). The early Christian monks were not linguists and they decided to utilise all of these strategies, adding four new letters to the Roman alphabet to accommodate the mismatch between sounds and symbols. Thus, two of the fundamental weaknesses and complexities of decoding modern English were embedded into Old English: some letters represented more than one sound and some sounds could be represented by more than one letter (McGuinness, 1999).

Nonetheless, although complex, the system was functional and by the ninth century, through the rise of the Kingdom of Wessex in southern England under Alfred the Great, had started to gain a standardised traction (Grant, 1986). Following Alfred’s defeat of the Danes, his power was assured. But Alfred was more than a great warrior. Stung by the loss of a linguistic heritage through the Vikings’ attempts to destroy the developing English language by book-burning, Alfred endeavoured to have the great written works translated into English (Smyth, 1995). Furthermore, he made a commitment to education in the language and of the language, writing that young freemen in England should be taught ‘how to read written English well’ (Bragg, 2011). As a result, the monks at Alfred’s power base in Winchester translated a multitude of religious and legal writings into the evolving language, simultaneously standardising much of the orthography and logography of Old English (Crystal, 2014).

By the time Harold Godwinson was crowned King in 1066, Anglo-Saxon English was well established across much of what is England today. It was a largely phonetic language that followed the generalised rule of sound to spelling correspondence. As a result, it was relatively simple to decode and encode. Had modern English evolved directly from it, children today would find learning to read English a far simpler and faster proposition. However, another twist of historical linguistic fate awaited the language.

The Normans invaded.

Perhaps if Harold had not had to defeat the Vikings at the battle of Stamford Bridge and march two hundred and fifty miles south in two weeks, his army would have had the energy to repel the Normans (Wolman, 2014), and we would today teach our children to decode a phonetic language in the four months it takes the Finns to teach their children (Dehaene, 2006). However, Harold was defeated, William I was crowned king, French became the official language of England and the seeds of linguistic complexity were sown, thus ensuring it would take English-speaking children in the twenty-first century three and a half years to learn to decode the most complicated alphabetic language on earth (Dehaene, 2006).

Had French subsumed and subjugated Old English in the manner that the Normans overwhelmed the Saxons, then French would have become the universally spoken tongue and much phonetic complexity would have been avoided. But the Norman conquerors were happy to keep State, Church and people separate and, in the centuries following the Norman conquest, England became a trilingual country (Crystal, 2012). French was the language of law and administration, Latin the language of the Church and English the language of the street and the tavern (Bragg, 2003). As a result, Old English, the language of the oppressed, started to evolve and shift as it absorbed and mutated thousands of French words and their spellings. The result was a subverted but enriched language that became almost unrecognisable from Old English (Wolman, 2014); a language that nonetheless remained the language of the oppressed. With such low social status, the language stood little chance of survival.

Upward social mobility for English came in the form of a flea.

In 1350 almost a third of the population of England was wiped out by the Black Death (Gottfried, 2010). Areas of high population density, towns and church communities suffered particularly, with the result that the upper echelons of society and church were decimated. The majority of survivors were those living in isolation in rural settings: the dregs of society (Bragg, 2003); and they spoke English. This had a two-fold influence on the integration of English into the ruling classes. Firstly, the Norman masters began taking English-speaking wives whose children learned both English and French. Secondly, the English-speaking peripheral class started to buy and farm the cheap land that flooded the market as a result of the mass mortality (Bragg, 2003). As a consequence, English first leaked into the ruling class and then flooded into the upper strata of society. By the late fourteenth century Richard II was conducting state affairs in English, Chaucer had written the bestselling Canterbury Tales in English and, in the 1380s, John Wycliffe and his followers had translated the Bible into the language. English was no longer the vulgar language of the underclass. It was here to stay.

The language gained further traction through its adoption and use by the Chancery court at Westminster (Wolman, 2014), which needed to communicate the formal orders of the King’s court in the language of those for whom they were intended: English. This more formal utilisation required a greater standardisation of the language and a more consistent orthography, both to ensure the documentation was imbued with sufficient regal authority and for ease of replication. That meant that a far greater regularity in the spelling of words was required; ‘receive’ had forty alternative iterations (Scragg, 2011). This was the start of formalisation of the encoding of this mongrel of a language and the essence of its modern complex code. These decisions were not, however, being made by linguists with an understanding of the etymology of the words they were standardising, but by overworked scribes seeking quick solutions often based on personal whim and anecdotal evidence (Crystal, 2014). The amount of documentation available for reading was also minimal as it was all hand-written, and, as a result, very few members of society had any cause to read. Whilst this was the case, standardisation of the encoding of Middle English was within the control of a few influential scribes in Westminster. They were not to hold the power for long.

Had the standardisation of English spelling remained in the hands of these few scribes then a regularity might have evolved rapidly, but when William Caxton imported Gutenberg’s printing technology to London in the late fifteenth century the Middle English alphabetic code was thrown into chaos (Barber, Beal and Shaw, 2013).

Writing could now be reproduced on an industrial scale by the setting of type and its printing on to paper. Thus, decisions as to spelling and orthography fell to the typesetters. Their focus was not on accuracy but on speed, economy and the aesthetic of the page (Wolman, 2014). Readers preferred justified type, which meant typesetters often omitted letters at the end of a line to ensure visual appeal (Wolman, 2014). Furthermore, the vast majority of the early typesetters were Flemish immigrants who had English as a second or third language. With pronunciation of words varying across counties and districts, and setters working at speed with tiny metal type, the same word could be spelled differently on the same page (Barber, Beal and Shaw, 2013). Caxton, as the forerunner, became the arbiter of standardised spelling, but he was not a linguist and took a ‘best fit’ approach to the crystallisation of the orthography. He did, nonetheless, contribute greatly to the initial calibration of English spelling (Bragg, 2003).

As books became more widely available and read among a growing literate class, printing presses sprang up in all major towns and English spelling became more and more erratic, further aggravated by the multiplicity of regional accents and thus pronunciations. This was exacerbated by the Renaissance and the view that Latin and French etymology, with their richer literary traditions and history, were superior to Middle English. Thus, the Latin roots of some words were preferred over their Saxon derivations. Old English differentiated long and short vowel sounds through double consonants preceding the vowel, but Latin roots (with single letter to sound correspondence) did not require the differentiation (Crystal, 2014). Numerous scribes and printers thus eschewed the parvenu double consonant in preference to the ‘superior’ single consonant. The need to untangle the English alphabetic code became more and more urgent.

Standardisation of spelling and pronunciation was attempted but with little success. In 1569 London lawyer John Hart published his orthography in an attempt to establish a consistency in spelling. Privileging pronunciation, he attempted to classify ‘proper speech’, but his attempts gained little traction as ‘proper’ English was not the English spoken by many in the country and was hence ignored. Without authority, standardisation could never tame the shape-shifting beast that was the English language. The power was in the hands of the printers and they were beholden to their owners and their readers. As such, the sixteenth- and seventeenth-century language shapers had no institutional authority, unlike the French and Italians who had academies to authorise and solidify spelling and pronunciation. John Dryden’s attempts to establish consistency through the creation of an English academy failed and by the eighteenth century the English language was, in the words of Samuel Johnson (2016, p134), ‘in a state of anarchy’. It was the work of this remarkable scholar that set the English language on the road to alphabetic consistency, tethered the language with a hawser of standardisation and established the regularity required for effective instruction. In terms of today’s pedagogy relating to the teaching of the reading and writing of English it was a Copernican moment.

Johnson’s stratagem had three strands: to standardise spelling, to standardise pronunciation and to provide categorical definitions of words. Untangling the complexities of the phonic coding that had erupted over the centuries and simplifying it to ensure the language could be easily deciphered and taught was not a priority. It was not even a consideration. When Johnson was uncertain of a standardised spelling, he always reverted to the original derivation and etymological root of the word rather than to its simplest phonic form. Nonetheless, this was at least a consistent approach. In eight years, Johnson produced a dictionary containing 43,000 words, an undertaking that took a cadre of scholars at the French academy eighty years to achieve with their language. English today, spoken, read and written across the world, has drifted remarkably little since then. The English phonic code, although excruciatingly complex, convoluted and complicated, had at last been anchored. It would take many more years before it was identified, acknowledged, unravelled, stratified and presented in a format that could ensure that it could be effectively taught.

The Reading Ape

Nullius in verba
