Orthography
The orthography of a language specifies a standardized way of using a specific writing system (script) to write the language. Where more than one writing system is used for a language, for example Kurdish, Uyghur, Serbian, or Inuktitut, there can be more than one orthography. Orthography is distinct from typography.
Etymology
Orthography in English comes from orthographie (French, 13c.), from Latin orthographia, from Greek ὀρθός orthós, "correct", and γράφειν gráphein, "to write".
Overview
Orthography generally refers to spelling; that is, the relationship between phonemes and graphemes in a language. Sometimes spelling is considered only part of orthography, with other elements including hyphenation, capitalization, word breaks, emphasis, and punctuation. Orthography thus describes or defines the set of symbols (graphemes and diacritics) used in a language, and the rules about how to write these symbols.
Most natural languages developed as oral languages, and writing systems have usually been crafted or adapted afterwards as representations of the spoken language. In an etic sense, the rules for writing systems are arbitrary: any set of rules could be considered "correct" if the users of the language mutually agreed on that set of rules as the standard way to represent the spoken language. However, as standardization takes stronger hold, an emic epistemology of "right and wrong" develops, in which compliance with, or violation of, the standards is viewed as right or wrong in a way analogous to moral right and wrong, and in which each word has a written identity that is no less standardized than its oral-aural identity, which is emically unitary. The term orthography is sometimes used in a linguistic sense to refer to any method of writing a language, without judgment as to right and wrong, with a scientific understanding that orthographic standardization exists on a spectrum of strength of convention. But the original sense of the word stem, which evolved long before linguistic science, implies a dichotomy of correct and incorrect, and the word stem is still most often used to refer not just to a way of writing a language but more specifically to the thoroughly standardized (emically "correct") way of writing it.
Efficiency
An orthography may be described as "efficient" if it has one grapheme per phoneme (distinctive speech sound) and vice versa. An orthography may also have varying degrees of efficiency for reading or writing. For example, diverse letter, digraph, and diacritic shapes contribute to diverse word shapes, which aid fluent reading, while heavy use of apostrophes or diacritics makes writing slow, and the use of symbols not found on standard keyboards makes computer or cell phone input awkward.
Phonemic orthography
A phonemic orthography is an orthography that has a dedicated symbol or sequence of symbols for each phoneme (distinctive speech sound) and vice versa; that is, the mapping between graphemes and phonemes is a bijection (a one-to-one correspondence in both directions). Russian, Spanish and Italian are close to being phonemic, and English is among the least phonemic.
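The one-to-one requirement described above can be checked mechanically. The following is a minimal sketch, assuming an orthography is given as a list of (grapheme, phoneme) pairs; the function name `is_bijective` and both toy inventories are hypothetical and are not taken from any real language description.

```python
from collections import defaultdict


def is_bijective(pairs):
    """Return True if each grapheme has exactly one phoneme and vice versa."""
    grapheme_to_phonemes = defaultdict(set)
    phoneme_to_graphemes = defaultdict(set)
    for grapheme, phoneme in pairs:
        grapheme_to_phonemes[grapheme].add(phoneme)
        phoneme_to_graphemes[phoneme].add(grapheme)
    one_phoneme_each = all(len(p) == 1 for p in grapheme_to_phonemes.values())
    one_grapheme_each = all(len(g) == 1 for g in phoneme_to_graphemes.values())
    return one_phoneme_each and one_grapheme_each


# Hypothetical toy inventories, for illustration only.
toy_phonemic = [("p", "/p/"), ("a", "/a/"), ("t", "/t/")]
# Two spellings for one sound, and one letter with two sounds.
toy_non_phonemic = [("j", "/dʒ/"), ("g", "/dʒ/"), ("g", "/g/")]

print(is_bijective(toy_phonemic))
print(is_bijective(toy_non_phonemic))
```

A real orthography would need context-sensitive rules rather than a flat list of pairs, so this only illustrates the definition, not a practical test of how phonemic a language's spelling is.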
Morpho-phonemic orthography
A morpho-phonemic orthography considers not only what is phonemic, as above, but also the underlying structure of the words. For example, in English, /s/ and /z/ are distinct phonemes, so in a phonemic orthography the plurals of cat and dog would be cats and dogz. However, English orthography recognizes that the /s/ sound in cats and the /z/ sound in dogs are the same element (archiphoneme), automatically pronounced differently depending on its environment, and therefore writes them the same despite their differing pronunciation.
German and Russian are morpho-phonemic in this sense, whereas Turkish is purely phonemic.
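The cats/dogs example can be sketched in a few lines: the written plural suffix stays the same while its pronunciation depends on the preceding sound. This is a simplified illustration, not a full model of English plurals. The names `VOICELESS` and `plural` are hypothetical, the set of voiceless sounds is an assumed short list, and stems ending in sibilants (as in buses) are deliberately ignored.

```python
# Assumed, simplified set of voiceless stem-final sounds (illustration only).
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural(spelling, final_sound):
    """Return (written plural, pronounced suffix) for a regular English noun.

    The written suffix is always "s" (morpho-phonemic spelling), while the
    spoken suffix is /s/ after a voiceless sound and /z/ otherwise.
    Sibilant-final stems are not handled in this sketch.
    """
    spoken_suffix = "s" if final_sound in VOICELESS else "z"
    return spelling + "s", spoken_suffix


print(plural("cat", "t"))
print(plural("dog", "g"))
```

A purely phonemic spelling would instead append the spoken suffix to the written form, giving the hypothetical "dogz" mentioned above.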
Korean hangul has changed over the centuries from a highly phonemic to a largely morpho-phonemic orthography, and there are moves in Turkey to make the Turkish script more morpho-phonemic as well. Japanese kana are almost completely phonemic, but have a few morpho-phonemic aspects, notably in the use of ぢ di and づ du (rather than じ ji and ず zu, which is how they are pronounced) when the character is a voicing of an underlying ち or つ (see rendaku).
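The kana case can be illustrated with Unicode normalization, since ぢ and づ are encoded as ち and つ plus a combining voicing mark (U+3099). The sketch below is an assumption-laden illustration of the spelling principle only: the helper name `voice_kana` is hypothetical, and the code does not decide when rendaku applies, it only shows that the voiced spelling keeps the underlying kana visible.

```python
import unicodedata

# U+3099 is the combining voiced sound mark (dakuten).
COMBINING_DAKUTEN = "\u3099"


def voice_kana(kana):
    """Return the voiced form of a single kana by composing it with dakuten."""
    return unicodedata.normalize("NFC", kana + COMBINING_DAKUTEN)


# はな (nose) + ち (blood): the voiced second element is spelled with ぢ,
# preserving the underlying ち, although it is pronounced like じ.
compound = "はな" + voice_kana("ち")
print(compound)
print(voice_kana("つ"))
```

Whether a given compound actually undergoes voicing is a lexical and phonological question that this helper does not attempt to answer.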
Another group of languages that exhibits a high rate of morpho-phonemic change is the Austronesian languages. This often causes problems for foreigners who are trying to learn Philippine languages such as Tagalog, Cebuano, Ilocano, and others. Learners of Bahasa Melayu and Bahasa Indonesia face the same problem.
Orthographic depth
A "deep" orthography is one in which there is not a one-to-one correspondence between the letters and the phonemes in the language, such as that of English. Most languages of western Europe (which are written with the Latin alphabet), as well as the modern Greek language to a lesser extent (written with the Greek alphabet), have deep orthographies. In some of these, there are sounds with more than one possible spelling, usually for etymological or morpho-phonemic reasons (like /dʒ/ in English, which can be written with ⟨j⟩, ⟨g⟩, ⟨dg⟩, ⟨dge⟩, or ⟨ge⟩). In other cases, there are not enough letters in the alphabet to represent all phonemes. The remaining ones must then be represented by using such devices as diacritics, digraphs that reuse letters with different values (like ⟨th⟩ in English, whose sound value is normally not /t/ + /h/), or simply inferred from the context (for example the short vowels in abjads like the Arabic and Hebrew alphabets, which are normally left unwritten). The syllabary systems of Japanese (hiragana and katakana) are examples of almost perfectly shallow orthography; exceptions include the use of ぢ and づ (discussed above) and the use of は, を, and へ to represent the sounds わ, お, and え, as relics of historical kana usage.
Another term to describe this characteristic is "defective orthography". This term, however, clearly implies the superiority of shallow orthographies, a point that advocates of morphophonemic writing would dispute. Using the terms "deep" and "shallow" is therefore more neutral in relation to the question of what types of orthography are superior.
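One rough way to make "depth" concrete is to count how many spellings each phoneme has, as in the /dʒ/ example earlier in this section. The sketch below is only an illustration under stated assumptions: `spellings_per_phoneme` is a hypothetical helper, the average it computes is not an established measure of orthographic depth, and it looks at one direction of the mapping (sound to spelling) only.

```python
def spellings_per_phoneme(inventory):
    """Average number of distinct spellings per phoneme.

    `inventory` maps each phoneme to a list of spellings. A value of 1.0
    means every listed phoneme has exactly one spelling.
    """
    if not inventory:
        raise ValueError("inventory must not be empty")
    total = sum(len(set(spellings)) for spellings in inventory.values())
    return total / len(inventory)


# The English /dʒ/ spellings listed in the text above.
english_sample = {"dʒ": ["j", "g", "dg", "dge", "ge"]}
# Hypothetical shallow inventory, for contrast.
toy_shallow = {"p": ["p"], "a": ["a"]}

print(spellings_per_phoneme(english_sample))
print(spellings_per_phoneme(toy_shallow))
```

A fuller comparison would also count how many sounds each spelling can represent and would weight entries by how often they occur in running text.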
Complex orthography
Complex orthographies often combine different types of scripts and/or use many complex punctuation rules. Some widely accepted examples of languages with complex orthographies include Thai, Chinese, Japanese, and Khmer.
See also
- English orthography
- Writing systems
- Catherine McBride-Chang, researcher in the area of cross-cultural orthographic development
- Cursive
- Graphology
- Keyboard layout
- Lateral masking
- Leet
- Palaeography
- Penmanship
- Prescription and description
- Romanization
- Typography
- Writing
External links
- The CODE and the Challenge of Learning to Read It
- Videos: The History and Impact of Writing in the West
- Omniglot – writing systems & languages of the world – a privately run orthography website
- Phonemic awareness page of the CTER wiki
- lonestar.texas.net/~jebbo/learn-as/ orthography of Old English