Khmer script
The Khmer script is an alphasyllabary (abugida) used to write the Khmer language, the official language of Cambodia. It is also used to write Pali in the Buddhist liturgy of Cambodia and Thailand.
It was adapted from the Pallava script, a variant of Grantha descended from the Brahmi script of India. The oldest dated inscription in Khmer was found at Angkor Borei in Takéo Province, south of Phnom Penh, and dates from 611 AD. The modern Khmer script differs somewhat from the earlier forms seen in the inscriptions at the ruins of Angkor.
Orthography
Khmer is written from left to right, and characters can stack on multiple levels. Originally there were 35 consonants, but only 33 are in use in modern Khmer. The vowel system consists of independent vowels and dependent vowels. The dependent vowels have two registers of pronunciation, which compensates for the fact that the spoken language has more vowel phonemes than the script has vowel graphemes. Khmer also uses diacritics that further modify the pronunciation of words.
Styles
Several styles of Khmer writing are used for varying purposes. The two main styles are âksâr chriĕng (lit. slanted script) and âksâr mul (lit. round script).
- Âksâr chriĕng refers to oblique letters. Entire bodies of text, such as novels and other publications, may be produced in âksâr chriĕng. Unlike in written English, oblique lettering does not mark any distinction such as emphasis or quotation. Handwritten Khmer is often written in the oblique style.
- Âksâr chhôr refers to upright or 'standing' letters, as opposed to oblique letters. Most modern Khmer typefaces are designed this way rather than obliquely, since text can be italicized through word processors and other applications to represent the oblique manner of âksâr chriĕng.
- Âksâr khâm is a style used in Pali palm-leaf manuscripts. It is characterized by sharper serifs and angles and by the retention of some antique characteristics, notably in the consonant kâ. This style is also used for yantra tattoos and for yantras on cloth, paper, or engraved brass plates, in Cambodia as well as in Thailand. See also Khom script.
- Âksâr mul is a calligraphic style similar to âksâr khâm in that it also retains some characters reminiscent of antique Khmer script. Its name in Khmer, lit. 'round script', refers to the bold and thick lettering style. It is used for titles and headings in Cambodian documents, books, and currency, and on shop signs and banners. It is sometimes used to emphasize royal names or other important nouns within text set in a different style.
Consonants
There are 35 Khmer consonant symbols, although modern Khmer uses only 33, two having become obsolete. Each consonant has an inherent vowel of /ɑ/ or /ɔ/. These inherent vowels determine which of the two registers of vowel phonemes a dependent (diacritical) vowel takes.
The consonants have subscript forms that are used to write consonant clusters. Also sometimes referred to as "sub-consonants", subscript consonants resemble the corresponding consonant symbols but in a smaller form. In Khmer they are known as cheung âksâr, meaning the foot of a letter. Most subscript consonants are written directly below other consonants, although the subscript form of rô (រ) is written before the base consonant, while a few others have ascending elements which appear after it. Subscript consonants were previously used to write final consonants. This practice has ceased in modern written Khmer but is retained in the word ឲ្យ.
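Consonant | Subscript form | UN romanization | IPA |
---|---|---|---|
ក | ្ក | kâ | kɑ |
ខ | ្ខ | khâ | kʰɑ |
គ | ្គ | kô | kɔ |
ឃ | ្ឃ | khô | kʰɔ |
ង | ្ង | ngô | ŋɔ |
ច | ្ច | châ | cɑ |
ឆ | ្ឆ | chhâ | cʰɑ |
ជ | ្ជ | chô | cɔ |
ឈ | ្ឈ | chhô | cʰɔ |
ញ | ្ញ | nhô | ɲɔ |
ដ | ្ដ | dâ | ɗɑ |
ឋ | ្ឋ | thâ | tʰɑ |
ឌ | ្ឌ | dô | ɗɔ |
ឍ | ្ឍ | thô | tʰɔ |
ណ | ្ណ | nâ | nɑ |
ត | ្ត | tâ | tɑ |
ថ | ្ថ | thâ | tʰɑ |
ទ | ្ទ | tô | tɔ |
ធ | ្ធ | thô | tʰɔ |
ន | ្ន | nô | nɔ |
ប | ្ប | bâ | ɓɑ |
ផ | ្ផ | phâ | pʰɑ |
ព | ្ព | pô | pɔ |
ភ | ្ភ | phô | pʰɔ |
ម | ្ម | mô | mɔ |
យ | ្យ | yô | jɔ |
រ | ្រ | rô | rɔ |
ល | ្ល | lô | lɔ |
វ | ្វ | vô | ʋɔ |
ឝ | ្ឝ | shâ | - |
ឞ | ្ឞ | ssô | - |
ស | ្ស | sâ | sɑ |
ហ | ្ហ | hâ | hɑ |
ឡ | * | lâ | lɑ |
អ | ្អ | 'â | ʔɑ |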
* The subscript form of the consonant lâ (ឡ) is included in Unicode, although it is essentially unused in modern Khmer.
For some phonemes in loanwords, the Khmer writing system has created supplementary consonants. Most of these are digraphs formed by stacking a subscript under the character for /hɑ/. The consonant for /pɑ/, however, is formed by placing the diacritic muusikatoan (៉, "mouse teeth") over the consonant for /ɓɑ/. These additional consonants are mainly used to represent sounds in French and Thai loanwords.
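Digraph consonant | UN romanization | IPA |
---|---|---|
ហ្គ | gâ | ɡɑ |
ហ្គ៊ | gô | ɡɔ |
ហ្ន | nâ | nɑ |
ប៉ | pâ | pɑ |
ហ្ម | mâ | mɑ |
ហ្ល | lâ | lɑ |
ហ្វ | fâ, wâ | fɑ, wɑ |
ហ្វ៊ | fô, wô | fɔ, wɔ |
ហ្ស | žâ, zâ | ʒɑ, zɑ |
ហ្ស៊ | žô, zô | ʒɔ, zɔ |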
Dependent vowels
The Khmer script uses dependent vowels, or diacritical vowels, to modify the inherent vowels of consonants. Dependent vowels are known in Khmer as srăk nissăy or srăk phsâm. In the orthography, a dependent vowel must always be combined with a consonant. Most of the vowel symbols have two sounds (registers). Which sound is used depends on the series (the inherent vowel) of the dominant consonant in a syllable cluster.
Dependent vowel | UN romanization (a-series) | UN romanization (o-series) | IPA (a-series) | IPA (o-series) |
---|---|---|---|---|
អា | | | aː | iːə |
អិ | | | e | ɨ |
អី | | | əj | iː |
អឹ | | | ə | ɨ |
អឺ | | | əːɨ | ɨː |
អុ | | | o | u |
អូ | | | oːu | uː |
អួ | | | uːə | uːə |
អើ | | | aːə | əː |
អឿ | | | ɨːə | ɨːə |
អៀ | | | iːə | iːə |
អេ | | | eːi | eː |
អែ | | | aːe | ɛː |
អៃ | | | aj | ɨj |
អោ | | | aːo | oː |
អៅ | | | aw | ɨw |
Vowel with diacritic | UN romanization (a-series) | UN romanization (o-series) | IPA (a-series) | IPA (o-series) |
---|---|---|---|---|
អុំ | | | om | um |
អំ | | | ɑm | um |
អាំ | | | am | oəm |
អាំង | | | aŋ | eəŋ |
អះ | | | aʰ | eəʰ |
អុះ | | | oʰ | uʰ |
អេះ | | | eiʰ | eʰ |
អោះ | | | ɑʰ | ʊəʰ |
អៈ | | | aʔ | eəʔ |
For technical reasons, the dependent vowels are shown here paired with the letter អ (KHMER LETTER QA in Unicode), as not all browsers display them correctly by themselves.
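The register rule described above can be sketched as a pair of lookups. The following is a minimal Python illustration, not a complete phonological model: it covers only a handful of consonants and vowels taken from the tables in this article, and the names `CONSONANT_SERIES`, `VOWEL_IPA`, and `vowel_sound` are hypothetical.

```python
# Series (inherent vowel class) for a few consonants; illustrative subset only.
CONSONANT_SERIES = {
    "\u1780": "a",  # ក  /kɑ/
    "\u1782": "o",  # គ  /kɔ/
    "\u1785": "a",  # ច  /cɑ/
    "\u1787": "o",  # ជ  /cɔ/
}

# (a-series IPA, o-series IPA) for a few dependent vowels; illustrative subset only.
VOWEL_IPA = {
    "\u17B6": ("aː", "iːə"),  # KHMER VOWEL SIGN AA
    "\u17B8": ("əj", "iː"),   # KHMER VOWEL SIGN II
    "\u17BC": ("oːu", "uː"),  # KHMER VOWEL SIGN UU
}


def vowel_sound(consonant: str, vowel: str) -> str:
    """Return the IPA value of a dependent vowel after the given consonant."""
    series = CONSONANT_SERIES[consonant]
    a_value, o_value = VOWEL_IPA[vowel]
    return a_value if series == "a" else o_value
```

For example, `vowel_sound("\u1780", "\u17B6")` selects the a-series reading, while the same vowel after the o-series consonant U+1782 selects the o-series reading. Real text also needs the series-shifting diacritics and cluster rules, which this sketch leaves out.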
Independent vowels
Independent vowels are non-diacritical characters used to represent vowel phonemes occurring at the beginning of syllables. In Khmer they are called srăk pénh tuŏ, which means complete vowels.
Independent vowel | UN romanization | IPA |
---|---|---|
អ | | ʔɑʔ |
អា | | ʔa |
ឥ | | ʔe |
ឦ | | ʔəj |
ឧ | | ʔ |
ឨ | | |
ឩ | | ʔu |
ឪ | | ʔɨw |
ឫ | | ʔrɨ |
ឬ | | ʔrɨː |
ឭ | | ʔlɨ |
ឮ | | ʔlɨː |
ឯ | | ʔeː |
ឰ | | ʔaj |
ឱ, ឲ | | ʔaːo |
ឳ | | ʔaw |
Diacritics
Diacritic | Name | Notes |
---|---|---|
ំ | nikahit | Nasalizes the inherent vowels and some of the dependent vowels (see anusvara); sometimes used to represent [aɲ] in Sanskrit loanwords |
ះ | reahmuk | "Shining face"; adds final aspiration to dependent or inherent vowels; usually omitted; corresponds to the visarga diacritic; it may be included as a dependent vowel symbol |
ៈ | yuukaleapintu | "Pair of dots"; adds a final glottal stop to dependent or inherent vowels; usually omitted |
៉ | muusikatoan | "Mouse teeth"; used to convert some o-series consonants to the a-series |
៊ | triisap | Used to convert some a-series consonants to the o-series |
ុ | kbiĕh kraôm | Also known as bŏkcheung; used in place of the muusikatoan and triisap diacritics when they would collide with superscript vowels |
់ | bantoc | Used to shorten some vowels |
៌ | robat | Behaves similarly to the toandakhiat; corresponds to a Devanagari diacritic, but has lost its original function, which was to represent a vocalic r |
៍ | toandakhiat | Used to render some letters as unpronounced |
៎ | kakabat | "Crow's foot"; more a punctuation mark than a diacritic; used in writing to indicate the rising intonation of an exclamation or interjection; often placed on particles such as /na/, /nɑː/, /nɛː/, /vəːj/, and the feminine response /cah/ |
៏ | ahsda | Denotes stressed intonation in some single-consonant words |
័ | samyok sannya | Represents a short inherent vowel in Sanskrit and Pali words; usually omitted |
៑ | viriam | A mostly obsolete diacritic; corresponds to the virama |
្ | coeng | Also romanized cheung; a sign developed for Unicode to input subscript consonants; its appearance varies among fonts |
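How the coeng sign encodes subscript consonants can be shown with a short Python sketch. It assumes only the standard Unicode encoding model, in which a subscript is written as U+17D2 KHMER SIGN COENG followed by the ordinary consonant letter. The helper name `with_subscripts` is hypothetical.

```python
COENG = "\u17D2"  # KHMER SIGN COENG: marks the next consonant as a subscript


def with_subscripts(base: str, *subscripts: str) -> str:
    """Build a consonant cluster: the base letter followed by COENG + letter pairs."""
    return base + "".join(COENG + letter for letter in subscripts)


# The word for "Khmer": kha + subscript mo, then vowel sign ae, then final ro.
cluster = with_subscripts("\u1781", "\u1798")  # ខ + ្ + ម
word = cluster + "\u17C2" + "\u179A"           # ... + ែ + រ
print(word)
```

The subscript is stored after its base consonant in the encoded text even when the font draws part of it before or below that consonant, so rendering order and storage order can differ.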
Punctuation marks
The Khmer script uses several unique punctuation marks as well as some borrowed from the Latin script, such as the question mark. The Khmer period, ។, resembles an eighth rest in musical notation. Guillemets are used for quotation.
Ligatures
Most consonants, including a few of the subscripts, form ligatures with all dependent vowels that contain the symbol used for the vowel a (ា). Many of these ligatures are easily recognizable, but a few are not. One of the less recognizable is the ligature of bâ (ប) and a (ា), which was created to differentiate it from the consonant symbol hâ (ហ) as well as from the ligature of châ (ច) and a (ា). It is not always necessary to connect consonants with the dependent vowel a (ា).
Examples of ligatured symbols:
- léa (/liːə/): an example of the vowel a (ា) forming a connection with the serif of a consonant.
- chba (/cɓaː/): subscript consonants with ascending strokes above the baseline also form ligatures with the dependent vowel a (ា).
- msau (/msaw/): another example of a subscript consonant forming a ligature, in this case with the digraph dependent vowel au (ៅ). This vowel includes the cane-like stroke of the vowel a (ា).
- bau (/ɓaw/): the combination of the consonant bâ (ប) with any vowel or digraph vowel based on a (ា) is written with a stroke in the center of the ligature to distinguish it from the consonant hâ (ហ).
- tra (/traː/): the subscript for rô (រ) is written before the consonant it is pronounced after.
Numerals
The numerals of the Khmer script, like those used by other civilizations in Southeast Asia, are also derived from the southern Indian script. Arabic numerals are also used, but to a lesser extent.
Khmer numerals | ០ | ១ | ២ | ៣ | ៤ | ៥ | ៦ | ៧ | ៨ | ៩ |
---|---|---|---|---|---|---|---|---|---|---|
Arabic numerals | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
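Because the ten Khmer digits map one-to-one onto the Arabic digits, converting between them is a simple character translation. The sketch below is one possible way to do this in Python; the function names `to_arabic` and `to_khmer` are hypothetical, and the digits are assumed to sit at U+17E0 through U+17E9 as in the Unicode Khmer block.

```python
# Khmer digits zero through nine occupy U+17E0..U+17E9.
KHMER_DIGITS = "".join(chr(0x17E0 + i) for i in range(10))
ARABIC_DIGITS = "0123456789"

_TO_ARABIC = str.maketrans(KHMER_DIGITS, ARABIC_DIGITS)
_TO_KHMER = str.maketrans(ARABIC_DIGITS, KHMER_DIGITS)


def to_arabic(text: str) -> str:
    """Replace Khmer digits with Arabic digits, leaving other characters alone."""
    return text.translate(_TO_ARABIC)


def to_khmer(text: str) -> str:
    """Replace Arabic digits with Khmer digits, leaving other characters alone."""
    return text.translate(_TO_KHMER)
```

Python's built-in `int()` also accepts Khmer digits directly, since they are Unicode decimal digits, so `int(to_khmer("611"))` evaluates to the integer 611 without an explicit conversion step.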
Unicode
Khmer was added to the Unicode Standard in September 1999 with the release of version 3.0.
Additional Khmer symbols were added to the Unicode Standard in April 2003 with the release of version 4.0.
The Unicode block for basic Khmer characters is U+1780–U+17FF; not every code point in the block is assigned.
The Unicode block for additional Khmer symbols is U+19E0–U+19FF.
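The two block ranges given above are enough to test whether a character belongs to the Khmer script area of Unicode. The following Python sketch is illustrative only; the function names `khmer_block` and `contains_khmer` are hypothetical, and the check is by block range, so unassigned code points inside a block also match.

```python
KHMER_BLOCK = range(0x1780, 0x1800)          # U+1780..U+17FF, basic Khmer characters
KHMER_SYMBOLS_BLOCK = range(0x19E0, 0x1A00)  # U+19E0..U+19FF, additional Khmer symbols


def khmer_block(ch: str):
    """Return the name of the Khmer block containing ch, or None if there is none."""
    code_point = ord(ch)
    if code_point in KHMER_BLOCK:
        return "Khmer"
    if code_point in KHMER_SYMBOLS_BLOCK:
        return "Khmer Symbols"
    return None


def contains_khmer(text: str) -> bool:
    """True if any character of text falls in either Khmer block."""
    return any(khmer_block(ch) is not None for ch in text)
```

A range test like this is cheap and needs no external data; code that must distinguish assigned from unassigned code points would need the Unicode character database instead, for example through Python's `unicodedata` module.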