Tamil script
The Tamil script is used to write the Tamil language as well as minority languages such as Badaga, Irula, and Paniya. With the use of diacritics to represent aspirated and voiced consonants not represented in the basic script, it is also used to write Saurashtra and, by Tamils, to write Sanskrit.
Unicode
The Unicode range for Tamil is U+0B80–U+0BFF. Grey areas indicate non-assigned code points. Most of the non-assigned code points are designated reserved because they are in the same relative position as characters assigned in other South Asian script blocks that correspond to phonemes that do not exist in the Tamil script.
Like other South Asian scripts in Unicode, the Tamil encoding was originally derived from the ISCII standard. Both ISCII and Unicode encode Tamil as an abugida. In an abugida, each basic character represents a consonant and a default vowel. Consonants with a different vowel, or bare consonants, are represented by adding a modifier character to a base character. Each codepoint representing a similar phoneme is encoded in the same relative position in each South Asian script block in Unicode, including Tamil. Although Unicode represents Tamil as an abugida, all the pure consonants (consonants with no associated vowel) and syllables in Tamil can be represented by combining multiple Unicode codepoints, as can be seen in the Unicode Tamil Syllabary below.
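The base-plus-modifier model described here can be illustrated with a short Python sketch. It is only an illustration: the helper name `describe` is made up for this example, and the character names are whatever the standard `unicodedata` module reports for the Tamil block.

```python
import unicodedata

KA = "\u0b95"      # TAMIL LETTER KA: consonant carrying the default vowel 'a'
PULLI = "\u0bcd"   # TAMIL SIGN VIRAMA: modifier that removes the default vowel
SIGN_I = "\u0bbf"  # TAMIL VOWEL SIGN I: modifier that replaces the default vowel


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


pure_k = KA + PULLI   # bare consonant k
ki = KA + SIGN_I      # syllable ki

for label, s in (("ka", KA), ("k", pure_k), ("ki", ki)):
    print(label, s, describe(s))
```

Each syllable other than the plain consonant-plus-a form is therefore stored as two code points, a base letter followed by a combining mark.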
In Unicode 5.1, named sequences were added for all Tamil pure consonants and syllables. Unicode 5.1 also has a named sequence for the Tamil ligature SRI (śrī), ஶ்ரீ. The name of this sequence is TAMIL SYLLABLE SHRII, and it is composed of the Unicode sequence U+0BB6 U+0BCD U+0BB0 U+0BC0.
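As a small check of the sequence quoted for TAMIL SYLLABLE SHRII, the following Python sketch spells out its four code points with the standard `unicodedata` module. The variable names are chosen for this example only.

```python
import unicodedata

# The four code points listed for the named sequence TAMIL SYLLABLE SHRII.
SHRII = "\u0bb6\u0bcd\u0bb0\u0bc0"

names = [unicodedata.name(ch) for ch in SHRII]
for ch, name in zip(SHRII, names):
    print(f"U+{ord(ch):04X}  {name}")
```

The sequence is a consonant, the vowel-suppressing sign, a second consonant and a vowel sign, so the ligature is a rendering matter rather than a separate code point.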
Characteristics
The Tamil script has twelve vowels ("soul-letters"), eighteen consonants ("body-letters") and one character, the āytam (ஃ), which is classified in Tamil grammar as being neither a consonant nor a vowel ("the hermaphrodite letter"), though it is often considered part of the vowel set ("vowel class"). The script, however, is syllabic and not alphabetic. The complete script therefore consists of the thirty-one letters in their independent form and an additional 216 combinant letters (18 consonants × 12 vowels), for a total of 247 combinations of a consonant and a vowel, a mute consonant, or a vowel alone. These combinant letters are formed by adding a vowel marker to the consonant. Some vowels require the basic shape of the consonant to be altered in a way that is specific to that vowel. Others are written by adding a vowel-specific suffix to the consonant, yet others a prefix, and finally some vowels require adding both a prefix and a suffix to the consonant. In every case the vowel marker is different from the standalone character for the vowel.
The Tamil script is written from left to right.
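The letter count given in this section can be checked mechanically. The Python sketch below is a minimal illustration, assuming the Unicode code points of the Tamil block; the list names (`VOWEL_CPS`, `CONSONANT_CPS`, `VOWEL_SIGNS`) are invented for this example, and the consonants follow the order of the compound table.

```python
# 12 independent vowels (a, aa, i, ii, u, uu, e, ee, ai, o, oo, au).
VOWEL_CPS = [0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
             0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94]

# 18 basic consonants, in the order used by the compound table.
CONSONANT_CPS = [0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3,
                 0x0BA4, 0x0BA8, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0,
                 0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9]

AYTAM = "\u0b83"   # the one character that is neither vowel nor consonant
PULLI = "\u0bcd"   # dot that marks a mute (pure) consonant

# Dependent vowel signs, aligned with VOWEL_CPS; 'a' is inherent, so no sign.
VOWEL_SIGNS = ["", "\u0bbe", "\u0bbf", "\u0bc0", "\u0bc1", "\u0bc2",
               "\u0bc6", "\u0bc7", "\u0bc8", "\u0bca", "\u0bcb", "\u0bcc"]

vowels = [chr(cp) for cp in VOWEL_CPS]
pure_consonants = [chr(cp) + PULLI for cp in CONSONANT_CPS]
combinants = [chr(cp) + sign for cp in CONSONANT_CPS for sign in VOWEL_SIGNS]

independent = len(vowels) + len(pure_consonants) + 1   # +1 for the aytam
total = independent + len(combinants)
print(independent, len(combinants), total)
```

The count treats the eighteen consonants in their mute form as the independent letters, which is how the compound table lists them in its first column.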
History
The Tamil script, like the other Indic scripts, is thought to have evolved from the Brahmi script. The earliest inscriptions that are accepted examples of Tamil writing date to a time just after the Asokan period. The script used by these inscriptions is commonly known as the Tamil Brahmi or Tamili script, and differs in many ways from standard Asokan Brahmi. For example, early Tamil Brahmi, unlike Asokan Brahmi, had a system to distinguish between pure consonants (such as m) and consonants with an inherent vowel (such as ma). In addition, early Tamil Brahmi used slightly different vowel markers, had extra characters to represent sounds not found in Sanskrit, and omitted letters for sounds not present in Tamil, such as voiced consonants and aspirates. Inscriptions from the second century AD use a later form of the Tamil Brahmi script, which is substantially similar to the writing system described in the Tolkappiyam, an ancient Tamil grammar. Most notably, they use the puḷḷi (dot) to suppress the inherent vowel. The Tamil letters thereafter evolved towards a more rounded form, and by the fifth or sixth century AD had reached a form called the early vatteluttu.
The modern Tamil script does not, however, descend from this script. In the 7th century, the Pallava dynasty created a new script for Tamil, formed by simplifying the Grantha script (which in turn derived from Southern Brahmi) and adding to it the vatteluttu letters for sounds not found in Sanskrit. By the 8th century, this new script had supplanted vatteluttu in the Chola and Pallava kingdoms, which lay in the northern portion of the Tamil-speaking region. Vatteluttu continued to be used in the southern portion of the Tamil-speaking region, in the Chera and Pandyan kingdoms, until the 11th century, when the Pandyan kingdom was conquered by the Cholas.
Over the next few centuries, the Chola-Pallava script evolved into the modern Tamil script. The use of palm leaves as the primary medium for writing led to changes in the script. The scribe had to be careful not to pierce the leaves with the stylus while writing, because a leaf with a hole was more likely to tear and decay faster. As a result, the use of the puḷḷi to distinguish pure consonants became rare, with pure consonants usually being written as if the inherent vowel were present. Similarly, the vowel marker for the kuṟṟiyal-ukaram, a half-rounded u which occurs at the end of some words and in the medial position in certain compound words, also fell out of use and was replaced by the marker for the simple u. The puḷḷi did not fully reappear until the introduction of printing, but the kuṟṟiyal-ukaram marker never came back into use, although the sound itself still exists and plays an important role in Tamil prosody.
The forms of some of the letters were simplified in the nineteenth century to make the script easier to typeset. In the twentieth century, the script was simplified even further in a series of reforms, which regularised the vowel markers used with consonants by eliminating special markers and most irregular forms.
Relationship with other Indic scripts
The Tamil script differs from other Brahmi-derived scripts in a number of ways. Unlike every other Indic script, it uses the same character to represent both an unvoiced stop and its voiced equivalent. Thus the character க் k, for example, represents both [k] and [ɡ]. This is because Tamil grammar treats only unvoiced stops as "true" consonants, treating voiced and aspirated sounds as euphonic variants of unvoiced sounds. Traditional Tamil grammars contain detailed rules, observed in formal speech, for when a stop is to be pronounced with and without voice. These rules are not followed in colloquial or dialectal speech, where voiced and unvoiced versions of a stop are, in effect, allophones, used in specific phonetic contexts without serving to distinguish words.
Also unlike other Indic scripts, the Tamil script rarely uses special consonantal ligatures to represent conjunct consonants, which are far less frequent in Tamil than in other Indian languages. Conjunct consonants, where they occur, are written by writing the character for the first consonant, adding the puḷḷi to suppress its inherent vowel, and then writing the character for the second consonant. There are a few exceptions, namely க்ஷ (kṣa) and śrī.
Basic consonants
Consonants are called the 'body' (mei) letters. The consonants are classified into three categories: vallinam (hard consonants), mellinam (soft consonants, including all nasals), and idayinam (medium consonants).
There are some lexical rules for the formation of words, which the Tolkāppiyam describes. Some examples: a word cannot end in certain consonants, and cannot begin with some consonants, including 'r', 'l' and 'll'; there are two consonants for the dental 'n', and which one should be used depends on whether the 'n' occurs at the start of the word and on the letters around it. (Historically, one 'n' was pronounced alveolarly, as is still true in Malayalam.)
The order of the alphabet (strictly, abugida) in Tamil closely matches that of the linguistically unrelated Indo-Aryan languages, reflecting the common origin of their scripts from Brahmi.
Consonants of Modern Tamil
Tamil speech has incorporated many phonemes that were not part of the Tolkāppiyam classification. The letters for these sounds, called "grantha", are part of modern Tamil. They belong to the accepted Tamil alphabet now taught from elementary school, and they are included in the Tamil Nadu government encoding called TACE 16 (Tamil All Character Encoding).
Consonant | ISO 15919 | IPA |
---|---|---|
ஶ | ś | [ɕ], [ʃ] |
ஜ | j | [d͡ʒ] |
ஷ | ṣ | [ʂ] |
ஸ | s | [s] |
ஹ | h | [h] |
க்ஷ | kṣ | [kʂ] |
The letter ஶ is used only for words borrowed from Sanskrit (e.g. ஶாரதா śāradā), but it is included in Unicode for rendering the common ligature 'Sri' (ஸ்ரீ Śrī), which is made up of ஶ் and ரீ. க்ஷ is technically a ligature of க் and ஷ.
In recent times, three combinations of basic Tamil letters have come into general use to represent the sounds of the English letters 'f', 'z' and 'x', chiefly for writing English and Arabic names and words in Tamil. The combinations are ஃப for f, ஃஜ for z and ஃஸ் for x. For example: asif = அசிஃப், aZaarudheen = அஃஜாருதீன், rex = ரெஃஸ்.
Vowels
Vowels are also called the 'life' (uyir) or 'soul' letters. Together with the consonants (which are called 'body' letters), they form compound, syllabic (abugida) letters that are called 'living' letters (uyirmei, i.e. letters that have both 'body' and 'soul').
Tamil vowels are divided into short and long (five of each type) and two diphthongs.
Compound form
Using the consonant 'k' as an example:
The special letter ஃ (āytam) is rarely used by itself. It normally serves a purely grammatical function as the independent vowel form of the dot on consonants that suppresses the inherent 'a' sound in plain consonants. However, in modern times it has come to be used to represent foreign sounds; for example, ஃப is used to represent the English sound 'f', which is not found in Tamil.
Another peculiarity of Tamil is that க alone is used for ka, kha, ga and gha ([k], [ɡ], [x], [ɣ], [h]), unlike other languages, which have different letters for the different pronunciations. Other letters that behave like க are ச ([t͡ʃ], [d͡ʒ], [ʃ], [s], [ʒ]), ட ([ʈ], [ɖ], [ɽ]), த ([t̪], [d̪], [ð]), ப ([p], [b], [β]) and ற் ([r], [t], [d]) (see the first table of this section).
The long (nedil) vowels are about twice as long as the short (kuRil) vowels. The diphthongs are usually pronounced about one and a half times as long as the short vowels, though some grammatical texts place them with the long (nedil) vowels.
As can be seen in the compound form, the vowel sign can be added to the right, left or both sides of the consonant. It can also form a ligature. These rules are evolving, and older usage has more ligatures than modern usage. What you actually see on this page depends on your font selection; for example, Code2000 will show more ligatures than Latha.
There are proponents of script reform who want to eliminate all ligatures and let all vowel signs appear on the right side.
Unicode encodes the character in logical order (always the consonant first), whereas legacy 8-bit encodings (such as TSCII) prefer the written order. This makes it necessary to reorder when converting from one encoding to another; it is not sufficient simply to map one set of codepoints to the other.
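The reordering step can be sketched in Python as follows. This is a simplified illustration and not an actual TSCII converter: it only moves left-side vowel signs in front of the preceding consonant and does not map anything to TSCII byte values. The function names are invented for this example, and conjuncts such as க்ஷ, where the sign would precede the whole cluster, are not handled.

```python
import unicodedata

# Vowel signs that are written to the LEFT of the consonant
# (TAMIL VOWEL SIGN E, EE and AI).
PREFIX_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}


def is_tamil_consonant(ch):
    """True for assigned letters in the consonant range U+0B95..U+0BB9."""
    return "\u0b95" <= ch <= "\u0bb9" and unicodedata.category(ch) == "Lo"


def logical_to_visual(text):
    """Reorder Unicode (logical-order) Tamil into written left-to-right order."""
    # NFD splits the two-part signs O, OO and AU into a left part
    # (U+0BC6 or U+0BC7) and a right part (U+0BBE or U+0BD7).
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in PREFIX_SIGNS and out and is_tamil_consonant(out[-1]):
            consonant = out.pop()
            out.extend([ch, consonant])   # sign first, then the consonant
        else:
            out.append(ch)
    return "".join(out)


for syllable in ("\u0b95\u0bbf", "\u0b95\u0bc6", "\u0b95\u0bca"):
    visual = logical_to_visual(syllable)
    print([f"U+{ord(c):04X}" for c in syllable], "->",
          [f"U+{ord(c):04X}" for c in visual])
```

A converter to a written-order encoding would apply a reordering of this kind first and only then map each resulting unit to the target code values.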
Compound table of Tamil letters
The following table lists vowel (uyir or life) letters across the top and consonant (mei or body) letters along the side, the combination of which gives all Tamil compound (uyirmei) letters.
→ ↓ | அ | ஆ | இ | ஈ | உ | ஊ | எ | ஏ | ஐ | ஒ | ஓ | ஔ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
க் | க | கா | கி | கீ | கு | கூ | கெ | கே | கை | கொ | கோ | கௌ |
ங் | ங | ஙா | ஙி | ஙீ | ஙு | ஙூ | ஙெ | ஙே | ஙை | ஙொ | ஙோ | ஙௌ |
ச் | ச | சா | சி | சீ | சு | சூ | செ | சே | சை | சொ | சோ | சௌ |
ஞ் | ஞ | ஞா | ஞி | ஞீ | ஞு | ஞூ | ஞெ | ஞே | ஞை | ஞொ | ஞோ | ஞௌ |
ட் | ட | டா | டி | டீ | டு | டூ | டெ | டே | டை | டொ | டோ | டௌ |
ண் | ண | ணா | ணி | ணீ | ணு | ணூ | ணெ | ணே | ணை | ணொ | ணோ | ணௌ |
த் | த | தா | தி | தீ | து | தூ | தெ | தே | தை | தொ | தோ | தௌ |
ந் | ந | நா | நி | நீ | நு | நூ | நெ | நே | நை | நொ | நோ | நௌ |
ப் | ப | பா | பி | பீ | பு | பூ | பெ | பே | பை | பொ | போ | பௌ |
ம் | ம | மா | மி | மீ | மு | மூ | மெ | மே | மை | மொ | மோ | மௌ |
ய் | ய | யா | யி | யீ | யு | யூ | யெ | யே | யை | யொ | யோ | யௌ |
ர் | ர | ரா | ரி | ரீ | ரு | ரூ | ரெ | ரே | ரை | ரொ | ரோ | ரௌ |
ல் | ல | லா | லி | லீ | லு | லூ | லெ | லே | லை | லொ | லோ | லௌ |
வ் | வ | வா | வி | வீ | வு | வூ | வெ | வே | வை | வொ | வோ | வௌ |
ழ் | ழ | ழா | ழி | ழீ | ழு | ழூ | ழெ | ழே | ழை | ழொ | ழோ | ழௌ |
ள் | ள | ளா | ளி | ளீ | ளு | ளூ | ளெ | ளே | ளை | ளொ | ளோ | ளௌ |
ற் | ற | றா | றி | றீ | று | றூ | றெ | றே | றை | றொ | றோ | றௌ |
ன் | ன | னா | னி | னீ | னு | னூ | னெ | னே | னை | னொ | னோ | னௌ |

→ ↓ | அ | ஆ | இ | ஈ | உ | ஊ | எ | ஏ | ஐ | ஒ | ஓ | ஔ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ஶ் | ஶ | ஶா | ஶி | ஶீ | ஶு | ஶூ | ஶெ | ஶே | ஶை | ஶொ | ஶோ | ஶௌ |
ஜ் | ஜ | ஜா | ஜி | ஜீ | ஜு | ஜூ | ஜெ | ஜே | ஜை | ஜொ | ஜோ | ஜௌ |
ஷ் | ஷ | ஷா | ஷி | ஷீ | ஷு | ஷூ | ஷெ | ஷே | ஷை | ஷொ | ஷோ | ஷௌ |
ஸ் | ஸ | ஸா | ஸி | ஸீ | ஸு | ஸூ | ஸெ | ஸே | ஸை | ஸொ | ஸோ | ஸௌ |
ஹ் | ஹ | ஹா | ஹி | ஹீ | ஹு | ஹூ | ஹெ | ஹே | ஹை | ஹொ | ஹோ | ஹௌ |
க்ஷ் | க்ஷ | க்ஷா | க்ஷி | க்ஷீ | க்ஷு | க்ஷூ | க்ஷெ | க்ஷே | க்ஷை | க்ஷொ | க்ஷோ | க்ஷௌ |
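The first table above can be regenerated programmatically, which also confirms the count of 216 compound letters (18 consonants by 12 vowels). The following is a small Python sketch; the names `compound_row` and `table` are illustrative, and the consonants are given as code points in the same order as the table rows.

```python
# The 18 Tamil consonants in table order, as Unicode code points.
CONSONANTS = [chr(cp) for cp in (
    0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3,
    0x0BA4, 0x0BA8, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0,
    0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9,
)]

# Dependent vowel signs in the column order of the table.
# The first vowel (a) is inherent in the bare consonant, so its sign is empty.
SIGNS = ["", "\u0BBE", "\u0BBF", "\u0BC0", "\u0BC1", "\u0BC2",
         "\u0BC6", "\u0BC7", "\u0BC8", "\u0BCA", "\u0BCB", "\u0BCC"]

PULLI = "\u0BCD"  # the dot that marks a pure (vowel-less) consonant


def compound_row(consonant: str) -> list:
    """Return one table row: the pure consonant, then the 12 compound letters."""
    return [consonant + PULLI] + [consonant + sign for sign in SIGNS]


table = [compound_row(c) for c in CONSONANTS]
uyirmei_count = sum(len(row) - 1 for row in table)

if __name__ == "__main__":
    for row in table:
        print(" | ".join(row))
    print(uyirmei_count)
```

Each compound letter is simply the consonant followed by one vowel sign; the shapes seen in the table come from the font's rendering of that two-codepoint sequence.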
Numerals and symbols
Apart from the numerals (0-9), Tamil also has numerals for 10, 100 and 1000. Symbols for day, month, year, debit, credit, as above, rupee, numeral are present as well.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 100 | 1000 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
௦ | ௧ | ௨ | ௩ | ௪ | ௫ | ௬ | ௭ | ௮ | ௯ | ௰ | ௱ | ௲ |

day | month | year | debit | credit | as above | rupee | numeral |
---|---|---|---|---|---|---|---|
௳ | ௴ | ௵ | ௶ | ௷ | ௸ | ௹ | ௺ |
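As an illustration of how these numerals can be used, the Python sketch below writes a number in two ways: positionally with the digits ௦ to ௯, and with the signs for 10, 100 and 1000. The second form assumes the conventional scheme in which a digit placed before ௰, ௱ or ௲ multiplies it and a multiplier of one is left out; that rule is an assumption not stated in the text above, and the function names are hypothetical. The sketch is limited to 1 through 9999.

```python
DIGIT_ZERO = 0x0BE6  # code point of the Tamil digit zero; digits 1-9 follow it
TEN, HUNDRED, THOUSAND = "\u0BF0", "\u0BF1", "\u0BF2"


def tamil_digit(d: int) -> str:
    """Return the Tamil digit character for 0 <= d <= 9."""
    return chr(DIGIT_ZERO + d)


def to_positional(n: int) -> str:
    """Write a non-negative integer with Tamil digits in place-value notation."""
    return "".join(tamil_digit(int(c)) for c in str(n))


def to_traditional(n: int) -> str:
    """Write 1..9999 using the signs for 10, 100 and 1000 (assumed scheme)."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1..9999")
    parts = []
    for value, symbol in ((1000, THOUSAND), (100, HUNDRED), (10, TEN)):
        count, n = divmod(n, value)
        if count > 1:
            parts.append(tamil_digit(count))  # multiplier, omitted when it is 1
        if count >= 1:
            parts.append(symbol)
    if n:
        parts.append(tamil_digit(n))
    return "".join(parts)


if __name__ == "__main__":
    print(to_positional(3782))
    print(to_traditional(3782))
```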
Tamil in Unicode
The Unicode range for Tamil is U+0B80–U+0BFF. In the Unicode code chart, grey areas indicate non-assigned code points. Most of the non-assigned code points are designated reserved because they are in the same relative position as characters assigned in other South Asian script blocks that correspond to phonemes that don't exist in the Tamil script.
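The block boundaries and the gaps inside it can be inspected with Python's standard `unicodedata` module. This is a rough sketch: the helper names are hypothetical, and the exact list of unassigned code points depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

TAMIL_START, TAMIL_END = 0x0B80, 0x0BFF


def in_tamil_block(ch: str) -> bool:
    """True if the character lies in the Tamil block U+0B80..U+0BFF."""
    return TAMIL_START <= ord(ch) <= TAMIL_END


def unassigned_in_block() -> list:
    """Code points in the block whose general category is Cn (not assigned)."""
    return [cp for cp in range(TAMIL_START, TAMIL_END + 1)
            if unicodedata.category(chr(cp)) == "Cn"]


if __name__ == "__main__":
    gaps = unassigned_in_block()
    print(" ".join(f"{cp:04X}" for cp in gaps))
```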
Like other South Asian scripts in Unicode, the Tamil encoding was originally derived from the ISCII standard. Both ISCII and Unicode encode Tamil as an abugida. In an abugida, each basic character represents a consonant and default vowel. Consonants with a different vowel or bare consonants are represented by adding a modifier character to a base character. Each codepoint representing a similar phoneme is encoded in the same relative position in each South Asian script block in Unicode, including Tamil. Although Unicode represents Tamil as an abugida, all the pure consonants (consonants with no associated vowel) and syllables in Tamil can be represented by combining multiple Unicode codepoints, as can be seen in the Unicode Tamil Syllabary below.
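The abugida model described here can be sketched in Python as a function that splits an encoded syllable into its consonant and vowel. A bare consonant carries the default vowel அ, the virama (U+0BCD) removes it, and any other vowel sign replaces it. The name `split_syllable` is hypothetical, and the sketch assumes a syllable with exactly one base consonant.

```python
VIRAMA = "\u0BCD"          # marks a pure consonant
INHERENT_VOWEL = "\u0B85"  # the default vowel carried by a bare consonant

# Dependent vowel sign -> corresponding independent vowel letter.
SIGN_TO_VOWEL = {
    "\u0BBE": "\u0B86", "\u0BBF": "\u0B87", "\u0BC0": "\u0B88",
    "\u0BC1": "\u0B89", "\u0BC2": "\u0B8A", "\u0BC6": "\u0B8E",
    "\u0BC7": "\u0B8F", "\u0BC8": "\u0B90", "\u0BCA": "\u0B92",
    "\u0BCB": "\u0B93", "\u0BCC": "\u0B94",
}


def split_syllable(syllable: str):
    """Return (consonant, vowel); vowel is None for a pure consonant."""
    consonant, modifier = syllable[0], syllable[1:]
    if modifier == "":
        return consonant, INHERENT_VOWEL
    if modifier == VIRAMA:
        return consonant, None
    return consonant, SIGN_TO_VOWEL[modifier]


if __name__ == "__main__":
    for s in ("\u0B95", "\u0B95\u0BCD", "\u0B95\u0BC8"):
        print(s, split_syllable(s))
```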
In Unicode 5.1, named sequences were added for all Tamil pure consonants and syllables. Unicode 5.1 also has a named sequence for the Tamil ligature SRI (śrī), ஶ்ரீ. This sequence is named TAMIL SYLLABLE SHRII and consists of the code points U+0BB6 U+0BCD U+0BB0 U+0BC0.
Vowels → Consonants ↓ | அ 0B85 | ஆ 0B86 | இ 0B87 | ஈ 0B88 | உ 0B89 | ஊ 0B8A | எ 0B8E | ஏ 0B8F | ஐ 0B90 | ஒ 0B92 | ஓ 0B93 | ஔ 0B94 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
க் 0B95 0BCD | க 0B95 | கா 0B95 0BBE | கி 0B95 0BBF | கீ 0B95 0BC0 | கு 0B95 0BC1 | கூ 0B95 0BC2 | கெ 0B95 0BC6 | கே 0B95 0BC7 | கை 0B95 0BC8 | கொ 0B95 0BCA | கோ 0B95 0BCB | கௌ 0B95 0BCC |
ங் 0B99 0BCD | ங 0B99 | ஙா 0B99 0BBE | ஙி 0B99 0BBF | ஙீ 0B99 0BC0 | ஙு 0B99 0BC1 | ஙூ 0B99 0BC2 | ஙெ 0B99 0BC6 | ஙே 0B99 0BC7 | ஙை 0B99 0BC8 | ஙொ 0B99 0BCA | ஙோ 0B99 0BCB | ஙௌ 0B99 0BCC |
ச் 0B9A 0BCD | ச 0B9A | சா 0B9A 0BBE | சி 0B9A 0BBF | சீ 0B9A 0BC0 | சு 0B9A 0BC1 | சூ 0B9A 0BC2 | செ 0B9A 0BC6 | சே 0B9A 0BC7 | சை 0B9A 0BC8 | சொ 0B9A 0BCA | சோ 0B9A 0BCB | சௌ 0B9A 0BCC |
ஞ் 0B9E 0BCD | ஞ 0B9E | ஞா 0B9E 0BBE | ஞி 0B9E 0BBF | ஞீ 0B9E 0BC0 | ஞு 0B9E 0BC1 | ஞூ 0B9E 0BC2 | ஞெ 0B9E 0BC6 | ஞே 0B9E 0BC7 | ஞை 0B9E 0BC8 | ஞொ 0B9E 0BCA | ஞோ 0B9E 0BCB | ஞௌ 0B9E 0BCC |
ட் 0B9F 0BCD | ட 0B9F | டா 0B9F 0BBE | டி 0B9F 0BBF | டீ 0B9F 0BC0 | டு 0B9F 0BC1 | டூ 0B9F 0BC2 | டெ 0B9F 0BC6 | டே 0B9F 0BC7 | டை 0B9F 0BC8 | டொ 0B9F 0BCA | டோ 0B9F 0BCB | டௌ 0B9F 0BCC |
ண் 0BA3 0BCD | ண 0BA3 | ணா 0BA3 0BBE | ணி 0BA3 0BBF | ணீ 0BA3 0BC0 | ணு 0BA3 0BC1 | ணூ 0BA3 0BC2 | ணெ 0BA3 0BC6 | ணே 0BA3 0BC7 | ணை 0BA3 0BC8 | ணொ 0BA3 0BCA | ணோ 0BA3 0BCB | ணௌ 0BA3 0BCC |
த் 0BA4 0BCD | த 0BA4 | தா 0BA4 0BBE | தி 0BA4 0BBF | தீ 0BA4 0BC0 | து 0BA4 0BC1 | தூ 0BA4 0BC2 | தெ 0BA4 0BC6 | தே 0BA4 0BC7 | தை 0BA4 0BC8 | தொ 0BA4 0BCA | தோ 0BA4 0BCB | தௌ 0BA4 0BCC |
ந் 0BA8 0BCD | ந 0BA8 | நா 0BA8 0BBE | நி 0BA8 0BBF | நீ 0BA8 0BC0 | நு 0BA8 0BC1 | நூ 0BA8 0BC2 | நெ 0BA8 0BC6 | நே 0BA8 0BC7 | நை 0BA8 0BC8 | நொ 0BA8 0BCA | நோ 0BA8 0BCB | நௌ 0BA8 0BCC |
ப் 0BAA 0BCD | ப 0BAA | பா 0BAA 0BBE | பி 0BAA 0BBF | பீ 0BAA 0BC0 | பு 0BAA 0BC1 | பூ 0BAA 0BC2 | பெ 0BAA 0BC6 | பே 0BAA 0BC7 | பை 0BAA 0BC8 | பொ 0BAA 0BCA | போ 0BAA 0BCB | பௌ 0BAA 0BCC |
ம் 0BAE 0BCD | ம 0BAE | மா 0BAE 0BBE | மி 0BAE 0BBF | மீ 0BAE 0BC0 | மு 0BAE 0BC1 | மூ 0BAE 0BC2 | மெ 0BAE 0BC6 | மே 0BAE 0BC7 | மை 0BAE 0BC8 | மொ 0BAE 0BCA | மோ 0BAE 0BCB | மௌ 0BAE 0BCC |
ய் 0BAF 0BCD | ய 0BAF | யா 0BAF 0BBE | யி 0BAF 0BBF | யீ 0BAF 0BC0 | யு 0BAF 0BC1 | யூ 0BAF 0BC2 | யெ 0BAF 0BC6 | யே 0BAF 0BC7 | யை 0BAF 0BC8 | யொ 0BAF 0BCA | யோ 0BAF 0BCB | யௌ 0BAF 0BCC |
ர் 0BB0 0BCD | ர 0BB0 | ரா 0BB0 0BBE | ரி 0BB0 0BBF | ரீ 0BB0 0BC0 | ரு 0BB0 0BC1 | ரூ 0BB0 0BC2 | ரெ 0BB0 0BC6 | ரே 0BB0 0BC7 | ரை 0BB0 0BC8 | ரொ 0BB0 0BCA | ரோ 0BB0 0BCB | ரௌ 0BB0 0BCC |
ல் 0BB2 0BCD | ல 0BB2 | லா 0BB2 0BBE | லி 0BB2 0BBF | லீ 0BB2 0BC0 | லு 0BB2 0BC1 | லூ 0BB2 0BC2 | லெ 0BB2 0BC6 | லே 0BB2 0BC7 | லை 0BB2 0BC8 | லொ 0BB2 0BCA | லோ 0BB2 0BCB | லௌ 0BB2 0BCC |
வ் 0BB5 0BCD | வ 0BB5 | வா 0BB5 0BBE | வி 0BB5 0BBF | வீ 0BB5 0BC0 | வு 0BB5 0BC1 | வூ 0BB5 0BC2 | வெ 0BB5 0BC6 | வே 0BB5 0BC7 | வை 0BB5 0BC8 | வொ 0BB5 0BCA | வோ 0BB5 0BCB | வௌ 0BB5 0BCC |
ழ் 0BB4 0BCD | ழ 0BB4 | ழா 0BB4 0BBE | ழி 0BB4 0BBF | ழீ 0BB4 0BC0 | ழு 0BB4 0BC1 | ழூ 0BB4 0BC2 | ழெ 0BB4 0BC6 | ழே 0BB4 0BC7 | ழை 0BB4 0BC8 | ழொ 0BB4 0BCA | ழோ 0BB4 0BCB | ழௌ 0BB4 0BCC |
ள் 0BB3 0BCD | ள 0BB3 | ளா 0BB3 0BBE | ளி 0BB3 0BBF | ளீ 0BB3 0BC0 | ளு 0BB3 0BC1 | ளூ 0BB3 0BC2 | ளெ 0BB3 0BC6 | ளே 0BB3 0BC7 | ளை 0BB3 0BC8 | ளொ 0BB3 0BCA | ளோ 0BB3 0BCB | ளௌ 0BB3 0BCC |
ற் 0BB1 0BCD | ற 0BB1 | றா 0BB1 0BBE | றி 0BB1 0BBF | றீ 0BB1 0BC0 | று 0BB1 0BC1 | றூ 0BB1 0BC2 | றெ 0BB1 0BC6 | றே 0BB1 0BC7 | றை 0BB1 0BC8 | றொ 0BB1 0BCA | றோ 0BB1 0BCB | றௌ 0BB1 0BCC |
ன் 0BA9 0BCD | ன 0BA9 | னா 0BA9 0BBE | னி 0BA9 0BBF | னீ 0BA9 0BC0 | னு 0BA9 0BC1 | னூ 0BA9 0BC2 | னெ 0BA9 0BC6 | னே 0BA9 0BC7 | னை 0BA9 0BC8 | னொ 0BA9 0BCA | னோ 0BA9 0BCB | னௌ 0BA9 0BCC |
ஶ் 0BB6 0BCD | ஶ 0BB6 | ஶா 0BB6 0BBE | ஶி 0BB6 0BBF | ஶீ 0BB6 0BC0 | ஶு 0BB6 0BC1 | ஶூ 0BB6 0BC2 | ஶெ 0BB6 0BC6 | ஶே 0BB6 0BC7 | ஶை 0BB6 0BC8 | ஶொ 0BB6 0BCA | ஶோ 0BB6 0BCB | ஶௌ 0BB6 0BCC |
ஜ் 0B9C 0BCD | ஜ 0B9C | ஜா 0B9C 0BBE | ஜி 0B9C 0BBF | ஜீ 0B9C 0BC0 | ஜு 0B9C 0BC1 | ஜூ 0B9C 0BC2 | ஜெ 0B9C 0BC6 | ஜே 0B9C 0BC7 | ஜை 0B9C 0BC8 | ஜொ 0B9C 0BCA | ஜோ 0B9C 0BCB | ஜௌ 0B9C 0BCC |
ஷ் 0BB7 0BCD | ஷ 0BB7 | ஷா 0BB7 0BBE | ஷி 0BB7 0BBF | ஷீ 0BB7 0BC0 | ஷு 0BB7 0BC1 | ஷூ 0BB7 0BC2 | ஷெ 0BB7 0BC6 | ஷே 0BB7 0BC7 | ஷை 0BB7 0BC8 | ஷொ 0BB7 0BCA | ஷோ 0BB7 0BCB | ஷௌ 0BB7 0BCC |
ஸ் 0BB8 0BCD | ஸ 0BB8 | ஸா 0BB8 0BBE | ஸி 0BB8 0BBF | ஸீ 0BB8 0BC0 | ஸு 0BB8 0BC1 | ஸூ 0BB8 0BC2 | ஸெ 0BB8 0BC6 | ஸே 0BB8 0BC7 | ஸை 0BB8 0BC8 | ஸொ 0BB8 0BCA | ஸோ 0BB8 0BCB | ஸௌ 0BB8 0BCC |
ஹ் 0BB9 0BCD | ஹ 0BB9 | ஹா 0BB9 0BBE | ஹி 0BB9 0BBF | ஹீ 0BB9 0BC0 | ஹு 0BB9 0BC1 | ஹூ 0BB9 0BC2 | ஹெ 0BB9 0BC6 | ஹே 0BB9 0BC7 | ஹை 0BB9 0BC8 | ஹொ 0BB9 0BCA | ஹோ 0BB9 0BCB | ஹௌ 0BB9 0BCC |
க்ஷ் 0B95 0BCD 0BB7 0BCD | க்ஷ 0B95 0BCD 0BB7 | க்ஷா 0B95 0BCD 0BB7 0BBE | க்ஷி 0B95 0BCD 0BB7 0BBF | க்ஷீ 0B95 0BCD 0BB7 0BC0 | க்ஷு 0B95 0BCD 0BB7 0BC1 | க்ஷூ 0B95 0BCD 0BB7 0BC2 | க்ஷெ 0B95 0BCD 0BB7 0BC6 | க்ஷே 0B95 0BCD 0BB7 0BC7 | க்ஷை 0B95 0BCD 0BB7 0BC8 | க்ஷொ 0B95 0BCD 0BB7 0BCA | க்ஷோ 0B95 0BCD 0BB7 0BCB | க்ஷௌ 0B95 0BCD 0BB7 0BCC |
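The hexadecimal sequences shown in each cell of the syllabary can be reproduced from the text itself. The Python sketch below prints the code points and character names for any Tamil string, using the standard `unicodedata` module; the function names are illustrative. It is applied here to one syllable from the table and to the four-codepoint sequence for the SRI ligature.

```python
import unicodedata


def hex_sequence(text: str) -> str:
    """Format a string as space-separated four-digit hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def describe(text: str) -> list:
    """Return (code point, Unicode character name) pairs for a string."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


if __name__ == "__main__":
    ka_aa = "\u0B95\u0BBE"             # the syllable in row 1, column 3
    shrii = "\u0BB6\u0BCD\u0BB0\u0BC0"  # the SRI ligature sequence
    print(hex_sequence(ka_aa))
    for code, name in describe(shrii):
        print(code, name)
```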
See also
- Tamil numerals
- Tamil units of measurement
- Grantha script
- Tamil letters (on Tamil Wikibooks)
- Tamil bell
External links
- Tamil Alphabet & Basics (PDF)
- Phonetics of spoken Tamil
- Unicode Chart - For Tamil (PDF)
- TACE 16 (PDF)