Romanization of Persian
Romanization of Persian is the means by which the Persian language is represented using the Latin alphabet. Several different romanization schemes exist, each with its own set of rules driven by its own set of ideological goals.
Romanization paradigms
Because the Perso-Arabic script is an abjad writing system (with a consonant-heavy inventory of letters), many distinct words in standard Persian can have identical spellings, with widely varying pronunciations that differ in their (unwritten) vowel sounds. Thus a romanization paradigm can follow either transliteration (which mirrors spelling and orthography) or transcription (which mirrors pronunciation and phonology).
Transliteration
Transliteration (in the strict sense) attempts to be a complete representation of the original writing, so that an informed reader should be able to reconstruct the original spelling of unknown transliterated words. Transliterations of Persian are used to represent individual Persian words or short quotations in scholarly texts in English or other languages that do not use the Arabic alphabet.
A transliteration will still have separate representations for different consonants of the Persian alphabet that are pronounced identically in Persian. Therefore transliterations of Persian are often based on transliterations of Arabic. Persian-alphabet vowel representation is also complex, and transliterations are based on the written form.
Transliterations commonly used in the English-speaking world include BGN/PCGN romanization and ALA-LC Romanization.
Non-academic English-language quotation of Persian words usually uses a simplification of one of the strict transliteration schemes (typically omitting diacritical marks) and/or unsystematic choices of spellings meant to guide English speakers using English spelling rules towards an approximation of the Persian sounds.
A further academic and standardized method for official transliteration of Persian is Desphilic Persian Standard Romanization (Desphilic PSR). In this standard, all Persian words are transliterated into standard Latin-1 characters, so they can be written using an ordinary English keyboard.
Transcription
Transcriptions of Persian attempt to straightforwardly represent Persian phonology in the Roman alphabet, without requiring a close or reversible correspondence with the Perso-Arabic script, and also without requiring a close correspondence to English-language phonetic values of Roman letters; for example, letters such as X, Q, C may be reused for Persian-language phonemes that are not present in English phonology or do not have a consistent or single-letter English spelling.
Proposed Roman-alphabet scripts intended to be a primary representation of Persian, for use by Persian speakers as an alternative to the Perso-Arabic script, fall into this category. Some of these proposed scripts are described at Omniglot.
The Persian language (Tehrani dialect) has six vowels and twenty-three consonants. Persian syllables have two properties:
- every syllable starts with a consonant, and
- it contains a combination of one consonant and one vowel, as in the chart below:
No. | Letter(s) | Consonant | اَ a | اِ e | اُ o | آ ā | ای i | او u |
---|---|---|---|---|---|---|---|---|
1 | ا ع | ' | 'a | 'e | 'o | 'ā | 'i | 'u |
2 | ب | b | ba | be | bo | bā | bi | bu |
3 | د | d | da | de | do | dā | di | du |
4 | ف | f | fa | fe | fo | fā | fi | fu |
5 | گ | g | ga | ge | go | gā | gi | gu |
6 | ه ح | h | ha | he | ho | hā | hi | hu |
7 | ج | j | ja | je | jo | jā | ji | ju |
8 | ک | k | ka | ke | ko | kā | ki | ku |
9 | ل | l | la | le | lo | lā | li | lu |
10 | م | m | ma | me | mo | mā | mi | mu |
11 | ن | n | na | ne | no | nā | ni | nu |
12 | پ | p | pa | pe | po | pā | pi | pu |
13 | ر | r | ra | re | ro | rā | ri | ru |
14 | س ص ث | s | sa | se | so | sā | si | su |
15 | ت ط | t | ta | te | to | tā | ti | tu |
16 | و | v | va | ve | vo | vā | vi | vu |
17 | ی | y | ya | ye | yo | yā | yi | yu |
18 | ز ذ ض ظ | z | za | ze | zo | zā | zi | zu |
19 | چ | ch | cha | che | cho | chā | chi | chu |
20 | ق غ | gh | gha | ghe | gho | ghā | ghi | ghu |
21 | خ | kh | kha | khe | kho | khā | khi | khu |
22 | ش | sh | sha | she | sho | shā | shi | shu |
23 | ژ | zh | zha | zhe | zho | zhā | zhi | zhu |
It is important to treat the symbol ' as a letter of the alphabet in its own right. It may be omitted only at the beginning of a word or between two vowels; in all other positions it is necessary for Persian transcription.
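The consonant-and-vowel chart can be sketched as a small lookup table. The following is a minimal illustration, not part of any published scheme: the names `CHART_ROWS`, `syllable_row` and `consonant_skeleton` are hypothetical, and the skeleton function simply treats every listed letter as a consonant (it ignores the vowel uses of و and ی).

```python
# Vowel columns of the chart, in chart order.
VOWELS = ["a", "e", "o", "ā", "i", "u"]

# Rows of the chart: Perso-Arabic letter(s) and their shared Latin consonant.
CHART_ROWS = [
    ("ا ع", "'"), ("ب", "b"), ("د", "d"), ("ف", "f"), ("گ", "g"),
    ("ه ح", "h"), ("ج", "j"), ("ک", "k"), ("ل", "l"), ("م", "m"),
    ("ن", "n"), ("پ", "p"), ("ر", "r"), ("س ص ث", "s"), ("ت ط", "t"),
    ("و", "v"), ("ی", "y"), ("ز ذ ض ظ", "z"), ("چ", "ch"),
    ("ق غ", "gh"), ("خ", "kh"), ("ش", "sh"), ("ژ", "zh"),
]

# Flatten the rows into a letter -> Latin consonant mapping.
LETTER_TO_LATIN = {
    letter: latin
    for letters, latin in CHART_ROWS
    for letter in letters.split()
}


def syllable_row(consonant):
    """Return the six consonant+vowel syllables for one chart row."""
    return [consonant + vowel for vowel in VOWELS]


def consonant_skeleton(word):
    """Map each listed letter to its Latin consonant; skip anything else."""
    return "".join(LETTER_TO_LATIN.get(ch, "") for ch in word)
```

Because several Persian letters share one Latin consonant, this mapping cannot be reversed: `consonant_skeleton("صد")` and `consonant_skeleton("سد")` both give `sd`, which is why a transcription of this kind does not let a reader recover the original spelling.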
Comparison of proposed Persian and neighboring Latin-based scripts
Scripts compared: Pk, UP, EF, tm (Turkmen alphabet), az (Azerbaijani alphabet), tk (Turkish alphabet), ku (Kurdish alphabet). A letter shared by several adjacent scripts is listed once, in column order.

IPA | Letters in Pk, UP, EF, tm, az, tk, ku | ASCII | English |
---|---|---|---|
æ | A a, Ä ä, Ə ə, E e | a | cat |
ɒː | Á á, Â â, Ã ã, A a | aa | father |
ʃ | Sc sc, Š š, Ş ş | sh | ship |
ʒ | Zc zc, Ž ž, J j | zh | vision |
t͡ʃ | C c, Ç ç | ch | church |
d͡ʒ | J j, C c | j | judge |
ɣ | Q q, Ğ ğ | gh | none |
χ | X x, X x | kh | none |
ʔ | ' | ' | uh-oh |
One common theme is that in transcriptions of Persian, the unmarked letter a is used for the front vowel /æ/, while accented or doubled versions of the letter are used for the back vowel /ɒː/; this is opposite to the conventions in Latin alphabets of Turkic languages, although similar to some romanizations of Arabic.
Official Iranian
In 1967, the United Nations approved a Romanization system based on the official guidelines adopted by Iran. This system of rules was later published in 2000 as part of the Toponymic Guidelines for the Islamic Republic of Iran.
Desphilic
Desphilic is one of the schemes that target Persian Standard Romanization (PSR) by transliterating into the characters of an ordinary English keyboard (the Latin-1 character set). Desphilic introduced a table of letter equivalences that maps each Perso-Arabic letter to a Latin-1 character. In addition to this table, the Desphilic standard defines rules and publishes application notes on how to officially transliterate standard Persian (Parsi of the book) and all dialects into ordinary Latin-1 keyboard characters.
For writing Persian using this transliteration, there is no need to use a special kind of keyboard, special version of OS or any special software or hardware.
The Desphilic standard is a full-featured language standard which also defines rules for using and writing Persian pronouns, Persian verb conjugation, Persian tenses and other Persian grammar subjects. Desphilic also defines a Persian keyboard layout which supports the Desphilic extended character set [ ä š ö ü ž ğ ķ ] and contributes to UniPers in defining a universal Persian standard keyboard.
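As a rough illustration of how the extended character set relates to Latin-1, the sketch below checks which of the seven extended characters can be encoded in ISO 8859-1 using Python's built-in `latin-1` codec. The helper name `fits_latin1` is hypothetical, and the check says nothing about how Desphilic itself encodes these letters.

```python
import unicodedata

# The Desphilic extended character set as listed in the text.
EXTENDED = ["ä", "š", "ö", "ü", "ž", "ğ", "ķ"]


def fits_latin1(ch):
    """Return True if the character can be encoded in ISO 8859-1."""
    # Normalize first so a base letter plus combining mark becomes one code point.
    ch = unicodedata.normalize("NFC", ch)
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Split the extended set into characters inside and outside ISO 8859-1.
inside = [ch for ch in EXTENDED if fits_latin1(ch)]
outside = [ch for ch in EXTENDED if not fits_latin1(ch)]
```

Under this check, ä, ö and ü fall inside ISO 8859-1, while š, ž, ğ and ķ do not, which is consistent with these letters needing a dedicated keyboard layout rather than an unmodified English one.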
UniPers
Vowel | IPA | Vowel | IPA |
---|---|---|---|
A a | /æ/ | I i | /i/ |
 â | /ɒː/ | O o | /o/ |
E e | /e/ | U u | /u/ |

Consonant | IPA | Consonant | IPA |
---|---|---|---|
B b | /b/ | Q q | /ɣ/ |
C c | /tʃ/ | R r | /ɾ/ |
D d | /d/ | S s | /s/ |
F f | /f/ | Š š | /ʃ/ |
G g | /ɡ/ | T t | /t/ |
H h | /h/ | V v | /v/ |
J j | /dʒ/ | W w | /w/; only used in ow, xw |
K k | /k/ | X x | /χ/ |
L l | /l/ | Y y | /j/ |
M m | /m/ | Z z | /z/ |
N n | /n/ | Ž ž | /ʒ/ |
P p | /p/ | ' | /ʔ/ |

Digraph | Pronounced as | Diphthong | IPA |
---|---|---|---|
xw | x | ow | /oʊ/ |
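The letter values in the UniPers tables can be applied mechanically. The following is a minimal sketch, assuming a simple left-to-right scan that tries the two-letter units `xw` and `ow` before single letters; the function name `unipers_to_ipa` is hypothetical, and the sketch ignores stress and any spelling rules not shown in the tables.

```python
import unicodedata

# Letter values taken from the UniPers vowel, consonant and digraph tables.
UNIPERS_TO_IPA = {
    "a": "æ", "â": "ɒː", "e": "e", "i": "i", "o": "o", "u": "u",
    "b": "b", "c": "tʃ", "d": "d", "f": "f", "g": "ɡ", "h": "h",
    "j": "dʒ", "k": "k", "l": "l", "m": "m", "n": "n", "p": "p",
    "q": "ɣ", "r": "ɾ", "s": "s", "š": "ʃ", "t": "t", "v": "v",
    "w": "w", "x": "χ", "y": "j", "z": "z", "ž": "ʒ", "'": "ʔ",
    "xw": "χ", "ow": "oʊ",
}


def unipers_to_ipa(text):
    """Convert UniPers text to IPA symbols, letter by letter."""
    # Compose accented letters and fold case so table lookups succeed.
    text = unicodedata.normalize("NFC", text).lower()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in UNIPERS_TO_IPA:
            # Two-letter unit (xw or ow) takes priority over single letters.
            out.append(UNIPERS_TO_IPA[pair])
            i += 2
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(UNIPERS_TO_IPA.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

For example, `unipers_to_ipa("Pârsiye Jahâni")` yields `pɒːɾsije dʒæhɒːni`.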
UniPers, also called Pârsiye Jahâni (literally, "Universal Persian") by its creators, is a proposed Latin-based alphabet for the Persian language. The script combines the basic Latin alphabet with three modified letters (Â/â, Š/š and Ž/ž) and an apostrophe, together with a handful of common-sense rules and recommendations, in order to best represent the sounds of Persian.
UniPers also defines Persian keyboard layouts to ease user access to the defined characters.
The creators of the system have mentioned that they have the following criteria for their design of the system: serving the Persian language and no other, only using the Latin script, simplicity and ease of use by using a minimal number of diacritical letters and rules, one-to-one correspondence between the sound values of the language and the letters in the system (which may be relaxed in case of š and ž), and conformance with standard pronunciation of the language.
A more recent Latin-based alphabet, Persá, uses similar elements and introduces new characters, with a purpose and goal similar to those of UniPers.
Introduction
The alphabetic principle makes reading and writing easy, allowing the reader to pronounce words from their spelling, and the writer to spell them from their sounds. The UniPers alphabet and its rules are founded on this fundamental principle. The statements of purpose of the UniPers script are given below:
Purpose
- To provide the Persian language with a standard phonemic Latin-based script that is clear, simple, and consistent.
- To make reading and writing of the Persian language readily accessible to most users, regardless of their national origin and/or education level.
Axioms
Here are the 5 axioms of the UniPers script:
- The script must serve the Persian language and not the other way around. No other language should be served.
- The alphabet and numbers must be exclusively Latin with additional common diacritical letters and symbols if necessary.
- Simplicity and ease of use. The script must be used for the broadest possible transcription of the Persian sounds with the absolute minimum number of diacritical letters, symbols, and rules.
- Each letter of the alphabet must have a unique basic Persian sound value. Every basic sound of the Persian language must be exclusively represented by a unique letter of the alphabet. No digraphs, ligatures, or redundant letters are allowed.
- The spelling rules and conventions must conform, and in no way be in conflict with, the standard pronunciations and flow of the Persian language.
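The fourth axiom (one letter per sound, no digraphs or redundant letters) can be tested mechanically for any letter-to-sound table. The sketch below is illustrative only: the function name `axiom4_violations` and the two small sample tables are assumptions, with the sample values drawn from the UniPers letter values and from the common digraph spellings sh, zh and kh.

```python
from collections import defaultdict
import unicodedata


def axiom4_violations(scheme):
    """Return (multi-character letters, sounds written by more than one letter)."""
    multi_char = []
    letters_by_sound = defaultdict(list)
    for letter, sound in scheme.items():
        # Compose accents so that a letter like š counts as one character.
        letter = unicodedata.normalize("NFC", letter)
        if len(letter) > 1:
            multi_char.append(letter)
        letters_by_sound[sound].append(letter)
    shared = {
        sound: sorted(letters)
        for sound, letters in letters_by_sound.items()
        if len(letters) > 1
    }
    return sorted(multi_char), shared


# Small sample tables (a subset of the sounds, for illustration only).
UNIPERS_SAMPLE = {"s": "s", "š": "ʃ", "z": "z", "ž": "ʒ", "x": "χ"}
DIGRAPH_SAMPLE = {"s": "s", "sh": "ʃ", "z": "z", "zh": "ʒ", "kh": "χ"}
```

On these samples the single-letter table reports no violations, while the digraph table reports `kh`, `sh` and `zh` as multi-character letters.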
Baha'i Persian romanization
Bahá'ís use a particular and fairly precise system standardized by Shoghi Effendi, which he initiated in a general letter on March 12, 1923. The Bahá'í transliteration scheme was based on a standard adopted by the Tenth International Congress of Orientalists which took place in Geneva in September 1894. Shoghi Effendi changed some details of the Congress's system, most notably in the use of digraphs in certain cases (e.g. sh instead of š), and in incorporating the solar letters when writing the definite article al- (Arabic: ال) according to pronunciation (e.g. ar-Rahim, as-Saddiq, instead of al-Rahim, al-Saddiq).
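The treatment of the definite article before solar letters can be sketched in a few lines. This is a simplified illustration, not the Bahá'í system itself: it assumes plain ASCII input without diacritics or underlined digraphs, and the function name `with_article` is hypothetical.

```python
# Solar (sun) consonants in a plain ASCII romanization. The digraphs are
# checked first so that "sh" assimilates as "ash-" rather than "as-".
SUN_DIGRAPHS = ("sh", "th", "dh")
SUN_LETTERS = set("tdrzsln")


def with_article(word):
    """Prefix the article al-, assimilating the l before a solar letter."""
    lower = word.lower()
    for digraph in SUN_DIGRAPHS:
        if lower.startswith(digraph):
            return f"a{digraph}-{word}"
    if lower[:1] in SUN_LETTERS:
        return f"a{lower[0]}-{word}"
    # Any other initial letter keeps the article unchanged.
    return f"al-{word}"
```

With this sketch, `with_article("Rahim")` gives `ar-Rahim` and `with_article("Saddiq")` gives `as-Saddiq`, matching the examples above, while a word such as `Qamar` keeps the plain form `al-Qamar`.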
This transliteration differs significantly from UniPers, especially in vowel representation. For example, what is "Tehran" in UniPers is presented in many Bahá'í translations as "Tihran". The name of the Bahá'í women's rights activist and martyr "Táhirih" would be pronounced in Persian according to the UniPers-style spelling "Tahereh", but it is never printed as "Tahereh" in Bahá'í books. The use of "i" in "Táhirih" illustrates the Bahá'í system's emphasis on literal correspondence with the Persian script, rather than the pronunciation of the modern national language of Iran. A detailed introduction to the Bahá'í Persian romanization can usually be found at the back of a Bahá'í scripture.
ASCII Internet romanizations
It is common to write Persian using only English letters, especially when commenting on weblogs or sending SMS messages from cellphones. One form of such writing is as follows:
A a | AA aa | B b | CH ch | D d | E e | F f | G g | H h | I i |
/æ/ | /ɒː/ | /b/ | /tʃ/ | /d/ | /e/ | /f/ | /ɡ/ | /h/ | /i/ |
J j | K k | L l | M m | N n | O o | P p | GH gh | R r | S s |
/dʒ/ | /k/ | /l/ | /m/ | /n/ | /o/ | /p/ | /ɣ/ | /ɾ/ | /s/ |
SH sh | T t | U u | V v | W w | KH kh | Y y | Z z | ZH zh | ' |
/ʃ/ | /t/ | /u/ | /v/ | /w/ | /χ/ | /j/ | /z/ | /ʒ/ | /ʔ/ |
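As a rough sketch of how this table can be applied, the snippet below converts a word written in this ASCII form into the IPA values listed above. The function name `chat_to_ipa` is hypothetical, and preferring the two-letter combinations (aa, ch, gh, sh, kh, zh) over single letters is an assumption: the scheme itself is ambiguous, since "sh" could also stand for s followed by h.

```python
# Letter-to-IPA values copied from the table above.
CHAT_TO_IPA = {
    "aa": "ɒː", "ch": "tʃ", "gh": "ɣ", "sh": "ʃ", "kh": "χ", "zh": "ʒ",
    "a": "æ", "b": "b", "d": "d", "e": "e", "f": "f", "g": "ɡ", "h": "h",
    "i": "i", "j": "dʒ", "k": "k", "l": "l", "m": "m", "n": "n", "o": "o",
    "p": "p", "r": "ɾ", "s": "s", "t": "t", "u": "u", "v": "v", "w": "w",
    "y": "j", "z": "z", "'": "ʔ",
}


def chat_to_ipa(word):
    """Map an ASCII-romanized word to IPA, trying two-letter units first."""
    text = word.lower()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in CHAT_TO_IPA:
            out.append(CHAT_TO_IPA[pair])     # digraph such as "sh" or "aa"
            i += 2
        elif text[i] in CHAT_TO_IPA:
            out.append(CHAT_TO_IPA[text[i]])  # single letter
            i += 1
        else:
            out.append(text[i])               # unknown character: keep as is
            i += 1
    return "".join(out)


for word in ("shab", "khaane", "che"):
    print(word, "->", chat_to_ipa(word))
```

Because informal writers do not apply the table consistently (many write a single "a" for both /æ/ and /ɒː/), output of this kind is only an approximation of the intended pronunciation.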
Tajik Latin alphabet
The Tajik language, or Tajik Persian, is a variety of the Persian language. It was written in the Tajik SSR in a standardized Latin script from 1926 until the late 1930s, when the script was officially changed to Cyrillic. However, Tajik phonology differs slightly from that of Persian in Iran; see Persian phonology#Historical shifts.
A a | B ʙ | C c | Ç ç | D d | E e | F f | G g | Ƣ ƣ | H h | I i | Ī ī |
/a/ | /b/ | /tʃ/ | /dʒ/ | /d/ | /e/ | /f/ | /ɡ/ | /ʁ/ | /h/ | /i/ | /ˈi/ |
J j | K k | L l | M m | N n | O o | P p | Q q | R r | S s | Ş ş | T t |
/j/ | /k/ | /l/ | /m/ | /n/ | /o/ | /p/ | /q/ | /ɾ/ | /s/ | /ʃ/ | /t/ |
U u | Ū ū | V v | X x | Z z | Ƶ ƶ | ' |
/u/ | /ɵ/ | /v/ | /χ/ | /z/ | /ʒ/ | /ʔ/ |
Turco-Persian Romanization
Numeral (Western Arabic) | Cardinal: Persian | Cardinal: Turkish | Ordinal: Persian | Ordinal: Turkish |
---|---|---|---|---|
0 | Sefr | Sefr | Seferom | |
1 | Yek | Yek | nokhost | Evvel, Yekom |
2 | Do | Dü | dovvom | Devvom |
3 | Se | Se | sevvom | Sevvom |
4 | Cāhār | Çehar | çehārom | Çeharom |
5 | Panj | Penc | pancom | Pencom |
6 | Şeş | Şeş | şeşom | Şeşom |
7 | Haft | Heft | haftom | Heftom |
8 | Haşt | Heşt | haştom | Heştom |
9 | Noh | Noh | nohom | Nohom |
10 | Daḥ | De | dāhom | Dehom |
11 | Yāzdah | Yazde | yāzdahom | Yazdehom |
12 | Davāzdaḥ | Devazde | davāzdahom | Devazdehom |
13 | Sizdah | Sizde | sīzdahom | Sizdehom |
14 | Cāḥārdah | Çeharde | çahārdahom | Çehardehom |
15 | Pānzdah, Punzda | Panzde | pānzdahom, punzdahom | Panzdehom |
16 | Şānzdah | Şanzde | şānzdehom, şunzdehom | Şanzdehom |
17 | Hefdah | Hifde | hefdahom | Hifdehom |
18 | Hijdah | Hicde | hijdahom | Hicdehom |
19 | Nuzdah | Nuzde | nūzdahom | Nuzdehom |
20 | Bist | Bist | bīstom | Bistom |
30 | Si | Si | sīyom | Siyom |
40 | Cehel | Çehel | çehelom | Çehelom |
50 | Pānjah | Pencah | pancāhom | Pencahom |
60 | Şast | Şest | şastom | Şestom |
70 | Haftād | Heftad | haftādom | Heftadom |
80 | Haştād | Heştad | haştādom | Heştadom |
90 | Navad | Neved | navadom | Nevedom |
100 | Sad | Sed | sadom | Sedom |
200 | Devist | Divist | devīstom | Divistom |
300 | Sisad | Sised | sīsadom | Sisedom |
400 | Cāhārsad | Çeharsed | çahār sadom | Çehar sedom |
500 | Pān sad, Pun sad | Pan sed | pānsadom, punsadom | Pansedom |
600 | Şeş sad | Şeş sed | şeş sadom | Şeş sedom |
700 | Haft sad | Heft sed | haft sadom | Heft sedom |
800 | ḥaşt sad | Heşt sed | haşt sadom | Heşt sedom |
900 | Noh sad | Noh sed | noh sadom | Noh sedom |
1000 | Hezār | Hezar | hazārom | Hezarom |
"Turco-Persian", among its many definitions, can refer to the code-switching
to Persian expressions, Persian literary mannerisms, and heavy use of Persian vocabulary in Anatolian Turkish or Azerbaijani Turkish, especially Ottoman Turkish, which has a long history of drawing on classical Persian literature. Even though Modern Standard Turkish is ostensibly purer, it nonetheless retains many Persian mannerisms and much Persian vocabulary from Ottoman Turkish, and has maintained its own way of transcribing Persian words, which is "Turkified" in pronunciation and quite removed from the modern standard pronunciation of Persian.
The following examples, taken from the Turkish Wikipedia (tr:Farsça Sözcükler), illustrate the differences in spelling between standard Persian transliterated with the Turkish Latin alphabet and Turco-Persian orthography in the same alphabet:
The following lines of Persian poems are taken from the Azeri Wikipedia (az:Cahanşah Həqiqi, az:Səid Səlmasi, az:Məhəmməd Hadi, az:Əbül-üla Gəncəvi) and are given in the Azeri Turco-Persian transliteration:
1.
- Vüsalını diləram kam ilən ze fəzli-ilah
- Məni-şikəstəyə kami-vüsal beylə gərək.
- Ey xətin səb'ül-məsani, vey ləbin mai-təhur,
- Vey cəmalın pərtövindən sərbəsər aləmdə nur.
2.
- Mən on zəmini guhərbari-paki İranəm,
- Bə hər bəlayi-cəhalət nişəgəh əst təni mən...
3.
- Məkatib cilvəgahi-tələəti-fəyyazi-qüdrətdir,
- Məkatib pərtövü-ənvari-şəmsi-sübhi-vəhdətdir
...
- Ey dəsti-sitəmkar, ayə pənceyi-mənhus!..
4.
- Mara şəst saləst kəz xake-İran
- Bovəd şanzdəh ta be Şirvan fetadəm.
See also
- Fingilish (Persian chat alphabet)
- Persian alphabet
- Persian phonology
- Romanization
- Transliteration
- Romanization of Arabic
- Desphilic
External links
- Comparison of DMG, UN, ALA-LC, BGN/PCGN, EI, ISO 233-3 transliterations
- UN Romanization of Persian for Geographical Names
- Library of Congress/American Library Association Romanization of Persian (http://www.loc.gov/catdir/cpso/romanization/persian.pdf)
- Cataloguing Issues and Problems
- UniPers homepage
- Eurofarsi
- eiktub: web-based Arabic transliteration pad, with support for Persian characters
- Desphilic Persian Standard Romanization home