Arabic Unicode
As of Unicode 6.0, the following blocks encode Arabic characters:
- Arabic (0600–06FF, 224 characters)
- Arabic Supplement (0750–077F, 48 characters)
- Arabic Presentation Forms-A (FB50–FDFF, 608 characters)
- Arabic Presentation Forms-B (FE70–FEFF, 140 characters)
- Rumi Numeral Symbols (10E60–10E7F, 31 characters)
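The block list above can be turned into a simple range lookup. The following is a minimal sketch in Python; the table `ARABIC_BLOCKS` and the function `arabic_block` are hypothetical names introduced for illustration, and the ranges are copied from the list rather than read from the Unicode data files.

```python
# Block names and inclusive code point ranges, copied from the list above.
ARABIC_BLOCKS = [
    ("Arabic", 0x0600, 0x06FF),
    ("Arabic Supplement", 0x0750, 0x077F),
    ("Arabic Presentation Forms-A", 0xFB50, 0xFDFF),
    ("Arabic Presentation Forms-B", 0xFE70, 0xFEFF),
    ("Rumi Numeral Symbols", 0x10E60, 0x10E7F),
]


def arabic_block(ch):
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for name, first, last in ARABIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None


if __name__ == "__main__":
    for ch in ("\u0628", "\u0750", "\uFE8F", "A"):
        print(f"U+{ord(ch):04X}", arabic_block(ch))
```

Because a block is a contiguous range that may contain unassigned code points, this lookup only says which block a code point falls in, not whether a character is actually assigned there.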
The basic Arabic range encodes the standard letters, the most common diacritics and the Arabic-Indic digits, but not contextual forms; U+0621–U+0652 is based directly on ISO 8859-6.
The Arabic Supplement range encodes letter variants used mostly for writing African (non-Arabic) languages.
The Arabic Presentation Forms-A range encodes contextual forms and ligatures of the letter variants needed for Persian, Urdu, Sindhi and Central Asian languages.
The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics and further contextual letter forms.
The presentation forms exist only for compatibility with older standards and are no longer needed for encoding text.
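One way to see that the presentation forms are compatibility characters is to apply compatibility normalization, which maps them back to the basic letters. The sketch below uses NFKC from Python's standard `unicodedata` module; the helper name `to_base_letters` is an assumption made for this example.

```python
import unicodedata


def to_base_letters(text):
    """Fold presentation forms to basic Arabic letters using NFKC."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # BEH initial form, ALEF final form, BEH isolated form (Presentation Forms-B)
    legacy = "\uFE91\uFE8E\uFE8F"
    folded = to_base_letters(legacy)
    print([f"U+{ord(c):04X}" for c in folded])
```

NFKC also changes characters unrelated to Arabic (for example, full-width Latin letters), so it is best applied only where that wider folding is acceptable.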
Contextual forms
A demonstration of the basic alphabet used in Modern Standard Arabic, giving each letter's general code point followed by the code points of its contextual forms:

General Unicode | Isolated | End | Middle | Beginning
---|---|---|---|---
0627 | FE8D | FE8E | |
0628 | FE8F | FE90 | FE92 | FE91
062A | FE95 | FE96 | FE98 | FE97
062B | FE99 | FE9A | FE9C | FE9B
062C | FE9D | FE9E | FEA0 | FE9F
062D | FEA1 | FEA2 | FEA4 | FEA3
062E | FEA5 | FEA6 | FEA8 | FEA7
062F | FEA9 | FEAA | |
0630 | FEAB | FEAC | |
0631 | FEAD | FEAE | |
0632 | FEAF | FEB0 | |
0633 | FEB1 | FEB2 | FEB4 | FEB3
0634 | FEB5 | FEB6 | FEB8 | FEB7
0635 | FEB9 | FEBA | FEBC | FEBB
0636 | FEBD | FEBE | FEC0 | FEBF
0637 | FEC1 | FEC2 | FEC4 | FEC3
0638 | FEC5 | FEC6 | FEC8 | FEC7
0639 | FEC9 | FECA | FECC | FECB
063A | FECD | FECE | FED0 | FECF
0641 | FED1 | FED2 | FED4 | FED3
0642 | FED5 | FED6 | FED8 | FED7
0643 | FED9 | FEDA | FEDC | FEDB
0644 | FEDD | FEDE | FEE0 | FEDF
0645 | FEE1 | FEE2 | FEE4 | FEE3
0646 | FEE5 | FEE6 | FEE8 | FEE7
0647 | FEE9 | FEEA | FEEC | FEEB
0648 | FEED | FEEE | |
064A | FEF1 | FEF2 | FEF4 | FEF3
0622 | FE81 | FE82 | |
0629 | FE93 | FE94 | - | -
0649 | FEEF | FEF0 | - | -
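The rows of this table can be recovered from the Unicode Character Database, where each presentation form carries a compatibility decomposition tagged `<isolated>`, `<final>`, `<medial>` or `<initial>` (the End, Middle and Beginning columns correspond to final, medial and initial). The sketch below scans the Presentation Forms-B range with Python's `unicodedata.decomposition`; the function name `contextual_forms` is hypothetical.

```python
import unicodedata


def contextual_forms(base_hex):
    """Map form tags to Forms-B code points that decompose to one base letter.

    base_hex is the base letter's code point as four uppercase hex digits,
    for example "0628".
    """
    forms = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep single-letter decompositions such as "<initial> 0628".
        if len(parts) == 2 and parts[0].startswith("<") and parts[1] == base_hex:
            forms[parts[0].strip("<>")] = f"{cp:04X}"
    return forms


if __name__ == "__main__":
    print("0628", contextual_forms("0628"))
    print("0627", contextual_forms("0627"))
```

Letters that do not join to the following letter, such as U+0627, yield only the isolated and final entries, which matches the empty cells in the table.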
Punctuation and ornaments
Of the characters below, only the Arabic comma is used in everyday Arabic typing; it can also be replaced by the ordinary comma used in Latin-based scripts, U+002C.
- U+060C ،: "ARABIC COMMA"
- U+060D ؍: "ARABIC DATE SEPARATOR"
- U+060E ؎: "ARABIC POETIC VERSE SIGN"
- U+060F ؏: "ARABIC SIGN MISRA"
- U+066D ٭: "ARABIC FIVE POINTED STAR"
- U+06DD ۝: "ARABIC END OF AYAH"
- U+06DE ۞: "ARABIC START OF RUB EL HIZB"
- U+06E9 ۩: "ARABIC PLACE OF SAJDAH"
- U+FD3E ﴾: "ORNATE LEFT PARENTHESIS"
- U+FD3F ﴿: "ORNATE RIGHT PARENTHESIS"
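Since the Latin comma is sometimes typed in place of the Arabic comma, text can be made consistent with a plain character substitution. This is a minimal sketch with hypothetical names (`ARABIC_COMMA`, `LATIN_COMMA`, `use_arabic_comma`); it replaces every U+002C, so it assumes the input contains only Arabic prose and no numbers or embedded Latin text that need the Latin comma.

```python
import unicodedata

ARABIC_COMMA = "\u060C"
LATIN_COMMA = "\u002C"


def use_arabic_comma(text):
    """Replace every Latin comma in text with the Arabic comma."""
    return text.replace(LATIN_COMMA, ARABIC_COMMA)


if __name__ == "__main__":
    sample = "\u0628\u0627\u0628, \u0628\u0627\u0628"
    converted = use_arabic_comma(sample)
    # Print code points rather than glyphs to avoid bidirectional display issues.
    print([f"U+{ord(c):04X}" for c in converted])
    print(unicodedata.name(ARABIC_COMMA))
```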
Word ligatures
Arabic Presentation Forms-A defines a few characters as "word ligatures" for terms frequently used in formulaic expressions in Arabic. They are rarely used outside professional liturgical typesetting; likewise, the word Rial is normally written out in full rather than with the ligature. The names below omit the common prefix "ARABIC LIGATURE".
- U+FDF0: "SALLA USED AS KORANIC STOP SIGN ISOLATED FORM"
- U+FDF1: "QALA USED AS KORANIC STOP SIGN ISOLATED FORM"
- U+FDF2: "ALLAH ISOLATED FORM"
- U+FDF3: "AKBAR ISOLATED FORM"
- U+FDF4: "MOHAMMAD ISOLATED FORM"
- U+FDF5: "SALAM ISOLATED FORM", "peace be upon him"
- U+FDF6: "RASOUL ISOLATED FORM"
- U+FDF7: "ALAYHE ISOLATED FORM"
- U+FDF8: "WASALLAM ISOLATED FORM"
- U+FDF9: "SALLA ISOLATED FORM"
- U+FDFA: "SALLALLAHOU ALAYHE WASALLAM", "peace be upon him"
- U+FDFB: "JALLAJALALOUHOU"
- U+FDFC: the Rial currency sign
- U+FDFD: the Basmala
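The full character names and the letter sequences behind these ligatures can be inspected programmatically. The sketch below relies on Python's standard `unicodedata` module; the helper `describe` is a hypothetical name, and the expansion shown is the NFKC compatibility mapping rather than a statement about how the words should be typeset.

```python
import unicodedata


def describe(cp):
    """Return (character name, NFKC expansion as a list of hex code points)."""
    ch = chr(cp)
    expansion = unicodedata.normalize("NFKC", ch)
    return unicodedata.name(ch), [f"{ord(c):04X}" for c in expansion]


if __name__ == "__main__":
    for cp in range(0xFDF0, 0xFDFE):  # U+FDF0 through U+FDFD
        name, expansion = describe(cp)
        print(f"U+{cp:04X}", name, " ".join(expansion))
```

Printing code points instead of the expanded text keeps the output readable in terminals that do not handle right-to-left scripts well.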