Western Latin character sets (computing)

This article compares several binary representations of character sets for common Western European languages. These encodings were designed to represent Italian, Spanish, Portuguese, French, German, Dutch, English, Danish, Swedish, Norwegian, and Icelandic, which use the Latin alphabet, a few additional letters, letters with precomposed diacritics, some punctuation, and various symbols (including some Greek letters). Although these languages are called "Western European", many of them are spoken all over the world. These character sets also happen to support many other languages, such as Malay, Swahili, and Classical Latin.

Summary

The ISO-8859 series of 8-bit character sets encodes all Latin character sets used in Europe, although the same code values are assigned to different characters in different parts of the series, which caused some difficulty. The arrival of Unicode, with a unique code point for every character, resolved these issues.
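
To make the ambiguity concrete, here is a minimal sketch using Python's built-in codecs (Python is an assumption of this example; the article is not tied to any language). It decodes one byte value under two parts of ISO 8859 and prints the distinct Unicode code point of each result. The byte 0xF5 and the choice of ISO-8859-2 as the second part are purely illustrative.

```python
# Decode the same byte under two parts of ISO 8859.
# The byte value 0xF5 and the two parts chosen are illustrative assumptions.
raw = bytes([0xF5])

def decode_as(data: bytes, encoding: str) -> str:
    """Interpret `data` according to the named single-byte encoding."""
    return data.decode(encoding)

latin1 = decode_as(raw, "iso8859-1")  # Western European part
latin2 = decode_as(raw, "iso8859-2")  # Central European part

for name, ch in (("ISO-8859-1", latin1), ("ISO-8859-2", latin2)):
    # Each interpretation has its own Unicode code point.
    print(f"{name}: byte F5 -> {ch!r} (U+{ord(ch):04X})")
```

The byte alone does not say which character was meant; the reader has to know which part of the series the writer used.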

ISO/IEC 8859-1 or Latin-1 is the most used and also defines the first 256 codes in Unicode.
ISO/IEC 8859-15 or Latin-9 modifies ISO-8859-1 to support Finnish and French more completely and to add the euro sign.
In terms of printable characters, Windows-1252 has everything ISO-8859-1 and ISO-8859-15 have and more.
IBM CP437, being intended for English only, has comparatively few accented letters but has far more graphics characters than the others, as well as some Greek characters that are useful as technical symbols.
IBM CP850 has all the printable characters that ISO-8859-1 has (albeit arranged differently) and still manages to have enough graphics characters to build a usable text-mode user interface.
IBM CP858 differs from CP850 in only one character: the rarely used dotless i (ı) was replaced by the euro sign (€).
IBM code pages 037, 500, and 1047 are EBCDIC encodings that include all of the ISO-8859-1 characters.
The Mac OS Roman character set (often referred to as MacRoman and known by the IANA as simply MACINTOSH) has most, but not all, of the same characters as ISO-8859-1, but in a very different arrangement; it also adds many technical and mathematical characters and more diacritics. Older Macintosh web browsers were known to corrupt the few characters that were in ISO-8859-1 but not in their native Macintosh character set when editing text from Web sites. Conversely, in Web material prepared on an older Macintosh, many characters were displayed incorrectly when read by other operating systems.
The euro sign post-dates these ISO-8859 specifications; conflicting ways to retrofit it led to significant difficulty until Unicode became more generally adopted.
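
The repertoire claims in this summary can be checked mechanically. The sketch below is one way to do it, assuming Python and the codec tables shipped with its standard library (which may reflect later revisions of some code pages than the ones described here). The helper name `repertoire` is hypothetical. Code page 1047 is left out because the sketch does not assume a codec for it is available.

```python
import unicodedata

def repertoire(encoding: str) -> set[str]:
    """Return the non-control characters that `encoding` can represent."""
    chars = set()
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # this byte is unassigned in the encoding
        if unicodedata.category(ch) != "Cc":  # skip C0/C1 control codes
            chars.add(ch)
    return chars

latin1 = repertoire("iso8859-1")
latin9 = repertoire("iso8859-15")

# How many printable Latin-1 characters does each encoding lack?
for codec in ("cp1252", "cp437", "cp850", "cp037", "cp500", "mac_roman"):
    missing = latin1 - repertoire(codec)
    print(f"{codec}: {len(missing)} Latin-1 characters missing")

# The characters exchanged between Latin-1 and Latin-9.
print("Latin-1 only:", "".join(sorted(latin1 - latin9)))
print("Latin-9 only:", "".join(sorted(latin9 - latin1)))
```

Comparing sets of decoded characters rather than byte values is deliberate: CP850, the EBCDIC code pages, and Mac OS Roman arrange the same characters at entirely different positions.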

History

The earlier seven-bit U.S. ASCII encoding has characters sufficient to properly represent only a few languages, such as US English, Latin, and Swahili. It is missing some letters and letter-diacritic combinations used in other Latin-alphabet languages. However, since there was no other choice on most U.S.-supplied computer platforms, ASCII was unavoidable in most of the non-English-speaking world (seven-bit encoding was necessitated by the limitations of early computing networks). The ISO 646 group of encodings replaced some of the symbols in ASCII with local characters, but space was very limited, and some of the replaced symbols were quite common in contexts such as programming languages.

Although seven-bit communication was the norm, most computers internally used eight-bit bytes, and they mostly put some form of characters in the 128 higher byte positions. In the early days most of these were system specific, but gradually a few standards were settled on.

As storage and memory costs have fallen, the problems caused by a given eight-bit code having multiple meanings (there are ten ISO 8859 Latin code sets alone) are no longer justified by the space saved. All major operating systems have moved to Unicode as their main internal representation. However, at least on Windows, many applications continue to use the non-Unicode versions of the API calls.

The euro sign

The coming of the euro introduced significant pressure to support the euro sign (€), and most character sets had to be adapted in some way.

MacRoman simply replaced the generic currency sign (¤). This caused significant difficulty because organisations had found other uses for it, such as the company logo.
ISO introduced ISO 8859-15, which replaced the generic currency sign with the euro sign as well as making some other replacements of symbols with letters with diacritics.
Windows-1252 simply placed the euro sign in a gap (position 80 hexadecimal) in the range that ISO-8859-1 reserves for the C1 control codes.
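
The different retrofits described above can be observed directly. The following is a small sketch, assuming Python's standard codecs; the helper name `euro_byte` is hypothetical, and the codec tables reflect whichever revision of each code page Python ships.

```python
EURO = "\u20ac"  # the euro sign

def euro_byte(encoding: str):
    """Return the euro sign's byte in `encoding` as hex, or None if absent."""
    try:
        return EURO.encode(encoding).hex().upper()
    except UnicodeEncodeError:
        return None  # the encoding has no euro sign

for codec in ("iso8859-1", "iso8859-15", "cp1252", "cp850", "cp858", "mac_roman"):
    print(f"{codec:12} {euro_byte(codec) or 'not encodable'}")
```

Each encoding that has the euro sign puts it at a different byte value, which is why text exchanged without a declared encoding so often showed the wrong symbol.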

Comparison table

Code points up to U+007F are not shown in this table, as they are mapped identically in all character sets listed here. The ASCII standard defines the mapping of these first 128 characters (codes 0 to 127).

The table is arranged by Unicode code point. Character sets are referred to here by their IANA names in upper case.
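
A row of such a table can be regenerated from codec data. The sketch below assumes Python's standard codecs; the mapping from IANA names to Python codec names and the helper names `table_row` and `format_row` are assumptions of this example. Because Python's tables may reflect later revisions of a code page, a few generated cells can differ from a historical table.

```python
# Table column (IANA name) -> Python codec name. This mapping is an
# assumption of the sketch, not something defined by the table itself.
COLUMNS = {
    "ISO-8859-1": "iso8859-1",
    "ISO-8859-15": "iso8859-15",
    "WINDOWS-1252": "cp1252",
    "IBM437": "cp437",
    "IBM850": "cp850",
    "MACINTOSH": "mac_roman",
}

def table_row(ch: str) -> list[str]:
    """Return the hex byte of `ch` in each column, or '' where it is absent."""
    cells = []
    for codec in COLUMNS.values():
        try:
            cells.append(ch.encode(codec).hex().upper())
        except UnicodeEncodeError:
            cells.append("")  # character not in this character set
    return cells

def format_row(ch: str) -> str:
    """Format one tab-separated row: character, code point, then the cells."""
    return "\t".join([ch, f"U+{ord(ch):04X}", *table_row(ch)])

print("\t".join(["Character", "Code point", *COLUMNS]))
for ch in "\u00e9\u00c0\u00f1":  # sample characters: é, À, ñ
    print(format_row(ch))
```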

Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
NBSP	U+00A0	A0	A0	A0	FF	FF	CA
¡	U+00A1	A1	A1	A1	AD	AD	C1
¢	U+00A2	A2	A2	A2	9B	BD	A2
£	U+00A3	A3	A3	A3	9C	9C	A3
¤	U+00A4	A4		A4		CF
¥	U+00A5	A5	A5	A5	9D	BE	B4
¦	U+00A6	A6		A6		DD
§	U+00A7	A7	A7	A7		F5	A4
¨	U+00A8	A8		A8		F9	AC
©	U+00A9	A9	A9	A9		B8	A9
ª	U+00AA	AA	AA	AA	A6	A6	BB
«	U+00AB	AB	AB	AB	AE	AE	C7
¬	U+00AC	AC	AC	AC	AA	AA	C2
SHY	U+00AD	AD	AD	AD		F0
®	U+00AE	AE	AE	AE		A9	A8
¯	U+00AF	AF	AF	AF		EE	F8
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
°	U+00B0	B0	B0	B0	F8	F8	A1
±	U+00B1	B1	B1	B1	F1	F1	B1
²	U+00B2	B2	B2	B2	FD	FD
³	U+00B3	B3	B3	B3		FC
´	U+00B4	B4		B4		EF	AB
µ	U+00B5	B5	B5	B5	E6	E6	B5
¶	U+00B6	B6	B6	B6		F4	A6
·	U+00B7	B7	B7	B7	FA	FA	E1
¸	U+00B8	B8		B8		F7	FC
¹	U+00B9	B9	B9	B9		FB
º	U+00BA	BA	BA	BA	A7	A7	BC
»	U+00BB	BB	BB	BB	AF	AF	C8
¼	U+00BC	BC		BC	AC	AC
½	U+00BD	BD		BD	AB	AB
¾	U+00BE	BE		BE		F3
¿	U+00BF	BF	BF	BF	A8	A8	C0
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
À	U+00C0	C0	C0	C0		B7	CB
Á	U+00C1	C1	C1	C1		B5	E7
Â	U+00C2	C2	C2	C2		B6	E5
Ã	U+00C3	C3	C3	C3		C7	CC
Ä	U+00C4	C4	C4	C4	8E	8E	80
Å	U+00C5	C5	C5	C5	8F	8F	81
Æ	U+00C6	C6	C6	C6	92	92	AE
Ç	U+00C7	C7	C7	C7	80	80	82
È	U+00C8	C8	C8	C8		D4	E9
É	U+00C9	C9	C9	C9	90	90	83
Ê	U+00CA	CA	CA	CA		D2	E6
Ë	U+00CB	CB	CB	CB		D3	E8
Ì	U+00CC	CC	CC	CC		DE	ED
Í	U+00CD	CD	CD	CD		D6	EA
Î	U+00CE	CE	CE	CE		D7	EB
Ï	U+00CF	CF	CF	CF		D8	EC
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
Ð	U+00D0	D0	D0	D0		D1
Ñ	U+00D1	D1	D1	D1	A5	A5	84
Ò	U+00D2	D2	D2	D2		E3	F1
Ó	U+00D3	D3	D3	D3		E0	EE
Ô	U+00D4	D4	D4	D4		E2	EF
Õ	U+00D5	D5	D5	D5		E5	CD
Ö	U+00D6	D6	D6	D6	99	99	85
×	U+00D7	D7	D7	D7		9E
Ø	U+00D8	D8	D8	D8		9D	AF
Ù	U+00D9	D9	D9	D9		EB	F4
Ú	U+00DA	DA	DA	DA		E9	F2
Û	U+00DB	DB	DB	DB		EA	F3
Ü	U+00DC	DC	DC	DC	9A	9A	86
Ý	U+00DD	DD	DD	DD		ED
Þ	U+00DE	DE	DE	DE		E8
ß	U+00DF	DF	DF	DF	E1	E1	A7
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
à	U+00E0	E0	E0	E0	85	85	88
á	U+00E1	E1	E1	E1	A0	A0	87
â	U+00E2	E2	E2	E2	83	83	89
ã	U+00E3	E3	E3	E3		C6	8B
ä	U+00E4	E4	E4	E4	84	84	8A
å	U+00E5	E5	E5	E5	86	86	8C
æ	U+00E6	E6	E6	E6	91	91	BE
ç	U+00E7	E7	E7	E7	87	87	8D
è	U+00E8	E8	E8	E8	8A	8A	8F
é	U+00E9	E9	E9	E9	82	82	8E
ê	U+00EA	EA	EA	EA	88	88	90
ë	U+00EB	EB	EB	EB	89	89	91
ì	U+00EC	EC	EC	EC	8D	8D	93
í	U+00ED	ED	ED	ED	A1	A1	92
î	U+00EE	EE	EE	EE	8C	8C	94
ï	U+00EF	EF	EF	EF	8B	8B	95
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
ð	U+00F0	F0	F0	F0		D0
ñ	U+00F1	F1	F1	F1	A4	A4	96
ò	U+00F2	F2	F2	F2	95	95	98
ó	U+00F3	F3	F3	F3	A2	A2	97
ô	U+00F4	F4	F4	F4	93	93	99
õ	U+00F5	F5	F5	F5		E4	9B
ö	U+00F6	F6	F6	F6	94	94	9A
÷	U+00F7	F7	F7	F7	F6	F6	D6
ø	U+00F8	F8	F8	F8		9B	BF
ù	U+00F9	F9	F9	F9	97	97	9D
ú	U+00FA	FA	FA	FA	A3	A3	9C
û	U+00FB	FB	FB	FB	96	96	9E
ü	U+00FC	FC	FC	FC	81	81	9F
ý	U+00FD	FD	FD	FD		EC
þ	U+00FE	FE	FE	FE		E7
ÿ	U+00FF	FF	FF	FF	98	98	D8
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
ı	U+0131					D5	F5
Œ	U+0152		BC	8C			CE
œ	U+0153		BD	9C			CF
Š	U+0160		A6	8A
š	U+0161		A8	9A
Ÿ	U+0178		BE	9F			D9
Ž	U+017D		B4	8E
ž	U+017E		B8	9E
ƒ	U+0192			83	9F	9F	C4
ˆ	U+02C6			88			F6
ˇ	U+02C7						FF
˘	U+02D8						F9
˙	U+02D9						FA
˚	U+02DA						FB
˛	U+02DB						FE
˜	U+02DC			98			F7
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
˝	U+02DD						FD
Γ	U+0393				E2
Θ	U+0398				E9
Σ	U+03A3				E4
Φ	U+03A6				E8
Ω	U+03A9				EA		BD
α	U+03B1				E0
δ	U+03B4				EB
ε	U+03B5				EE
π	U+03C0				E3		B9
σ	U+03C3				E5
τ	U+03C4				E7
φ	U+03C6				ED
–	U+2013			96			D0
—	U+2014			97			D1
‗	U+2017					F2
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
‘	U+2018			91			D4
’	U+2019			92			D5
‚	U+201A			82			E2
“	U+201C			93			D2
”	U+201D			94			D3
„	U+201E			84			E3
†	U+2020			86			A0
‡	U+2021			87			E0
•	U+2022			95			A5
…	U+2026			85			C9
‰	U+2030			89			E4
‹	U+2039			8B			DC
›	U+203A			9B			DD
⁄	U+2044						DA
ⁿ	U+207F				FC
₧	U+20A7				9E
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
€	U+20AC		A4	80			DB
™	U+2122			99			AA
∂	U+2202						B6
∆	U+2206						C6
∏	U+220F						B8
∑	U+2211						B7
∙	U+2219				F9
√	U+221A				FB		C3
∞	U+221E				EC		B0
∩	U+2229				EF
∫	U+222B						BA
≈	U+2248				F7		C5
≠	U+2260						AD
≡	U+2261				F0
≤	U+2264				F3		B2
≥	U+2265				F2		B3
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
⌐	U+2310				A9
⌠	U+2320				F4
⌡	U+2321				F5
─	U+2500				C4	C4
│	U+2502				B3	B3
┌	U+250C				DA	DA
┐	U+2510				BF	BF
└	U+2514				C0	C0
┘	U+2518				D9	D9
├	U+251C				C3	C3
┤	U+2524				B4	B4
┬	U+252C				C2	C2
┴	U+2534				C1	C1
┼	U+253C				C5	C5
═	U+2550				CD	CD
║	U+2551				BA	BA
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
╒	U+2552				D5
╓	U+2553				D6
╔	U+2554				C9	C9
╕	U+2555				B8
╖	U+2556				B7
╗	U+2557				BB	BB
╘	U+2558				D4
╙	U+2559				D3
╚	U+255A				C8	C8
╛	U+255B				BE
╜	U+255C				BD
╝	U+255D				BC	BC
╞	U+255E				C6
╟	U+255F				C7
╠	U+2560				CC	CC
╡	U+2561				B5
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
╢	U+2562				B6
╣	U+2563				B9	B9
╤	U+2564				D1
╥	U+2565				D2
╦	U+2566				CB	CB
╧	U+2567				CF
╨	U+2568				D0
╩	U+2569				CA	CA
╪	U+256A				D8
╫	U+256B				D7
╬	U+256C				CE	CE
▀	U+2580				DF	DF
▄	U+2584				DC	DC
█	U+2588				DB	DB
▌	U+258C				DD
▐	U+2590				DE
Character	Code point	ISO-8859-1	ISO-8859-15	WINDOWS-1252	IBM437	IBM850	MACINTOSH
░	U+2591				B0	B0
▒	U+2592				B1	B1
▓	U+2593				B2	B2
■	U+25A0				FE	FE
◊	U+25CA						D7
	U+F8FF						F0
ﬁ	U+FB01						DE
ﬂ	U+FB02						DF
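
Individual rows of the table above can be cross-checked programmatically. The following is a minimal sketch, not part of the original article, which assumes that Python's built-in codecs `iso-8859-1`, `iso-8859-15`, `cp1252`, `cp437`, `cp850` and `mac_roman` correspond to the six encoding columns. The helper name `table_row` is hypothetical and chosen only for illustration.

```python
# Python codec labels assumed to match the six table columns, in order:
# ISO-8859-1, ISO-8859-15, WINDOWS-1252, IBM437, IBM850, MACINTOSH.
CODECS = ["iso-8859-1", "iso-8859-15", "cp1252", "cp437", "cp850", "mac_roman"]


def table_row(ch):
    """Return the hex byte of ch in each codec, or '' where it has no mapping."""
    cells = []
    for codec in CODECS:
        try:
            cells.append(ch.encode(codec).hex().upper())
        except UnicodeEncodeError:
            # The character is not representable in this code page.
            cells.append("")
    return cells


if __name__ == "__main__":
    # Print a few sample rows in the same tab-separated layout as the table.
    for ch in "ñøŒ€π":
        print(ch, f"U+{ord(ch):04X}", *table_row(ch), sep="\t")
```

An empty cell means the codec raised `UnicodeEncodeError`, which matches a blank cell in the table. Codec implementations may differ from the table in places where vendors revised a code page, so a mismatch is a prompt to check which revision is meant rather than proof of an error.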

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.