Windows Glyph List 4
Windows Glyph List 4, more commonly called WGL4 and also known as the Pan-European character set, is a character repertoire on recent Microsoft operating systems comprising 652 Unicode characters. Its purpose is to provide an implementation guideline for producers of fonts for the representation of European natural languages; fonts that provide glyphs for the entire set of characters can claim WGL4 compliance and can thus expect to be compatible with a wide range of software.
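The compliance idea above amounts to a set-coverage check: a font qualifies only if it maps every code point in the repertoire. The following is a minimal sketch, not an official test. `WGL4_SAMPLE` is a tiny hand-picked subset used for illustration (a real check would need the full list), and the function names are hypothetical.

```python
from typing import Iterable

# Tiny illustrative sample of WGL4 code points; NOT the full repertoire.
WGL4_SAMPLE = {0x0041, 0x00E9, 0x0152, 0x03A9, 0x0416, 0x20AC, 0x2564}


def missing_code_points(font_code_points: Iterable[int],
                        required: Iterable[int] = WGL4_SAMPLE) -> list[int]:
    """Return the required code points that the font does not map, sorted."""
    covered = set(font_code_points)
    return sorted(cp for cp in required if cp not in covered)


def format_missing(missing: Iterable[int]) -> list[str]:
    """Render code points in the usual U+XXXX notation."""
    return [f"U+{cp:04X}" for cp in missing]


# Hypothetical font that maps only the ASCII and Latin-1 ranges.
latin1_only = range(0x20, 0x100)
print(format_missing(missing_code_points(latin1_only)))
```

A font would pass this sketch only when the returned list is empty. In practice the set of mapped code points would be read from the font's character map rather than supplied by hand.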
WGL4 characters are the only ones guaranteed to display correctly on recent versions of all major Microsoft Windows platforms. Default installations of Windows 9x may lack some Unicode characters (such as Armenian and Georgian), as may those of Windows XP (such as Ethiopic and Runic), but WGL4 glyphs are present on all of them.
Because many fonts are designed to fulfill the WGL4 set, this set of characters is likely to work (that is, display as something other than replacement glyphs) on many computer systems. For instance, your browser is probably able to draw all the characters in the table below, whereas other articles about Unicode may show many missing characters.
Repertoire
The repertoire, defined by Microsoft, encompasses all the characters found in Microsoft’s code pages 1252 (Windows Western), 1250 (Windows Central European), 1251 (Windows Cyrillic), 1253 (Windows Greek), 1254 (Windows Turkish), and 1257 (Windows Baltic), as well as characters from MS-DOS code page 437.
It does not cover the combining diacritics used by code page 1258, the Thai letters used in code page 874, the Hebrew and Arabic letters covered by code pages 1255 and 1256, or the ideographic characters used by code pages 932, 936, 949 and 950.
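The inclusion and exclusion rules above can be explored with Python's built-in codecs. This is a rough sketch under stated assumptions: it assumes Python's `cp…` codecs mirror the Microsoft code pages closely enough for illustration, and the helper name `code_page_chars` is hypothetical. The resulting union only approximates part of WGL4. Python's `cp437` codec maps bytes below 0x20 to control characters rather than to the graphic symbols, and WGL4 also contains characters found in none of these code pages.

```python
# Code pages the repertoire is described as covering.
INCLUDED = ["cp1252", "cp1250", "cp1251", "cp1253", "cp1254", "cp1257", "cp437"]


def code_page_chars(codec: str) -> set[str]:
    """Collect the characters a single-byte codec defines for bytes 0x20-0xFF."""
    chars = set()
    for byte in range(0x20, 0x100):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            continue  # byte is undefined in this code page
    return chars


# Union of everything the included code pages can represent.
union = set().union(*(code_page_chars(codec) for codec in INCLUDED))

# Characters from the excluded code pages that fall outside the union.
hebrew_only = code_page_chars("cp1255") - union
thai_only = code_page_chars("cp874") - union

print("€" in union, "Ж" in union, "╬" in union)
print("א" in hebrew_only, "ก" in thai_only)
```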
It also does not cover the Romanian letters Ș, ș, Ț, and ț (U+0218–U+021B), which were added to several of Microsoft’s fonts for Windows Vista (long after the WGL4 repertoire was originally defined).
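The comma-below letters are distinct code points from the cedilla letters Ş, ş, Ţ, ţ (U+015E, U+015F, U+0162, U+0163), which do appear in the character table. As a hedged illustration, the sketch below prints the Unicode names of both forms and applies a fallback substitution for fonts limited to WGL4. The substitution is an assumption made for this example, not something WGL4 prescribes, and it is typographically incorrect for Romanian. `wgl4_fallback` is a hypothetical name.

```python
import unicodedata

# Map comma-below letters (outside WGL4) to the cedilla letters (inside WGL4).
COMMA_TO_CEDILLA = str.maketrans({
    "\u0218": "\u015E",  # Ș -> Ş
    "\u0219": "\u015F",  # ș -> ş
    "\u021A": "\u0162",  # Ț -> Ţ
    "\u021B": "\u0163",  # ț -> ţ
})


def wgl4_fallback(text: str) -> str:
    """Replace comma-below letters with their cedilla look-alikes."""
    return text.translate(COMMA_TO_CEDILLA)


# Show that the two forms are separate characters with different names.
for ch in "\u0218\u015E\u021A\u0162":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(wgl4_fallback("\u0219i \u021Bar\u0103"))
```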
In version 1.5 of the OpenType Specification (May 2008) four Cyrillic characters were added to the WGL4 character set: Ѐ (U+0400), Ѝ (U+040D), ѐ (U+0450) and ѝ (U+045D).
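Character table
U+ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | Block |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0020 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | C0 Controls and Basic Latin (identical to ASCII) |
0030 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
0040 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
0050 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
0060 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
0070 | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ |  |
00A0 | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ | C1 Controls and Latin-1 Supplement (identical to ISO/IEC 8859-1) |
00B0 | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
00C0 | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
00D0 | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
00E0 | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
00F0 | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
0100 | Ā | ā | Ă | ă | Ą | ą | Ć | ć | Ĉ | ĉ | Ċ | ċ | Č | č | Ď | ď | Latin Extended-A |
0110 | Đ | đ | Ē | ē | Ĕ | ĕ | Ė | ė | Ę | ę | Ě | ě | Ĝ | ĝ | Ğ | ğ |
0120 | Ġ | ġ | Ģ | ģ | Ĥ | ĥ | Ħ | ħ | Ĩ | ĩ | Ī | ī | Ĭ | ĭ | Į | į |
0130 | İ | ı | Ĳ | ĳ | Ĵ | ĵ | Ķ | ķ | ĸ | Ĺ | ĺ | Ļ | ļ | Ľ | ľ | Ŀ |
0140 | ŀ | Ł | ł | Ń | ń | Ņ | ņ | Ň | ň | ŉ | Ŋ | ŋ | Ō | ō | Ŏ | ŏ |
0150 | Ő | ő | Œ | œ | Ŕ | ŕ | Ŗ | ŗ | Ř | ř | Ś | ś | Ŝ | ŝ | Ş | ş |
0160 | Š | š | Ţ | ţ | Ť | ť | Ŧ | ŧ | Ũ | ũ | Ū | ū | Ŭ | ŭ | Ů | ů |
0170 | Ű | ű | Ų | ų | Ŵ | ŵ | Ŷ | ŷ | Ÿ | Ź | ź | Ż | ż | Ž | ž | ſ |
0190 |  |  | ƒ |  |  |  |  |  |  |  |  |  |  |  |  |  | Latin Extended-B |
01F0 |  |  |  |  |  |  |  |  |  |  | Ǻ | ǻ | Ǽ | ǽ | Ǿ | ǿ |
02C0 |  |  |  |  |  |  | ˆ | ˇ |  | ˉ |  |  |  |  |  |  | Spacing Modifier Letters |
02D0 |  |  |  |  |  |  |  |  | ˘ | ˙ | ˚ | ˛ | ˜ | ˝ |  |  |
0370 |  |  |  |  |  |  |  |  |  |  |  |  |  |  | ; |  | Greek |
0380 |  |  |  |  | ΄ | ΅ | Ά | · | Έ | Ή | Ί |  | Ό |  | Ύ | Ώ |
0390 | ΐ | Α | Β | Γ | Δ | Ε | Ζ | Η | Θ | Ι | Κ | Λ | Μ | Ν | Ξ | Ο |
03A0 | Π | Ρ |  | Σ | Τ | Υ | Φ | Χ | Ψ | Ω | Ϊ | Ϋ | ά | έ | ή | ί |
03B0 | ΰ | α | β | γ | δ | ε | ζ | η | θ | ι | κ | λ | μ | ν | ξ | ο |
03C0 | π | ρ | ς | σ | τ | υ | φ | χ | ψ | ω | ϊ | ϋ | ό | ύ | ώ |  |
0400 | Ѐ | Ё | Ђ | Ѓ | Є | Ѕ | І | Ї | Ј | Љ | Њ | Ћ | Ќ | Ѝ | Ў | Џ | Cyrillic |
0410 | А | Б | В | Г | Д | Е | Ж | З | И | Й | К | Л | М | Н | О | П |
0420 | Р | С | Т | У | Ф | Х | Ц | Ч | Ш | Щ | Ъ | Ы | Ь | Э | Ю | Я |
0430 | а | б | в | г | д | е | ж | з | и | й | к | л | м | н | о | п |
0440 | р | с | т | у | ф | х | ц | ч | ш | щ | ъ | ы | ь | э | ю | я |
0450 | ѐ | ё | ђ | ѓ | є | ѕ | і | ї | ј | љ | њ | ћ | ќ | ѝ | ў | џ |
0490 | Ґ | ґ |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
1E80 | Ẁ | ẁ | Ẃ | ẃ | Ẅ | ẅ |  |  |  |  |  |  |  |  |  |  | Latin Extended Additional |
1EF0 |  |  | Ỳ | ỳ |  |  |  |  |  |  |  |  |  |  |  |  |
2010 |  |  |  | – | — | ― |  | ‗ | ‘ | ’ | ‚ | ‛ | “ | ” | „ |  | General Punctuation |
2020 | † | ‡ | • |  |  |  | … |  |  |  |  |  |  |  |  |  |
2030 | ‰ |  | ′ | ″ |  |  |  |  |  | ‹ | › |  | ‼ |  | ‾ |  |
2040 |  |  |  |  | ⁄ |  |  |  |  |  |  |  |  |  |  |  |
2070 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  | ⁿ | Super/Subscripts |
20A0 |  |  |  | ₣ | ₤ |  |  | ₧ |  |  |  |  | € |  |  |  | Currency Symbols |
2100 |  |  |  |  |  | ℅ |  |  |  |  |  |  |  |  |  |  | Letterlike symbols |
2110 |  |  |  | ℓ |  |  | № |  |  |  |  |  |  |  |  |  |
2120 |  |  | ™ |  |  |  | Ω |  |  |  |  |  |  |  | ℮ |  |
2150 |  |  |  |  |  |  |  |  |  |  |  | ⅛ | ⅜ | ⅝ | ⅞ |  | Number Forms |
2190 | ← | ↑ | → | ↓ | ↔ | ↕ |  |  |  |  |  |  |  |  |  |  | Arrows |
21A0 |  |  |  |  |  |  |  |  | ↨ |  |  |  |  |  |  |  |
2200 |  |  | ∂ |  |  |  | ∆ |  |  |  |  |  |  |  |  | ∏ | Mathematical Operators |
2210 |  | ∑ | − |  |  | ∕ |  |  |  | ∙ | √ |  |  |  | ∞ | ∟ |
2220 |  |  |  |  |  |  |  |  |  | ∩ |  | ∫ |  |  |  |  |
2240 |  |  |  |  |  |  |  |  | ≈ |  |  |  |  |  |  |  |
2260 | ≠ | ≡ |  |  | ≤ | ≥ |  |  |  |  |  |  |  |  |  |  |
2300 |  |  | ⌂ |  |  |  |  |  |  |  |  |  |  |  |  |  | Miscellaneous Technical |
2310 | ⌐ |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
2320 | ⌠ | ⌡ |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
2500 | ─ |  | │ |  |  |  |  |  |  |  |  |  | ┌ |  |  |  | Box drawing characters |
2510 | ┐ |  |  |  | └ |  |  |  | ┘ |  |  |  | ├ |  |  |  |
2520 |  |  |  |  | ┤ |  |  |  |  |  |  |  | ┬ |  |  |  |
2530 |  |  |  |  | ┴ |  |  |  |  |  |  |  | ┼ |  |  |  |
2550 | ═ | ║ | ╒ | ╓ | ╔ | ╕ | ╖ | ╗ | ╘ | ╙ | ╚ | ╛ | ╜ | ╝ | ╞ | ╟ |
2560 | ╠ | ╡ | ╢ | ╣ | ╤ | ╥ | ╦ | ╧ | ╨ | ╩ | ╪ | ╫ | ╬ |  |  |  |
2580 | ▀ |  |  |  | ▄ |  |  |  | █ |  |  |  | ▌ |  |  |  | Block Elements |
2590 | ▐ | ░ | ▒ | ▓ |  |  |  |  |  |  |  |  |  |  |  |  |
25A0 | ■ | □ |  |  |  |  |  |  |  |  | ▪ | ▫ | ▬ |  |  |  | Geometric Shapes |
25B0 |  |  | ▲ |  |  |  |  |  |  |  | ► |  | ▼ |  |  |  |
25C0 |  |  |  |  | ◄ |  |  |  |  |  | ◊ | ○ |  |  |  | ● |
25D0 |  |  |  |  |  |  |  |  | ◘ | ◙ |  |  |  |  |  |  |
25E0 |  |  |  |  |  |  | ◦ |  |  |  |  |  |  |  |  |  |
2630 |  |  |  |  |  |  |  |  |  |  | ☺ | ☻ | ☼ |  |  |  | Miscellaneous Symbols |
2640 | ♀ |  | ♂ |  |  |  |  |  |  |  |  |  |  |  |  |  |
2660 | ♠ |  |  | ♣ |  | ♥ | ♦ |  |  |  | ♪ | ♫ |  |  |  |  |
F000 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  | Private Use Area |
FB00 |  | ﬁ | ﬂ |  |  |  |  |  |  |  |  |  |  |  |  |  | Alphabetic Presentation Forms |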
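In the character table, each row label is the code point of that row's column 0, so the code point of any cell is the row label plus the column digit. The helper below is a small sketch of that arithmetic. The name `table_cell` is hypothetical and not part of any standard.

```python
def table_cell(row_label: str, column: str) -> str:
    """Return the character at a table position, e.g. row '0190', column '2'."""
    if len(column) != 1:
        raise ValueError("column must be a single hexadecimal digit")
    # Row labels end in 0, so adding the column digit yields the code point.
    return chr(int(row_label, 16) + int(column, 16))


print(table_cell("0190", "2"))  # U+0192
print(table_cell("20A0", "C"))  # U+20AC
```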
See also
- Adobe Glyph List
- Linotype Extended European Characterset (LEEC)
- World Glyph Set (W1G)
- Multilingual European Subsets MES-1 and MES-2
External links
- WGL4.0 Character Set on Microsoft Typography