Windows-1251
Windows-1251 is a popular 8-bit character encoding designed to cover languages that use the Cyrillic alphabet, such as Russian, Bulgarian, Serbian Cyrillic and other languages. It is the most widely used encoding for the Bulgarian, Serbian and Macedonian languages.
In modern applications, Unicode is the preferred character set.
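As a minimal sketch of moving legacy data to Unicode, the Python snippet below decodes Windows-1251 bytes and re-encodes them as UTF-8. It assumes Python's built-in `cp1251` codec; the sample word and variable names are illustrative only.

```python
# Bytes of the Russian word "Привет" as stored in Windows-1251 (one byte per letter).
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])

# Decode the legacy bytes into a Unicode string.
text = raw.decode("cp1251")

# Re-encode as UTF-8; each Cyrillic letter now takes two bytes.
utf8 = text.encode("utf-8")

print(text)
print(len(raw), len(utf8))
```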
Windows-1251 and KOI8-R (or its Ukrainian variant, KOI8-U) are much more commonly used than ISO 8859-5, which never really caught on. In the future, both may eventually give way to Unicode.
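These encodings are not interchangeable: the same letter is stored as a different byte in each. The sketch below illustrates this with Python's built-in `cp1251`, `koi8_r` and `iso8859_5` codecs; the sample text is an arbitrary choice, and the mismatched decode at the end only shows what happens when the wrong encoding is assumed.

```python
text = "Я"  # CYRILLIC CAPITAL LETTER YA

# Encode the same character in three 8-bit Cyrillic encodings.
encoded = {name: text.encode(name) for name in ("cp1251", "koi8_r", "iso8859_5")}
for name, data in encoded.items():
    print(name, data.hex())

# Decoding Windows-1251 bytes as KOI8-R yields a different, wrong character.
wrong = encoded["cp1251"].decode("koi8_r")
print(repr(wrong))
```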
Codepage layout
The following table shows Windows-1251. Each character is shown with its decimal code and its Unicode equivalent.
Windows-1251 | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
—0 | —1 | —2 | —3 | —4 | —5 | —6 | —7 | —8 | —9 | —A | —B | —C | —D | —E | —F | |
In the table above, 20 is the regular SPACE character, A0 is the NO-BREAK SPACE, and AD is SOFT HYPHEN.
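The decimal code and Unicode equivalent of any byte can be looked up programmatically. The following is a minimal sketch that assumes Python's built-in `cp1251` codec and the standard `unicodedata` module; the helper name `describe` is hypothetical.

```python
import unicodedata

def describe(byte_value):
    """Return (decimal code, Unicode code point, Unicode name) for one Windows-1251 byte.

    Returns None if the codec has no mapping for the byte.
    """
    try:
        ch = bytes([byte_value]).decode("cp1251")
    except UnicodeDecodeError:
        return None
    # Control characters have no Unicode name, so fall back to a placeholder.
    return (byte_value, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))

# The three special characters mentioned above, plus the first Cyrillic capital letter.
for b in (0x20, 0xA0, 0xAD, 0xC0):
    print(describe(b))
```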
External links
- Windows 1251 reference chart
- IANA Charset Name Registration
- Unicode mapping table for Windows 1251
- Unicode mappings of windows 1251 with "best fit"
- Universal Cyrillic decoder, an online program that may help recover unreadable Cyrillic texts with broken Windows-1251 or other character encodings.