KS X 1001
KS X 1001 is a South Korean coded character set standard for representing hangul and hanja characters on a computer. It is arranged as a 94×94 table (similar to the 2-byte code words in ISO 2022 and EUC), so its code points are pairs of integers in the range 1–94. KS X 1001 contains ASCII, Korean, typographic, Greek, Cyrillic, Japanese (hiragana and katakana) and some other characters.
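The pair-of-integers structure can be illustrated with a short Python sketch. It assumes the EUC-style mapping in which each index is offset by 0xA0 to form a byte; the function names kuten_to_euc_kr and euc_kr_to_kuten are hypothetical and chosen only for this illustration.

```python
from typing import Tuple


def kuten_to_euc_kr(row: int, cell: int) -> bytes:
    """Map a KS X 1001 (row, cell) pair to two bytes, assuming the EUC offset of 0xA0."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in the range 1..94")
    return bytes([0xA0 + row, 0xA0 + cell])


def euc_kr_to_kuten(pair: bytes) -> Tuple[int, int]:
    """Inverse mapping: recover the (row, cell) pair from two EUC-style bytes."""
    if len(pair) != 2 or not all(0xA1 <= b <= 0xFE for b in pair):
        raise ValueError("expected two bytes in the range 0xA1..0xFE")
    return pair[0] - 0xA0, pair[1] - 0xA0


# Row 16, cell 1 holds the first hangul syllable in the table.
encoded = kuten_to_euc_kr(16, 1)
print(encoded.hex(), encoded.decode("euc_kr"))
```

Because both indices are limited to 1–94, the resulting bytes always fall in 0xA1–0xFE and never collide with single-byte ASCII values.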
The standard was previously known as KS C 5601. It has been revised several times, including in 1987, 1992, 1998 and 2002.
Operating systems encode the various versions of this standard in different ways. For example, some replace the usual backslash at byte 0x5C with the won currency sign (₩).
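The effect of this substitution can be imitated in Python. The sketch below is only an illustration: the helper decode_with_won is hypothetical, and it assumes Python's euc_kr codec, which decodes byte 0x5C as a backslash (U+005C).

```python
WON_SIGN = "\u20a9"  # the won sign


def decode_with_won(data: bytes) -> str:
    """Decode EUC-KR bytes, rendering byte 0x5C as the won sign.

    Python's euc_kr codec maps 0x5C to a backslash, so the substitution
    is applied after decoding to mimic systems that treat 0x5C as the won sign.
    """
    return data.decode("euc_kr").replace("\\", WON_SIGN)


print(decode_with_won(b"C:\\temp"))
```

The two-byte codes in EUC-KR use only bytes 0xA1–0xFE, so a 0x5C byte in valid input always stands for a single-byte character, which makes replacing it after decoding safe in this sketch.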
Some operating systems also extend this standard in other non-uniform ways. Encoding schemes for KS X 1001 include EUC-KR, windows-949 (a superset of EUC-KR), ISO-2022-KR and JOHAB. The latter two are rarely used.
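As a rough comparison of these schemes, the following Python sketch encodes the same text with four codecs from the standard library. It assumes that the codec names euc_kr, cp949, iso2022_kr and johab correspond to EUC-KR, windows-949, ISO-2022-KR and JOHAB respectively; the helper encode_all is hypothetical.

```python
TEXT = "한글"

# Python codec names assumed to match the four encoding schemes.
CODECS = ["euc_kr", "cp949", "iso2022_kr", "johab"]


def encode_all(text: str) -> dict:
    """Encode the text with every codec and return a name -> bytes mapping."""
    return {name: text.encode(name) for name in CODECS}


encoded = encode_all(TEXT)
for name, data in encoded.items():
    # Print the raw bytes in hexadecimal so the schemes can be compared.
    print(f"{name:11} {data.hex()}")
```

For text that lies entirely within KS X 1001, the euc_kr and cp949 results are identical, while iso2022_kr produces 7-bit output wrapped in escape and shift sequences.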