CEDICT
The CEDICT project was started by Paul Denisowski in 1997 and is presently maintained by MDBG, under the name CC-CEDICT, with the aim of providing a complete Chinese to English dictionary with pronunciation in pinyin for the Chinese characters.
Content
CEDICT is merely a text file; other programs are needed to search and display it. This project is considered a standard Chinese-English reference on the Internet and is used by several other Chinese-English projects. The Unihan Database uses CEDICT data for most of its information about character compounds, but this is auxiliary and is explicitly not a part of the main Unicode database (http://unicode.org/charts/unihan.html). CEDICT is not used for Unihan's definitions and pronunciations of individual characters.
The basic format of a CEDICT entry is:
Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/
中國 中国 [Zhong1 guo2] /China/Middle Kingdom/
CEDICT is now primarily encoded in UTF-8.
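The entry layout shown above can be split into its fields with a short script. The following Python code is a minimal sketch, not the project's own tooling: the names `Entry`, `ENTRY_RE`, and `parse_line` are hypothetical, and skipping lines that begin with `#` is an assumption about how comment lines are marked in the file.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    """One dictionary entry (hypothetical container, for illustration only)."""
    traditional: str
    simplified: str
    pinyin: str
    definitions: List[str]


# Mirrors the layout described in the text:
# Traditional Simplified [pin1 yin1] /equivalent 1/equivalent 2/
ENTRY_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_line(line: str) -> Optional[Entry]:
    """Parse a single line; return None for blank, comment, or malformed lines."""
    line = line.strip()
    # Assumption: comment lines start with '#'.
    if not line or line.startswith("#"):
        return None
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, definitions = match.groups()
    # The English equivalents are separated by '/' inside the outer slashes.
    return Entry(traditional, simplified, pinyin, definitions.split("/"))


if __name__ == "__main__":
    sample = "中國 中国 [Zhong1 guo2] /China/Middle Kingdom/"
    print(parse_line(sample))
```

To process a whole file, each line could be passed to `parse_line` after opening the file with `encoding="utf-8"`, discarding the `None` results.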
Features:
- Traditional and Simplified Chinese
- Pinyin (several pronunciations)
- American English (several equivalents)
As of March 9, 2011, it had 100,151 Chinese entries (http://www.mdbg.net/chindict/chindict.php?page=cc-cedict).
History
Year | Event |
---|---|
1991 | EDICT Japanese dictionary project started by Jim Breen. |
1997 | CEDICT project started by Paul Denisowski, on the model of EDICT. |
1999 | CEDICT ownership transferred to Erik Peterson (http://www.mandarintools.com/cedict.html). |
2007 | MDBG started a new project called CC-CEDICT, which continues the CEDICT project under a new license, the Creative Commons Attribution-Share Alike 3.0 License, allowing more projects to use it. Additionally, a workflow (http://cc-cedict.org/editor/) has been set up to streamline the process of submitting, reviewing and processing new entries. |
Sub-projects
CEDICT has shown the way to some other projects, such as HanDeDict (127,000 Chinese entries), the free Chinese-German dictionary, and CFDICT (200,000 entries) for French. A Hungarian–Chinese dictionary project is under discussion. Some older CEDICT data is also found in the Adsotrans dictionary.
External links
CEDICT-based dictionaries
- Flashonary is a Chinese dictionary with integrated flashcards that uses CC-CEDICT.
- MDBG free online Chinese–English dictionary uses CC-CEDICT, supports adding / editing entries and offers recent CC-CEDICT downloads.
- More information on the formatting of CC-CEDICT
- Example of CEDICT data for the Han character "中", used by Unihan (section "Chinese Compounds")
- Chinese Dictionaries Discussion group about Chinese->"foreign language" dictionaries
- The homepage of Paul Denisowski, the founder of CEDICT
- Talaqa Chinese–English dictionary : An interactive Chinese–English dictionary based on CEDICT.
- www.clearchinese.com uses CEDICT
- Mandarin Text Project uses CEDICT