Adsotrans
Encyclopedia
Adso is a Chinese
Chinese language
The Chinese language is a language or language family consisting of varieties which are mutually intelligible to varying degrees. Originally the indigenous languages spoken by the Han Chinese in China, it forms one of the branches of Sino-Tibetan family of languages...

 to English
English language
English is a West Germanic language that arose in the Anglo-Saxon kingdoms of England and spread into what was to become south-east Scotland under the influence of the Anglian medieval kingdom of Northumbria...

 dictionary and natural language processing
Natural language processing
Natural language processing is a field of computer science and linguistics concerned with the interactions between computers and human languages; it began as a branch of artificial intelligence....

 engine for Chinese text. The Adso project started in 2001. Its gist translation and dictionary interface are online at the Adsotrans website http://adsotrans.com Adsotrans. Its software and database are freely available for download at the site as well http://adsotrans.com/downloads.

Content

With over 195,000 entries, Adso is the largest Chinese–English dictionary compilation on the Internet. It differs from other projects in providing part of speech and ontological data on word entries, and in reviewing user contributions. Project data is generated collaboratively by users through an online dictionary hosted at Popup Chinese.

The Adso software engine provides text segmentation, hanzi-to-pinyin, gist translation, annotation, gist extraction and semantic analysis services. It is heavily used as a translation aid for Chinese-English translation. Adso also supports a specially-defined XML language which customizes software output. This has made it useful as preprocessor for statistical machine translation
Statistical machine translation
Statistical machine translation is a machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora...

 software such as GIZA++ or for reverse-index search engines such as Lucene
Lucene
Apache Lucene is a free/open source information retrieval software library, originally created in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License....

.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK