Wiktionary
Wiktionary is a multilingual, web-based project to create a free content dictionary, available in 158 languages. Unlike standard dictionaries, it is written collaboratively by volunteers, dubbed "Wiktionarians", using wiki software, allowing articles to be changed by almost anyone with access to the website.
Like its sister project Wikipedia, Wiktionary is run by the Wikimedia Foundation. Because Wiktionary is not limited by print space considerations, most of Wiktionary's language editions provide definitions and translations of words from many languages, and some editions offer additional information typically found in thesauri and lexicons. Additionally, the English Wiktionary includes Wikisaurus, a category that serves as a thesaurus, including lists of slang words, and the Simple English Wiktionary, compiled using the Basic English subset of the English language.
The goal of Wiktionary is to eventually define "all words in all languages."
History and development
Wiktionary was brought online on December 12, 2002, following a proposal by Daniel Alston and an idea by Larry Sanger, co-founder of Wikipedia. On March 28, 2004, the first non-English Wiktionaries were initiated in French and Polish. Wiktionaries in numerous other languages have since been started. Wiktionary was hosted on a temporary URL (wiktionary.wikipedia.org) until May 1, 2004, when it switched to the current full URL. Wiktionary features well over 5 million entries across its 272 language editions. The largest of the language editions is the English Wiktionary, with over 2.5 million entries. The French Wiktionary is the second largest with over 2 million entries. Nineteen Wiktionary language editions now contain over 100,000 entries each.
Most of the entries and many of the definitions at the project's largest language editions were created by bots that found creative ways to generate entries or (rarely) automatically imported thousands of entries from previously published dictionaries. Seven of the 18 bots registered at the English Wiktionary created 163,000 of the entries there. Only 259 entries (each containing many definitions) remain on Wiktionary from the original import by Websterbot from public domain sources; the majority of those imports have been split manually into thousands of proper entries. Another of these bots, "ThirdPersBot," was responsible for the addition of a number of third-person conjugations that would not receive their own entries in standard dictionaries; for instance, it defined "smoulders" as the "third-person singular simple present form of smoulder." Excluding these 163,000 entries, the English Wiktionary would have about 137,000 entries, including terms unique to languages other than English, making it smaller than most monolingual print dictionaries. The Oxford English Dictionary, for instance, has 615,000 headwords, while Merriam-Webster's Third New International Dictionary of the English Language, Unabridged has 475,000 entries (with many additional embedded headwords). More detailed statistics now exist, though, that distinguish main entries from sub-entries more clearly.
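The kind of form-of entry attributed to "ThirdPersBot" above can be illustrated with a minimal Python sketch. This is not the bot's actual code: the function names and the simplified English spelling rules are assumptions made for illustration, and irregular verbs such as "be" or "have" are not handled.

```python
from typing import Tuple


def third_person_singular(verb: str) -> str:
    """Apply a simplified English spelling rule (hypothetical helper)."""
    # Verbs ending in a sibilant or "o" take "-es": watch -> watches.
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    # Consonant + "y" becomes "-ies": carry -> carries.
    if verb.endswith("y") and len(verb) > 1 and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"
    # Default case: smoulder -> smoulders.
    return verb + "s"


def form_of_entry(verb: str) -> Tuple[str, str]:
    """Return (headword, definition) for a boilerplate form-of entry."""
    headword = third_person_singular(verb)
    definition = f"third-person singular simple present form of {verb}"
    return headword, definition
```

Calling form_of_entry("smoulder") yields the headword "smoulders" with the definition quoted in the paragraph above. An entry of this kind carries no meaning of its own; it only points back to the base verb, which is why standard dictionaries usually omit it.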
The English Wiktionary, however, does not rely on bots to the extent that newer editions do. The French and Vietnamese Wiktionaries, for example, imported large sections of the Free Vietnamese Dictionary Project (FVDP), which provides free content bilingual dictionaries to and from Vietnamese. These imported entries make up virtually all of the Vietnamese edition's offering. Like the English edition, the French Wiktionary has imported the approximately 20,000 entries in the Unihan database of Chinese, Japanese, and Korean characters. The French Wiktionary grew rapidly in 2006 thanks in large part to bots that copied many entries from old, freely licensed dictionaries, such as the eighth edition of the Dictionnaire de l'Académie française (1935, around 35,000 words), and that added words from other Wiktionary editions with French translations. The Russian edition grew by nearly 80,000 entries as "LXbot" added boilerplate entries (with headings, but without definitions) for words in English and German.
Logos
The various Wiktionary language editions use a total of four basic logo designs. Most of Wiktionary currently uses a textual logo designed by Brion Vibber, a MediaWiki developer. Despite frequent discussion of modifying or replacing the logo, a four-phase contest held at the Wikimedia Meta-Wiki from September to October 2006 did not see as much participation from the Wiktionary community as some community members had hoped. The logo that won was designed by "Smurrayinchester". By December 2009, 23 of the Wiktionary editions, containing about half of Wiktionary's entries, had switched to the contest-chosen "wooden tile" design or variations of it. In April 2009, the issue was resurrected, and the Wiktionary community voted on a new project-wide logo. As of 2011, the Lithuanian and Tatar Wiktionaries have begun using a logo designed by A. A. Engelman that won the second round of the latest poll.
As of August 15, 2011, 135 wikis (representing 50.4% of Wiktionary's entries) use the original textual design by Vibber, 30 (44.4%) the design by "Smurrayinchester", two (4.91%) the design by Engelman, and one (0.27%) a logo that depicts a dictionary bearing the Galician coat of arms.
Accuracy
To ensure accuracy, Wiktionary has a policy stating that entries should be attested, that is, verified through any of the following:
- Clearly widespread use,
- Usage in a well-known work, or
- Usage in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year.
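The third criterion above is the only one stated in countable terms, so it can be sketched as a simple check. The following Python example is an illustration under stated assumptions, not Wiktionary's actual procedure: the Citation type and the function name are hypothetical, "independent" is approximated by distinct source labels, and "a year" is taken as 365 days. Whether a usage is "conveying meaning" is an editorial judgment that the code does not attempt.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Citation:
    source: str    # hypothetical label for the author or work quoted
    used_on: date  # date of the recorded usage


def meets_third_criterion(citations: List[Citation],
                          min_sources: int = 3,
                          min_span_days: int = 365) -> bool:
    """Check for enough independent instances spread over enough time."""
    if not citations:
        return False
    # Assumption: two citations are independent if their sources differ.
    independent_sources = {c.source for c in citations}
    # Span between the earliest and the latest recorded usage.
    dates = [c.used_on for c in citations]
    span_days = (max(dates) - min(dates)).days
    return len(independent_sources) >= min_sources and span_days >= min_span_days
```

For example, three citations from different sources dated January 2005, August 2005, and March 2006 would pass this check, while three citations published within the same two months would not.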
Critical reception
Critical reception of Wiktionary has been mixed. Jill Lepore wrote in the article "Noah’s Ark" for The New Yorker (November 6, 2006):
There’s no show of hands at Wiktionary. There’s not even an editorial staff. "Be your own lexicographer!", might be Wiktionary’s motto. Who needs experts? Why pay good money for a dictionary written by lexicographers when we can cobble one together ourselves?
Wiktionary isn’t so much republican or democratic as Maoist. And it’s only as good as the copyright-expired books from which it pilfers. If you look up the word "Webster" in the Wiktionary, you will be redirected to this handy tip:
Noah Webster’s New International Dictionary of the English Language, 1911 (published by Merriam-Webster, Springfield, MA) is a public domain dictionary, as is a 1913 edition, that can be used to empower Wiktionary with more definitions.
But, hey, at least they got his first name right.
Keir Graff’s review for Booklist was less critical:
Is there a place for Wiktionary? Undoubtedly. The industry and enthusiasm of its many creators are proof that there’s a market. And it’s wonderful to have another strong source to use when searching the odd terms that pop up in today’s fast-changing world and the online environment. But as with so many Web sources (including this column), it’s best used by sophisticated users in conjunction with more reputable sources.
References in other publications are fleeting and part of larger discussions of Wikipedia, not progressing beyond a definition, although David Brooks in The Nashua Telegraph described it as "wild and woolly." (Woolly is defined as "confused" and "unrestrained.") One of the impediments to independent coverage of Wiktionary is the continuing misconception that it is merely an extension of Wikipedia.
In 2005, PC Magazine rated Wiktionary as one of the Internet's "Top 101 Web Sites," although little information was given about the site.
A measurement of the correctness of the inflections for a subset of the Polish words in the English Wiktionary showed that this grammatical data is very stable: only 131 out of 4748 Polish words have had their inflection data corrected.