Swadesh list
A Swadesh list is one of several lists of basic-meaning vocabulary developed by Morris Swadesh from 1940 onward; the final version was published posthumously in 1971 [1972]. Such lists are used in lexicostatistics (the quantitative assessment of language relatedness) and glottochronology (the dating of language divergence).
Versions and authors
Swadesh himself produced several versions. He started (1950: 161) with a list of 225 meanings, reduced there to 165 words for the Salish language. In 1952 (p. 456f.) he published a list of 215 meanings, of which he proposed dropping 16 as not universal or clear enough and adding one, to arrive at 200 words. In 1955 (p. 127) he again presented a "lexi(costatisti)cal test list" of 215 meanings, marking the 92 most suitable ones with an asterisk. Eight more suitable meanings were added to bring the total to 100; this list was further improved and finally published in 1971 (p. 283) and 1972. The final 100-word list of 1971: 283 (= 1972) was the result of his lifetime experience and had been repeatedly tested for universal usability and unambiguity. On this view it alone deserves the label "Swadesh list", and it is the one reproduced below.
Other versions of lexicostatistical test lists were published by, among others, R.B. Lees (1953), John A. Rea (1958), D. Wilson (1969, with 57 meanings), M.L. Bender (1969), R.L. Oswald (1971), W.P. Lehmann (1984), D. Ringe (1992, passim, different versions), S.A. Starostin (1984, passim, different versions), William S.Y. Wang (1994), M. Lohr (2000, 128 meanings in 18 languages), and B. Kessler (2002).
The version of I. Dyen (1992, 200 meanings in 95 language variants) is frequently used, not for any proven quality but because it is available electronically on the internet.
Principle
One frequent misconception about Swadesh's principle is that the list was chosen as a "basic" list in the sense of language acquisition, comparable for example to the "Basic English Vocabulary". A second is the assumption that Swadesh chose the meanings for their stability. In fact, the meanings were chosen for their universal, culturally independent availability in as many languages as possible. Nevertheless, their stability has been analyzed by several authors, e.g. M. Lohr (1999, 2000).
Usage in lexicostatistics and glottochronology
Such lexicostatistical test lists are used in lexicostatistics to define the subgrouping of languages, and in glottochronology to "provide dates for branching-points in the tree". The task of identifying and counting the cognate words in a list is far from trivial and is often subject to dispute, because cognates do not necessarily look similar, and recognizing them presupposes knowledge of the sound laws of the respective languages. For example, English 'wheel' and Sanskrit 'chakra' are cognates, although they are not recognizable as such without knowledge of the history of both languages.
For more details, see the main articles.
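The two uses described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any author's published procedure: the function names and the cognate-class labels are hypothetical toy data, and the dating formula t = ln(c) / (2 ln(r)), with a retention rate r of about 0.86 per millennium for a 100-item list, is the classic glottochronological assumption, which the text above does not itself state.

```python
import math

def shared_cognate_fraction(lang_a, lang_b):
    """Fraction of meanings attested in both lists that are judged cognate.

    Each argument maps a meaning to a cognate-class label assigned by a
    linguist; equal labels mean "judged cognate". Surface similarity is
    deliberately not used, since cognates need not look alike.
    """
    common = [m for m in lang_a if m in lang_b]
    if not common:
        raise ValueError("no meanings in common")
    matches = sum(1 for m in common if lang_a[m] == lang_b[m])
    return matches / len(common)

def divergence_time_millennia(c, retention=0.86):
    """Classic estimate t = ln(c) / (2 ln(r)); r = 0.86 is an assumption."""
    if not 0 < c <= 1:
        raise ValueError("c must be in (0, 1]")
    return math.log(c) / (2 * math.log(retention))

# Toy cognate judgments, invented for illustration only.
lang_x = {"two": "A", "water": "B", "louse": "C", "dog": "D"}
lang_y = {"two": "A", "water": "B", "louse": "C", "dog": "E"}

c = shared_cognate_fraction(lang_x, lang_y)
t = divergence_time_millennia(c)
print(f"shared cognates: {c:.0%}, estimated split: {t:.2f} millennia ago")
```

The cognate judgments are an input here, not something the code derives: as noted above, deciding which words are cognate is the disputed step and requires knowledge of the relevant sound laws.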
Original final Swadesh list (1971:283, posthumous)
Essential explanations are given only in Swadesh 1952:456-7 and 1955* (Hans J. Holm 2011-04-20).
No. Meaning / Concept
- 1. I (pers. pron., 1st sg.)
- 2. you (2nd sg.; 1952: thou & ye)
- 3. we (1955: inclusive)
- 4. this
- 5. that
- 6. who? (“?” not 1971)
- 7. what? (“?” not 1971)
- 8. not
- 9. all (of a number)
- 10. many
- 11. one
- 12. two
- 13. big
- 14. long (not 'wide')
- 15. small
- 16. woman
- 17. man (male human)
- 18. person (human being)
- 19. fish (noun)
- 20. bird
- 21. dog
- 22. louse
- 23. tree (not log)
- 24. seed (noun!)
- 25. leaf (botanical)
- 26. root (botanical)
- 27. bark (of tree)
- 28. skin (1952: person’s)
- 29. flesh (1952: meat, flesh)
- 30. blood
- 31. bone
- 32. grease (1952: fat, organic substance)
- 33. egg
- 34. horn (of bull etc.; not 1952)*7
- 35. tail
- 36. feather (large, not down)
- 37. hair (on head of humans)
- 38. head (anatomical)
- 39. ear
- 40. eye
- 41. nose
- 42. mouth
- 43. tooth (front, rather than molar)
- 44. tongue (anatomical)
- 45. claw (not in 1952)*6
- 46. foot (not leg)
- 47. knee (not 1952)*5
- 48. hand
- 49. belly (lower part of body, abdomen)
- 50. neck (not nape!)
- 51. breasts (female; 1955 still breast)*8
- 52. heart
- 53. liver
- 54. drink (verb)
- 55. eat (verb)
- 56. bite (verb)
- 57. see (verb)
- 58. hear (verb)
- 59. know (facts)
- 60. sleep (verb)
- 61. die (verb)
- 62. kill (verb)
- 63. swim (verb)
- 64. fly (verb)
- 65. walk (verb)
- 66. come (verb)
- 67. lie (on side, recline)
- 68. sit (verb)
- 69. stand (verb)
- 70. give (verb)
- 71. say (verb)*1
- 72. sun
- 73. moon (not 1952)*2
- 74. star
- 75. water (noun)
- 76. rain (noun; 1952: verb)
- 77. stone
- 78. sand (opposite to following)
- 79. earth (= soil)
- 80. cloud (not fog)
- 81. smoke (noun, of fire)
- 82. fire
- 83. ash(es)
- 84. burn (verb intr.!)
- 85. path (1952: road, trail; not street)
- 86. mountain (not hill)
- 87. red (colour)
- 88. green (colour)
- 89. yellow (colour)
- 90. white (colour)
- 91. black (colour)
- 92. night
- 93. hot (adjective; 1952: warm, of weather)
- 94. cold (of weather)
- 95. full *4
- 96. new
- 97. good
- 98. round (not 1952)*3
- 99. dry (substance!)
- 100. name
Shorter lists
The Swadesh–Yakhontov list is a 35-word subset of the Swadesh list posited as especially stable by the Russian linguist Sergei Yakhontov (Starostin 1991). It has been used in lexicostatistics by linguists such as Sergei Starostin. The words are given below with their Swadesh numbers; these evidently follow a longer version of the list rather than the 100-item list above, since they run up to 207 and include meanings (salt, wind, year) that the 100-item list lacks.
- 1. I
- 2. you (singular)
- 7. this
- 11. who
- 12. what
- 22. one
- 23. two
- 45. fish
- 47. dog
- 48. louse
- 64. blood
- 65. bone
- 67. egg
- 68. horn
- 69. tail
- 73. ear
- 74. eye
- 75. nose
- 77. tooth
- 78. tongue
- 83. hand
- 103. know
- 109. die
- 128. give
- 147. sun
- 148. moon
- 150. water
- 155. salt
- 156. stone
- 163. wind
- 167. fire
- 179. year
- 182. full
- 183. new
- 207. name
Holman et al. (2008) found that the Swadesh–Yakhontov list was less accurate than the Swadesh-100 list in identifying the relationships between Chinese dialects. However, by comparing retentions between languages in established language families, they calculated the relative stability of each word and found that a different 40-word list was just as accurate as the Swadesh-100 list. They found no statistically significant difference between the correlations in Old World and New World families. The ranked Swadesh-100 list, with Swadesh numbers and relative stability, is as follows (Holman et al., Appendix; asterisked words appear on the 40-word list):
- 22 *louse (42.8)
- 12 *two (39.8)
- 75 *water (37.4)
- 39 *ear (37.2)
- 61 *die (36.3)
- 1 *I (35.9)
- 53 *liver (35.7)
- 40 *eye (35.4)
- 48 *hand (34.9)
- 58 *hear (33.8)
- 23 *tree (33.6)
- 19 *fish (33.4)
- 100 *name (32.4)
- 77 *stone (32.1)
- 43 *tooth (30.7)
- 51 *breasts (30.7)
- 2 *you (30.6)
- 85 *path (30.2)
- 31 *bone (30.1)
- 44 *tongue (30.1)
- 28 *skin (29.6)
- 92 *night (29.6)
- 25 *leaf (29.4)
- 76 rain (29.3)
- 62 kill (29.2)
- 30 *blood (29.0)
- 34 *horn (28.8)
- 18 *person (28.7)
- 47 *knee (28.0)
- 11 *one (27.4)
- 41 *nose (27.3)
- 95 *full (26.9)
- 66 *come (26.8)
- 74 *star (26.6)
- 86 *mountain (26.2)
- 82 *fire (25.7)
- 3 *we (25.4)
- 54 *drink (25.0)
- 57 *see (24.7)
- 27 bark (24.5)
- 96 *new (24.3)
- 21 *dog (24.2)
- 72 *sun (24.2)
- 64 fly (24.1)
- 32 grease (23.4)
- 73 moon (23.4)
- 70 give (23.3)
- 52 heart (23.2)
- 36 feather (23.1)
- 90 white (22.7)
- 89 yellow (22.5)
- 20 bird (21.8)
- 38 head (21.7)
- 79 earth (21.7)
- 46 foot (21.6)
- 91 black (21.6)
- 42 mouth (21.5)
- 88 green (21.1)
- 60 sleep (21.0)
- 7 what (20.7)
- 26 root (20.5)
- 45 claw (20.5)
- 56 bite (20.5)
- 83 ash (20.3)
- 87 red (20.2)
- 55 eat (20.0)
- 33 egg (19.8)
- 6 who (19.0)
- 99 dry (18.9)
- 37 hair (18.6)
- 81 smoke (18.5)
- 8 not (18.3)
- 4 this (18.2)
- 24 seed (18.2)
- 16 woman (17.9)
- 98 round (17.9)
- 14 long (17.4)
- 69 stand (17.1)
- 97 good (16.9)
- 17 man (16.7)
- 94 cold (16.6)
- 29 flesh (16.4)
- 50 neck (16.0)
- 71 say (16.0)
- 84 burn (15.5)
- 35 tail (14.9)
- 78 sand (14.9)
- 5 that (14.7)
- 65 walk (14.4)
- 68 sit (14.3)
- 10 many (14.2)
- 9 all (14.1)
- 59 know (14.1)
- 80 cloud (13.9)
- 63 swim (13.6)
- 49 belly (13.5)
- 13 big (13.4)
- 93 hot (11.6)
- 67 lie (11.2)
- 15 small (6.3)
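For readers who want to work with the ranking programmatically, the entries above follow a regular pattern (Swadesh number, optional asterisk, meaning, stability in parentheses) that is easy to parse. The following is a minimal sketch; the regular expression and the names `ENTRY` and `parse_entry` are hypothetical helpers written for this illustration, and the four sample lines are copied from the list above.

```python
import re

# One entry per line: "<Swadesh number> [*]<meaning> (<stability>)",
# optionally preceded by a list dash.
ENTRY = re.compile(r"^\s*-?\s*(\d+)\s+(\*?)(.+?)\s+\((\d+(?:\.\d+)?)\)\s*$")

def parse_entry(line):
    """Return (number, meaning, stability, starred) for one list line."""
    m = ENTRY.match(line)
    if m is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    number, star, meaning, stability = m.groups()
    return int(number), meaning, float(stability), star == "*"

sample = [
    "- 22 *louse (42.8)",
    "- 12 *two (39.8)",
    "- 76 rain (29.3)",
    "- 15 small (6.3)",
]

entries = [parse_entry(s) for s in sample]

# Meanings marked with an asterisk belong to the 40-word list.
short_list = [meaning for _, meaning, _, starred in entries if starred]

# The entry with the highest relative stability.
most_stable = max(entries, key=lambda e: e[2])

print(short_list)
print(most_stable)
```

Keeping the asterisk as a separate boolean, rather than selecting the top 40 by score, matters here: in the list above some unasterisked words (rain, kill, bark) rank higher than some asterisked ones.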
Signed languages
In studying the sign languages of Vietnam and Thailand, linguist James Woodward noted that the traditional Swadesh list, designed for spoken languages, was unsuited to sign languages: using it would overestimate the relationships between sign languages. He compiled a modified list for sign language comparison by removing indexical signs, which would show up as potential cognates in the other lists. The modified list is as follows:
- all
- animal
- bad
- because
- bird
- black
- blood
- child
- count
- day
- die
- dirty
- dog
- dry
- dull
- dust
- earth
- egg
- grease
- father
- feather
- fire
- fish
- flower
- good
- grass
- green
- heavy
- how
- hunt
- husband
- ice
- if
- kill
- laugh
- leaf
- lie
- live
- long
- louse
- man
- meat
- mother
- mountain
- name
- narrow
- new
- night
- not
- old
- other
- person
- play
- rain
- red
- correct
- river
- rope
- salt
- sea
- sharp
- short
- sing
- sit
- smooth
- snake
- snow
- stand
- star
- stone
- sun
- tail
- thin
- tree
- vomit
- warm
- water
- wet
- what
- when
- where
- white
- who
- wide
- wife
- wind
- with
- woman
- wood
- worm
- year
- yellow
- full
- moon
- brother
- cat
- dance
- pig
- sister
- work
See also
- A General Service List of English Words
- Basic English
- Cognate
- Glottochronology
- Historical linguistics
- Indo-European studies
- Intercontinental Dictionary Series
- Lexicostatistics
- Mass lexical comparison
- Proto-language
- Swadesh lists for hundreds of languages at Wiktionary, grouped by language family
- Swadesh lists for hundreds of languages at Wiktionary, listed by individual language
- The (brief) Wiktionary entry for the term 'Swadesh lists'