Linguistic distance
Linguistic distance is a term loosely used to describe how different one language or dialect is from another. Although there is no uniform approach to quantifying linguistic distance between languages, the concept is used in a variety of linguistic situations, such as learning additional languages, historical linguistics, language-based conflicts and the effects of language differences on trade.
Measures
The proposed measures of linguistic distance reflect varying understandings of the term itself. One approach is based on mutual intelligibility, i.e. the ability of speakers of one language to understand the other language. Under this approach, the higher the linguistic distance, the lower the level of mutual intelligibility.
Since cognate words play an important role in mutual intelligibility between languages, they figure prominently in such analyses. The higher the percentage of cognate (as opposed to non-cognate) words that two languages share, the lower their linguistic distance. Likewise, the greater the degree of grammatical relatedness (i.e. the cognates mean roughly similar things) and lexical relatedness (i.e. the cognates are easily discernible as related words), the lower the linguistic distance. For example, the Hindi-Urdu word pānch is grammatically identical and lexically similar (but non-identical) to its cognate, the Punjabi and Persian word panj, and it is grammatically identical but lexically dissimilar to the Greek pente and the English five. As another example, the English dish and the German Tisch (meaning table) are lexically similar but grammatically dissimilar.
Distances between languages can also be calculated statistically by comparing each language's vocabulary, an approach called lexicostatistics. One technical measure used for this is the Levenshtein distance, the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. Using it, one study compared both Afrikaans and Frisian with Dutch to see which was closer to Dutch. It determined that Dutch and Afrikaans (mutual distance of 20.9%) were considerably closer than Dutch and Frisian (mutual distance of 34.2%). Besides cognates, other aspects that are often measured are similarities of syntax and written forms.
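
The two word-based measures described above can be illustrated with a minimal Python sketch. It is not the procedure of the study cited above: the function names are hypothetical, words are compared as plain spelling rather than phonetic transcription, every edit operation is assumed to cost 1, and dividing by the length of the longer word is only one of several possible normalizations.

```python
from typing import Sequence, Tuple


def cognate_distance(is_cognate: Sequence[bool]) -> float:
    """Percentage of NON-cognate items in a compared word list.

    Assumption: cognacy has already been judged by hand, one flag per item.
    """
    if not is_cognate:
        raise ValueError("word list is empty")
    return 100.0 * (1 - sum(is_cognate) / len(is_cognate))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer word's length (0.0 to 1.0)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(pairs: Sequence[Tuple[str, str]]) -> float:
    """Average normalized distance over aligned word pairs, as a percentage."""
    if not pairs:
        raise ValueError("no word pairs given")
    return 100.0 * sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)
```

Applied to the article's own examples in ASCII transliteration, levenshtein("panch", "panj") and levenshtein("dish", "tisch") are both 2, while levenshtein("five", "pente") is 4, which matches the description of the first two pairs as lexically similar and the last as lexically dissimilar. These word pairs are illustrations only, not data from any study.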
A 2004 paper by economists Barry Chiswick and Paul Miller proposed a metric for linguistic distance based on empirical observations of how rapidly speakers of a given language gained proficiency in another when immersed in a society that overwhelmingly communicated in the latter language. The study examined the speed of English language acquisition among immigrants of various linguistic backgrounds in the United States and Canada.
See also
- Language transfer
- Lexicostatistics
- Second-language acquisition
- Historical linguistics