Meaning-Text Theory
Meaning–text theory (MTT) is a theoretical linguistic framework, first put forward in Moscow by Aleksandr Žolkovskij and Igor Mel’čuk, for the construction of models of natural language. The theory provides a large and elaborate basis for linguistic description and, due to its formal character, lends itself particularly well to computer applications, including machine translation, phraseology, and lexicography. General overviews of the theory can be found in Mel’čuk (1981) and (1988).
Levels of representation
Linguistic models in MTT operate on the principle that language consists in a mapping from the content or meaning (semantics) of an utterance to its form or text (phonetics). Intermediate between these poles are additional levels of representation at the syntactic and morphological levels.
Representations at the different levels are mapped, in sequence, from the unordered network of the Semantic Representation (SemR) through the dependency tree-structures of the Syntactic Representation (SyntR) to a linearized chain of morphemes of the Morphological Representation (MorphR) and, ultimately, the temporally-ordered string of phones of the Phonetic Representation (PhonR) (not generally addressed in work in this theory). The relationships between representations on the different levels are considered to be translations or mappings, rather than transformations, and are mediated by sets of rules, called "components", which ensure the appropriate, language-specific transitions between levels.
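The sequence of levels and the components between them can be pictured as a chain of functions. The following is only a minimal sketch under simplifying assumptions: representations are reduced to strings, and `make_component`, `COMPONENTS`, and `synthesize` are hypothetical names introduced for illustration, not part of any MTT implementation. It uses the four levels named above and ignores the deep/surface split within syntax and morphology.

```python
from typing import Callable, Dict, List, Tuple

# Levels of representation in the order named in the text, from meaning to text.
LEVELS: List[str] = ["SemR", "SyntR", "MorphR", "PhonR"]

# A component is modelled as a function from one representation to the next.
# Representations are plain strings in this toy; real ones are networks,
# dependency trees, and morpheme chains.
Component = Callable[[str], str]


def make_component(source: str, target: str) -> Component:
    """Return a placeholder component that only records the transition it performs."""
    def component(representation: str) -> str:
        return f"{target}({representation})"
    component.__name__ = f"{source}_to_{target}"
    return component


# One component per pair of adjacent levels (hypothetical stand-ins for rule sets).
COMPONENTS: Dict[Tuple[str, str], Component] = {
    (src, dst): make_component(src, dst)
    for src, dst in zip(LEVELS, LEVELS[1:])
}


def synthesize(semr: str) -> List[str]:
    """Map a semantic representation through every level, keeping each intermediate result."""
    results = [semr]
    for src, dst in zip(LEVELS, LEVELS[1:]):
        results.append(COMPONENTS[(src, dst)](results[-1]))
    return results


for level, representation in zip(LEVELS, synthesize("meaning")):
    print(level, "=", representation)
```

Each component only sees the output of the level directly above it, which mirrors the idea that levels are related by successive mappings rather than by transformations of a single structure.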
Semantic Representation
Semantic representations (SemR) in meaning–text theory consist primarily of a web-like semantic structure (SemS) which combines with other semantic-level structures (most notably the Semantic-Communicative Structure [SemCommS], which represents what is commonly referred to as "Information Structure" in other frameworks). The SemS itself consists of a network of predications, represented as nodes with arrows running from predicate nodes to argument node(s). Arguments can be shared by multiple predicates, and predicates can themselves be arguments of other predicates. Nodes generally correspond to lexical and grammatical meanings as these are directly expressed by items in the lexicon or by inflectional means, but the theory allows the option of decomposing meanings into more fine-grained representations via processes of semantic paraphrasing, which are also key to dealing with synonymy and translation equivalences between languages. SemRs are mapped onto the next level of representation, the Deep-Syntactic Representation, by the rules of the Semantic Component, which allow for a one-to-many relationship between levels (that is, one SemR can potentially be expressed by a variety of syntactic structures, depending on lexical choice, the complexity of the SemR, etc.).
Syntactic Representation
Syntactic representations in MTT are implemented using dependency trees, which constitute the Syntactic Structure (SyntS). SyntS is accompanied by various other types of structure, most notably the syntactic communicative structure and the anaphoric structure. There are two levels of syntax in MTT, the deep syntactic representation (DSyntR) and the surface syntactic representation (SSyntR). A good overview of MTT syntax, including its descriptive application, can be found in Mel’čuk (1988). A comprehensive model of English surface syntax is presented in Mel’čuk & Pertsov (1987).
The deep syntactic representation (DSyntR) is related directly to SemS and seeks to capture the "universal" aspects of the syntactic structure. Trees at this level represent dependency relations between lexemes (or between lexemes and a limited inventory of abstract entities such as lexical functions). Deep syntactic relations between lexemes at DSyntR are restricted to a universal inventory of a dozen or so syntactic relations, including seven ranked actantial (argument) relations, the modificative relation, and the coordinative relation. Lexemes with purely grammatical function, such as lexically-governed prepositions, are not included at this level of representation; values of inflectional categories that are derived from SemR but implemented by the morphology are represented as subscripts on the lexical nodes they bear on. DSyntR is mapped onto the next level of representation by rules of the deep-syntactic component.
The surface-syntactic representation (SSyntR) represents the language-specific syntactic structure of an utterance and includes nodes for all the lexical items (including those with purely grammatical function) in the sentence. Syntactic relations between lexical items at this level are not restricted and are considered to be completely language-specific, although many are believed to be similar (or at least isomorphic) across languages. SSyntR is mapped onto the next level of representation by rules of the surface-syntactic component.
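The difference between the two syntactic levels can be illustrated with a small dependency-tree structure. This is a rough sketch, not a rendering of any published analysis: the `Node` class, the sample sentence (Bill depends on Mary), and the relation labels are assumptions chosen for illustration. The deep tree omits the lexically governed preposition ON, while the surface tree gives it a node of its own.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in a dependency tree: a lexeme plus labelled arcs to its dependents."""
    lexeme: str
    relation: Optional[str] = None  # label of the arc from the governor; None for the root
    dependents: List["Node"] = field(default_factory=list)

    def add(self, relation: str, lexeme: str) -> "Node":
        """Attach a new dependent under this node and return it."""
        child = Node(lexeme, relation)
        self.dependents.append(child)
        return child


def lexemes(node: Node) -> List[str]:
    """Collect all lexemes in the tree, governor before dependents.

    The list order is an artefact of traversal only; the trees themselves
    carry no linear order.
    """
    out = [node.lexeme]
    for dep in node.dependents:
        out.extend(lexemes(dep))
    return out


# Deep-syntactic tree: actantial relations only, no governed preposition.
# "I" and "II" are illustrative labels for the first and second actant.
deep = Node("DEPEND")
deep.add("I", "BILL")
deep.add("II", "MARY")

# Surface-syntactic tree: the preposition ON appears as a node.
# The relation names here are illustrative, language-specific labels.
surface = Node("DEPEND")
surface.add("subjectival", "BILL")
on = surface.add("objectival", "ON")
on.add("prepositional", "MARY")

print(lexemes(deep))
print(lexemes(surface))
```

A deep-syntactic component would have to introduce the ON node when mapping `deep` onto `surface`, using information about the governing lexeme that is stored in the lexicon.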
Morphological representation
Morphological representations in MTT are implemented as strings of morphemes arranged in a fixed linear order reflecting the ordering of elements in the actual utterance. This is the first representational level at which linear precedence is considered to be linguistically significant, effectively grouping word order together with morphological processes and prosody as one of the three non-lexical means with which languages can encode syntactic structure. As with Syntactic Representation, there are two levels of Morphological Representation: Deep and Surface. Detailed descriptions of MTT Morphological Representations are found in Mel’čuk (1993–2000) and Mel’čuk (2006).
Deep Morphological Representation (DMorphR) consists of strings of lexemes and morphemes, e.g., THE SHOE+{PL} ON BILL+{POSS} FOOT+{PL}. The rules of the deep morphological component map this string onto the Surface Morphological Representation (SMorphR), converting morphemes into the appropriate morphs and performing the morphological operations that implement non-concatenative processes; for the example above, this gives /the shoe+s on Bill+s feet/. Rules of the surface morphological component, a subset of which are morphophonemic rules, map the SMorphR onto a phonetic representation, [ðə ʃuz an bɪlz fit].
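The DMorphR-to-SMorphR step for the example above can be mimicked with a few lookup tables. This is a toy sketch, assuming a hypothetical encoding of each word as a lexeme plus a tuple of morphemes; the tables `SUFFIX`, `IRREGULAR`, and `STEM` and the function `to_smorphr` are invented for illustration and cover only this one example.

```python
from typing import Dict, List, Tuple

# Each word of the DMorphR: (lexeme, morphemes attached to it).
Word = Tuple[str, Tuple[str, ...]]

DMORPHR: List[Word] = [
    ("THE", ()),
    ("SHOE", ("PL",)),
    ("ON", ()),
    ("BILL", ("POSS",)),
    ("FOOT", ("PL",)),
]

# Regular, concatenative morph for each morpheme.
SUFFIX: Dict[str, str] = {"PL": "s", "POSS": "s"}

# Non-concatenative forms, listed per (lexeme, morpheme) pair.
IRREGULAR: Dict[Tuple[str, str], str] = {("FOOT", "PL"): "feet"}

# Surface spelling of stems that are not simply the lower-cased lexeme name.
STEM: Dict[str, str] = {"BILL": "Bill"}


def to_smorphr(dmorphr: List[Word]) -> str:
    """Convert each lexeme and its morphemes into morphs, keeping linear order."""
    words = []
    for lexeme, morphemes in dmorphr:
        form = STEM.get(lexeme, lexeme.lower())
        for morpheme in morphemes:
            if (lexeme, morpheme) in IRREGULAR:
                # Non-concatenative process: replace the whole form.
                form = IRREGULAR[(lexeme, morpheme)]
            else:
                # Concatenative process: attach the regular morph.
                form = f"{form}+{SUFFIX[morpheme]}"
        words.append(form)
    return " ".join(words)


print(to_smorphr(DMORPHR))  # the shoe+s on Bill+s feet
```

The morphophonemic rules of the surface morphological component, which would turn this string into a phonetic representation, are not modelled here.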
The lexicon
A crucial aspect of MTT is the lexicon, considered to be a comprehensive catalogue of the lexical units (LUs) of a language, these units being the lexemes, collocations and other phrasemes, constructions, and other configurations of linguistic elements that are learned and implemented in speech by users of language. The lexicon in MTT is represented by the Explanatory Combinatorial Dictionary (ECD), which includes entries for all of the LUs of a language along with information speakers must know regarding their syntactics (the LU-specific rules and conditions on their combinatorics). An ECD for Russian was produced by Mel’čuk et al. (1984), and ECDs for French were published as Mel’čuk et al. (1999) and Mel’čuk & Polguère (2007).
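One way to picture LU-specific combinatorial information is as a table stored with each entry. The sketch below is only an illustration: the `Entry` and `Lexicon` classes and their field names are hypothetical and do not reflect the layout of a real ECD entry. The sample values are the Magn collocations mentioned in the Lexical functions section of this article.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    """A toy dictionary entry: a lexical unit and the values of its lexical functions."""
    lexical_unit: str
    lexical_functions: Dict[str, str] = field(default_factory=dict)


class Lexicon:
    """A toy lexicon keyed by lexical unit."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def add(self, entry: Entry) -> None:
        self._entries[entry.lexical_unit] = entry

    def apply(self, function: str, lexical_unit: str) -> Optional[str]:
        """Return the value of function(lexical_unit), or None if it is not recorded."""
        entry = self._entries.get(lexical_unit)
        if entry is None:
            return None
        return entry.lexical_functions.get(function)


ecd = Lexicon()
ecd.add(Entry("RAIN", {"Magn": "HEAVY"}))
ecd.add(Entry("WIND", {"Magn": "STRONG"}))
ecd.add(Entry("BOMBARDMENT", {"Magn": "INTENSE"}))

print(ecd.apply("Magn", "RAIN"))  # HEAVY
print(ecd.apply("Magn", "WIND"))  # STRONG
```

Storing the value with the keyword reflects the point that the choice of intensifier is a fact about the individual LU rather than something predictable from its meaning.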
Lexical functions
One important discovery of meaning–text linguistics was the recognition that LUs in a language can be related to one another in an abstract semantic sense and that this same relation also holds across many lexically-unrelated pairs or sets of LUs. These relations are represented in MTT as lexical functions (LF). An example of a simple LF is Magn(L), which represents collocations used in intensification such as heavy rain, strong wind, or intense bombardment. A speaker of English knows that for a given lexeme L such as RAIN the value of Magn(RAIN) = HEAVY, whereas Magn(WIND) = STRONG, and so on. MTT currently recognizes several dozen standard LFs that are known to recur across languages.
External links
- The Meaning-Text Theory web site, which hosts the proceedings of the biannual MTT conference
- Observatoire de linguistique Sens-Texte (OLST)
- Meaning–Text @ neuvel.net, an excellent introduction to the theory
- Meaning–Text on-line library
Meaning-text Software
- Semantic search engine based on Meaning-Text theory provided by Inbenta
- Carabao Language Kit provided by Digital Sonata
- ETAP-3 Linguistic Processing System, described as 'a Full-Fledged NLP Implementation of the Meaning-Text Theory'
General
- Žolkovskij, A.K. and Mel’čuk, Igor A. (1965). O vozmožnom metode i instrumentax semantičeskogo sinteza (On a possible method and instruments for semantic synthesis). Naučno-texničeskaja informacija 5, 23–28.
- И. А. Мельчук. Опыт теории лингвистических моделей «Смысл ↔ Текст». М., 1974 (2nd ed., 1999).
- И. А. Мельчук. Русский язык в модели «Смысл ↔ Текст». Москва-Вена, 1995.
- I. A. Mel’čuk. Vers une linguistique Sens-Texte. Leçon inaugurale. P.: Collège de France, Chaire internationale, 1997.
- Leo Wanner (ed.), Recent Trends in Meaning-Text Theory. Amsterdam, Philadelphia: J. Benjamins Pub., 1997. ISBN 1-55619-925-2, ISBN 90-272-3042-0
- I.A. Bolshakov, A.F. Gelbukh. The Meaning-Text Model: Thirty Years After. J. International Forum on Information and Documentation, FID 519, ISSN 0304-9701, N 1, 2000.
Syntax
- И. А. Мельчук. Поверхностный синтаксис русских числовых выражений. Wien: Wiener Slawistischer Almanach, 1985.
- I. A. Mel’čuk & N. V. Pertsov. Surface syntax of English: A formal model within the Meaning-Text framework. Amsterdam, Philadelphia: Benjamins, 1987. ISBN 90-272-1515-4
- I. A. Mel’čuk. Dependency syntax: Theory and practice. Albany, NY: SUNY, 1988. ISBN 0-88706-450-7, ISBN 0-88706-451-5
- I. A. Mel’čuk. Actants in Semantics and Syntax. I, II. Linguistics, 2004, 42:1, 1–66; 42:2, 247–291.
Morphology
- I. A. Mel’čuk. Cours de morphologie générale, vol. 1–5. Montréal: Les Presses de l’Université de Montréal/Paris: CNRS Éditions, 1993–2000.
- I. A. Mel'čuk. Aspects of the Theory of Morphology. Berlin; New York: Mouton de Gruyter, 2006. ISBN 3-11-017711-0
Lexicography
- I. A. Mel’čuk, A. K. Zholkovsky, Ju. D. Apresjan et al. Explanatory Combinatorial Dictionary of Modern Russian: Semantico-Syntactic Studies of Russian Vocabulary / Толково-комбинаторный словарь современного русского языка: Опыты семантико-синтаксического описания русской лексики. Wien: Wiener Slawistischer Almanach, 1984.
- I. A. Mel’čuk, A. Clas & A. Polguère. Introduction à la lexicologie explicative et combinatoire. P.: Duculot, 1995. ISBN 2-8011-1106-6
- I. A. Mel’čuk et al. Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques IV. Montréal: Les Presses de l’Université de Montréal, 1999. ISBN 2-7606-1738-6