Extended Affix Grammar
Encyclopedia
In computer science
, Extended Affix
Grammars (EAG) are a formal grammar
formalism for describing the context free
and context sensitive
syntax
of language, both natural language
and programming language
s.
EAGs are a member of the family of two-level grammar
s; more specifically, a restriction of Van Wijngaarden grammar
s with the specific purpose of making parsing
feasible.
Like Van Wijngaarden grammars, EAGs have hyperrules that form a context-free grammar
except in that their nonterminals may have arguments, known as affixes, the possible values of which are supplied by another context-free grammar, the metarules.
EAGs introduced and studied by D.A. Watt in 1974; recognizers were developed at the University of Nijmegen between 1985 and 1995. The EAG compiler developed there will generate either a recogniser, a transducer, a translator, or a syntax directed editor for a language described in the EAG formalism. The formalism is quite similar to Prolog
, to the extent that it borrowed its cut operator.
EAGs have been used to write grammars of natural languages such as English, Spanish, and Hungarian. The aim was to verify the grammars by making them parse corpora of text (corpus linguistics
); hence, parsing had to be sufficiently practical. However, the parse tree explosion problem that ambiguities in natural language tend to produce in this type of approach is worsened for EAGs because each choice of affix value may produce a separate parse, even when several different values are equivalent. The remedy proposed was to switch to the much simpler Affix Grammar over a Finite
Lattices (AGFL
) instead, in which metagrammars can only produce simple finite languages.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, Extended Affix
Affix
An affix is a morpheme that is attached to a word stem to form a new word. Affixes may be derivational, like English -ness and pre-, or inflectional, like English plural -s and past tense -ed. They are bound morphemes by definition; prefixes and suffixes may be separable affixes...
Grammars (EAG) are a formal grammar
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...
formalism for describing the context free
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....
and context sensitive
Context-sensitive grammar
A context-sensitive grammar is a formal grammar in which the left-hand sides and right-hand sides of any production rules may be surrounded by a context of terminal and nonterminal symbols...
syntax
Syntax
In linguistics, syntax is the study of the principles and rules for constructing phrases and sentences in natural languages....
of language, both natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...
and programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s.
EAGs are a member of the family of two-level grammar
Two-level grammar
A two-level grammar is a formal grammar that is used to generate another formal grammar , such as one with an infinite rule set . This is how a Van Wijngaarden grammar was used to specify Algol68 . A context free grammar that defines the rules for a second grammar can yield an effectively infinite...
s; more specifically, a restriction of Van Wijngaarden grammar
Van Wijngaarden grammar
In computer science, a Van Wijngaarden grammar is a two-level grammar which provides a technique to define potentially infinite context-free grammars in a finite number of rules...
s with the specific purpose of making parsing
Parsing
In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar...
feasible.
Like Van Wijngaarden grammars, EAGs have hyperrules that form a context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....
except in that their nonterminals may have arguments, known as affixes, the possible values of which are supplied by another context-free grammar, the metarules.
EAGs introduced and studied by D.A. Watt in 1974; recognizers were developed at the University of Nijmegen between 1985 and 1995. The EAG compiler developed there will generate either a recogniser, a transducer, a translator, or a syntax directed editor for a language described in the EAG formalism. The formalism is quite similar to Prolog
Prolog
Prolog is a general purpose logic programming language associated with artificial intelligence and computational linguistics.Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is declarative: the program logic is expressed in terms of...
, to the extent that it borrowed its cut operator.
EAGs have been used to write grammars of natural languages such as English, Spanish, and Hungarian. The aim was to verify the grammars by making them parse corpora of text (corpus linguistics
Corpus linguistics
Corpus linguistics is the study of language as expressed in samples or "real world" text. This method represents a digestive approach to deriving a set of abstract rules by which a natural language is governed or else relates to another language. Originally done by hand, corpora are now largely...
); hence, parsing had to be sufficiently practical. However, the parse tree explosion problem that ambiguities in natural language tend to produce in this type of approach is worsened for EAGs because each choice of affix value may produce a separate parse, even when several different values are equivalent. The remedy proposed was to switch to the much simpler Affix Grammar over a Finite
Finite
Finite is the opposite of infinite. It may refer to:*Finite set, having a number of elements given by some natural number*Finite verb, being inflected for person and for tense...
Lattices (AGFL
Affix grammar over a finite lattice
In linguistics, the affix grammars over a finite lattice formalism is a notation for context-free grammars with finite set-valued features, acceptable to linguists of many different schools....
) instead, in which metagrammars can only produce simple finite languages.
See also
- affix grammarAffix grammarAn affix grammar is a kind of formal grammar; it is used to describe the syntax of languages, mainly computer languages, using an approach based on how natural language is typically described....
- Van Wijngaarden grammarVan Wijngaarden grammarIn computer science, a Van Wijngaarden grammar is a two-level grammar which provides a technique to define potentially infinite context-free grammars in a finite number of rules...
- corpus linguisticsCorpus linguisticsCorpus linguistics is the study of language as expressed in samples or "real world" text. This method represents a digestive approach to deriving a set of abstract rules by which a natural language is governed or else relates to another language. Originally done by hand, corpora are now largely...