Affix grammar
Encyclopedia
An affix grammar is a kind of formal grammar
; it is used to describe the syntax
of languages, mainly computer languages, using an approach based on how natural language is typically described.
The grammatical rules of an affix grammar are those of a context-free grammar
, except that certain parts in the nonterminals (the affix
es) are used as arguments. If the same affix occurs multiple times in a rule, its value must agree
, i.e. it must be the same everywhere. In some types of affix grammar, more complex relationships between affix values are possible.
This context-free grammar
describes simple sentences such as
With more nouns and verbs, and more rules to introduce other parts of speech, a large range of English sentences can be described; so this is a promising approach for describing the syntax of English.
However, the given grammar also describes sentences such as
These sentences are wrong: in English, subject and verb have a grammatical number
, which must agree.
An affix grammar can express this directly:
This grammar only describes correct English sentences, although it could be argued that
is still incorrect and should instead read
This, too, can be incorporated using affixes, if the means of describing the relationships between different affix values are powerful enough. As remarked above, these means depend on the type of affix grammar chosen.
Applied in this way, affixes increase compactness of grammars, but do not add expressive power.
Another approach is to allow affixes to take arbitrary strings as values and allow concatenations of affixes to be used in rules. The ranges of allowable values for affixes can be described with context-free grammar rules. This produces the formalism of two-level grammars
, also known as Van Wijngaarden grammar
s or 2VW grammars. These have been successfully used to describe complicated languages, in particular, the syntax of the Algol 68
programming language
. However, it turns out that, even though affix values can only be manipulated with string concatenation, this formalism is Turing complete; hence, even the most basic questions about the language described by an arbitrary 2VW grammar are undecidable in general.
Extended Affix Grammar
s, developed in the 1980s, are a more restricted version of the same idea. They were mainly applied to describe the grammar of natural language, e.g. English.
Another possibility is to allow the values of affixes to be computed by code written in some programming language. Two basic approaches have been used:
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...
; it is used to describe the syntax
Syntax
In linguistics, syntax is the study of the principles and rules for constructing phrases and sentences in natural languages....
of languages, mainly computer languages, using an approach based on how natural language is typically described.
The grammatical rules of an affix grammar are those of a context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....
, except that certain parts in the nonterminals (the affix
Affix
An affix is a morpheme that is attached to a word stem to form a new word. Affixes may be derivational, like English -ness and pre-, or inflectional, like English plural -s and past tense -ed. They are bound morphemes by definition; prefixes and suffixes may be separable affixes...
es) are used as arguments. If the same affix occurs multiple times in a rule, its value must agree
Agreement (linguistics)
In languages, agreement or concord is a form of cross-reference between different parts of a sentence or phrase. Agreement happens when a word changes form depending on the other words to which it relates....
, i.e. it must be the same everywhere. In some types of affix grammar, more complex relationships between affix values are possible.
Example
We can describe an extremely simple fragment of English in the following manner:
Sentence → Subject Predicate
Subject → Noun
Predicate → Verb Object
Object → Noun
Noun → John
Noun → Mary
Noun → children
Noun → parents
Verb → like
Verb → likes
Verb → help
Verb → helps
This context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....
describes simple sentences such as
John likes children
Mary helps John
children help parents
parents like John
With more nouns and verbs, and more rules to introduce other parts of speech, a large range of English sentences can be described; so this is a promising approach for describing the syntax of English.
However, the given grammar also describes sentences such as
John like children
children helps parents
These sentences are wrong: in English, subject and verb have a grammatical number
Grammatical number
In linguistics, grammatical number is a grammatical category of nouns, pronouns, and adjective and verb agreement that expresses count distinctions ....
, which must agree.
An affix grammar can express this directly:
Sentence → Subject+number Predicate+number
Subject+number → Noun+number
Predicate+number → Verb+number Object
Object → Noun+number
Noun+singular → John
Noun+singular → Mary
Noun+plural → children
Noun+plural → parents
Verb+singular → like
Verb+plural → likes
Verb+singular → help
Verb+plural → helps
This grammar only describes correct English sentences, although it could be argued that
John likes John
is still incorrect and should instead read
John likes himself
This, too, can be incorporated using affixes, if the means of describing the relationships between different affix values are powerful enough. As remarked above, these means depend on the type of affix grammar chosen.
Types of affix grammars
In the simplest type of affix grammar, affixes can only take values from a finite domain, and affix values can only be related through agreement, as in the example.Applied in this way, affixes increase compactness of grammars, but do not add expressive power.
Another approach is to allow affixes to take arbitrary strings as values and allow concatenations of affixes to be used in rules. The ranges of allowable values for affixes can be described with context-free grammar rules. This produces the formalism of two-level grammars
Two-level grammar
A two-level grammar is a formal grammar that is used to generate another formal grammar , such as one with an infinite rule set . This is how a Van Wijngaarden grammar was used to specify Algol68 . A context free grammar that defines the rules for a second grammar can yield an effectively infinite...
, also known as Van Wijngaarden grammar
Van Wijngaarden grammar
In computer science, a Van Wijngaarden grammar is a two-level grammar which provides a technique to define potentially infinite context-free grammars in a finite number of rules...
s or 2VW grammars. These have been successfully used to describe complicated languages, in particular, the syntax of the Algol 68
ALGOL 68
ALGOL 68 isan imperative computerprogramming language that was conceived as a successor to theALGOL 60 programming language, designed with the goal of a...
programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
. However, it turns out that, even though affix values can only be manipulated with string concatenation, this formalism is Turing complete; hence, even the most basic questions about the language described by an arbitrary 2VW grammar are undecidable in general.
Extended Affix Grammar
Extended Affix Grammar
In computer science, Extended Affix Grammars are a formal grammar formalism for describing the context free and context sensitive syntax of language, both natural language and programming languages....
s, developed in the 1980s, are a more restricted version of the same idea. They were mainly applied to describe the grammar of natural language, e.g. English.
Another possibility is to allow the values of affixes to be computed by code written in some programming language. Two basic approaches have been used:
- In attribute grammarAttribute grammarAn attribute grammar is a formal way to define attributes for the productions of a formal grammar, associating these attributes to values. The evaluation occurs in the nodes of the abstract syntax tree, when the language is processed by some parser or compiler....
s, the affixes (called attributes) can take values from arbitrary domains (e.g. integer or real numbers, complex data structures) and arbitrary functions can be specified, written in a language of choice, to describe how affix values in rules are derived from each other. - In CDL (the Compiler Description LanguageCompiler Description LanguageCompiler Description Language, or CDL, is a Computer language based on affix grammars. It is very similar to Backus–Naur form notation. It was designed for the development of compilers. It is very limited in its capabilities and control flow; and intentionally so. The benefits of these limitations...
) and its successor CDL2, developed in the 1970s, fragments of source code (usually in assembly languageAssembly languageAn assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture...
) can be used in rules instead of normal right-hand sides, allowing primitives for input scanning and affix value computations to be expressed directly. Designed as a basis for practical compilerCompilerA compiler is a computer program that transforms source code written in a programming language into another computer language...
construction, this approach was used to write compilers, and other software, e.g. a text editorText editorA text editor is a type of program used for editing plain text files.Text editors are often provided with operating systems or software development packages, and can be used to change configuration files and programming language source code....
.