Augmented Backus–Naur form
In computer science, Augmented Backus–Naur Form (ABNF) is a metalanguage based on Backus–Naur Form (BNF), but consisting of its own syntax and derivation rules. The motivating principle for ABNF is to describe a formal system of a language to be used as a bidirectional communications protocol. It is defined by Internet Standard 68 ("STD 68", type case sic), which is RFC 5234, and it often serves as the definition language for IETF communication protocols.
RFC 5234 supersedes RFC 4234 (which superseded RFC 2234).
Introduction
An ABNF specification is a set of derivation rules, written as
rule = definition ; comment CR LF
where rule is a case-insensitive nonterminal, definition consists of sequences of symbols that define the rule, comment is an optional note for documentation, and the line ends with a carriage return and line feed.
Rule names are case-insensitive: <rulename>, <Rulename>, <RULENAME>, and <rUlENamE> all refer to the same rule. Rule names consist of a letter followed by letters, numbers, and hyphens.
Angle brackets (“<”, “>”) are not required around rule names (as they are in BNF). However, they may be used to delimit a rule name in prose so that it can be told apart from the surrounding text.
Terminal values
Terminals are specified by one or more numeric characters.
Numeric characters may be specified as the percent sign “%”, followed by the base (b = binary, d = decimal, and x = hexadecimal), followed by the value, or a concatenation of values (indicated by “.”). For example, a carriage return is specified by %d13 in decimal or %x0D in hexadecimal. A carriage return followed by a line feed may be specified with concatenation as %d13.10.
Literal text is specified through the use of a string enclosed in quotation marks ("). These strings are case-insensitive and the character set used is (US-)ASCII. Therefore the string “abc” will match “abc”, “Abc”, “aBc”, “abC”, “ABc”, “AbC”, “aBC”, and “ABC”. For a case-sensitive match the explicit characters must be defined: to match “aBc” the definition will be %d97 %d66 %d99.
White space
White space is used to separate elements of a definition; for space to be recognized as a delimiter it must be explicitly included.
Concatenation
Rule1 Rule2
A rule may be defined by listing a sequence of rule names.
To match the string “aba” the following rules could be used:
foo = %x61 ; a
bar = %x62 ; b
mumble = foo bar foo
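As an illustration only, the three rules above can be mimicked with small matcher functions in Python. The helper names `literal`, `concat`, and `fullmatch` are hypothetical and not part of ABNF or any library; each matcher takes a text and a start position and returns the position after the match, or `None` on failure.

```python
def literal(ch):
    """Matcher for one exact character, like the numeric terminal %x61."""
    def match(text, pos):
        # Slicing avoids an IndexError at the end of the text.
        return pos + 1 if text[pos:pos + 1] == ch else None
    return match


def concat(*matchers):
    """Matcher for a concatenation: every element must match in order."""
    def match(text, pos):
        for matcher in matchers:
            pos = matcher(text, pos)
            if pos is None:
                return None
        return pos
    return match


def fullmatch(rule, text):
    """True if the rule consumes the whole text."""
    return rule(text, 0) == len(text)


foo = literal("\x61")          # foo = %x61 ; a
bar = literal("\x62")          # bar = %x62 ; b
mumble = concat(foo, bar, foo)  # mumble = foo bar foo

print(fullmatch(mumble, "aba"))  # True
print(fullmatch(mumble, "abA"))  # False: %x61 is exactly lowercase "a"
```

Because `foo` and `bar` are numeric terminals rather than quoted strings, the match is case-sensitive, which is why "abA" is rejected.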
Alternative
Rule1 / Rule2
A rule may be defined by a list of alternative rules separated by a solidus ("/").
To accept the rule foo or the rule bar the following rule could be constructed:
foobar = foo / bar
Incremental alternatives
Rule1 =/ Rule2
Additional alternatives may be added to a rule through the use of “=/” between the rule name and the definition.
The rule
ruleset = alt1 / alt2 / alt3 / alt4 / alt5
is equivalent to
ruleset = alt1 / alt2
ruleset =/ alt3
ruleset =/ alt4 / alt5
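A minimal Python sketch of how a tool might fold incremental alternatives into a single rule is shown here. The function `load_rules` is hypothetical, and it makes simplifying assumptions: one rule per line, and no “/”, “=” or “;” inside quoted strings or groups. Rule names are lowercased before lookup because ABNF rule names are case-insensitive.

```python
def load_rules(lines):
    """Collect 'name = alts' and 'name =/ alts' lines into {name: [alternatives]}."""
    rules = {}
    for line in lines:
        line = line.split(";", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        name, rest = line.split("=", 1)
        incremental = rest.startswith("/")    # "=/" adds to an existing rule
        body = rest[1:] if incremental else rest
        key = name.strip().lower()            # rule names are case-insensitive
        alternatives = [alt.strip() for alt in body.split("/")]
        if incremental:
            rules[key].extend(alternatives)   # KeyError if the rule is not defined yet
        else:
            rules[key] = alternatives
    return rules


single = load_rules(["ruleset = alt1 / alt2 / alt3 / alt4 / alt5"])
split = load_rules([
    "ruleset = alt1 / alt2",
    "ruleset =/ alt3",
    "RuleSet =/ alt4 / alt5 ; same rule, different case",
])
print(single == split)  # True
```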
Value range
%c##-##
A range of numeric values may be specified through the use of a hyphen (“-”).
The rule
OCTAL = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7"
is equivalent to
OCTAL = %x30-37
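To make the numeric notation concrete, the following Python sketch expands a numeric terminal into the strings it can match. The function name `expand_num_val` is an assumption made for illustration; it handles a single value, a “.” concatenation, or a “-” range, and does no further validation.

```python
BASES = {"b": 2, "d": 10, "x": 16}


def expand_num_val(token):
    """Return the list of strings that a numeric terminal such as %x30-37 matches."""
    if len(token) < 3 or not token.startswith("%"):
        raise ValueError("not a numeric terminal: %r" % token)
    base = BASES[token[1].lower()]
    digits = token[2:]
    if "-" in digits:
        # Value range: one single-character alternative per code point.
        low, high = (int(part, base) for part in digits.split("-"))
        return [chr(code) for code in range(low, high + 1)]
    # Single value or "."-separated concatenation: exactly one string.
    return ["".join(chr(int(part, base)) for part in digits.split("."))]


print(expand_num_val("%x30-37"))  # ['0', '1', '2', '3', '4', '5', '6', '7']
print(expand_num_val("%d13.10"))  # ['\r\n']
print(expand_num_val("%x0D"))     # ['\r']
```

The first call reproduces the eight alternatives of the OCTAL rule, which is the equivalence stated above.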
Sequence group
(Rule1 Rule2)
Elements may be placed in parentheses to group rules in a definition.
To match “elem fubar snafu” or “elem tarfu snafu” the following rule could be constructed:
group = elem (fubar / tarfu) snafu
To match “elem fubar” or “tarfu snafu”, either of the following equivalent rules could be constructed:
group = elem fubar / tarfu snafu
group = (elem fubar) / (tarfu snafu)
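The difference between the grouped and ungrouped forms can be checked with a small Python sketch. It assumes, purely for illustration, that each rule name (elem, fubar, tarfu, snafu) matches its own name as a single whitespace-separated word; the helpers `tok`, `concat`, `alt`, and `accepts` are hypothetical. Each matcher returns the set of positions at which a match can end, so alternatives are explored without depending on their order.

```python
def tok(name):
    """Matcher for a single word equal to name."""
    def match(tokens, pos):
        return {pos + 1} if pos < len(tokens) and tokens[pos] == name else set()
    return match


def concat(*matchers):
    """Concatenation: thread every possible end position through each element."""
    def match(tokens, pos):
        ends = {pos}
        for matcher in matchers:
            ends = {end for start in ends for end in matcher(tokens, start)}
        return ends
    return match


def alt(*matchers):
    """Alternative: union of the end positions of all alternatives."""
    def match(tokens, pos):
        return set().union(*(matcher(tokens, pos) for matcher in matchers))
    return match


def accepts(rule, text):
    tokens = text.split()
    return len(tokens) in rule(tokens, 0)


elem, fubar, tarfu, snafu = (tok(n) for n in ("elem", "fubar", "tarfu", "snafu"))

grouped = concat(elem, alt(fubar, tarfu), snafu)             # elem (fubar / tarfu) snafu
ungrouped = alt(concat(elem, fubar), concat(tarfu, snafu))   # elem fubar / tarfu snafu

print(accepts(grouped, "elem tarfu snafu"))    # True
print(accepts(grouped, "elem fubar"))          # False
print(accepts(ungrouped, "elem fubar"))        # True
print(accepts(ungrouped, "elem fubar snafu"))  # False
```

The ungrouped rule is built as an alternative of two concatenations because concatenation binds more tightly than the alternative operator.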
Variable repetition
n*nRule
To indicate repetition of an element the form
<a>*<b>element
is used. The optional <a> gives the minimum number of elements to be included, with a default of 0. The optional <b> gives the maximum number of elements to be included, with a default of infinity.
Use *element for zero or more elements, 1*element for one or more elements, and 2*3element for two or three elements.
Specific repetition
nRule
To indicate an explicit number of elements the form
<a>element
is used and is equivalent to <a>*<a>element.
Use 2DIGIT to get two numeric digits and 3DIGIT to get three numeric digits. (DIGIT is defined below under 'Core rules'. Also see zip-code in the example below.)
Optional sequence
[Rule]
To indicate an optional element the following constructions are equivalent:
[fubar snafu]
*1(fubar snafu)
0*1(fubar snafu)
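The repetition forms (variable, specific, and optional) map naturally onto regular-expression quantifiers. The sketch below is one possible translation, not part of the ABNF standard: the function `quantifier` is a hypothetical helper that converts an ABNF repeat prefix into a Python `re` quantifier, and the element patterns are written by hand.

```python
import re


def quantifier(repeat):
    """Translate an ABNF repeat prefix ('*', '1*', '2*3', '2', '*1') into a regex quantifier."""
    if repeat == "":
        return ""                      # no prefix: exactly one occurrence
    if "*" in repeat:
        low, high = repeat.split("*")  # either side may be empty
        return "{%s,%s}" % (low or "0", high)
    return "{%s}" % repeat             # specific repetition, same as n*n


DIGIT = "[0-9]"

print(quantifier("*"))    # {0,}
print(quantifier("2*3"))  # {2,3}
print(quantifier("2"))    # {2}

# 2*3DIGIT accepts two or three digits, but not four.
print(bool(re.fullmatch(DIGIT + quantifier("2*3"), "123")))   # True
print(bool(re.fullmatch(DIGIT + quantifier("2*3"), "1234")))  # False

# [fubar snafu] is equivalent to *1(fubar snafu): zero or one occurrence.
optional = "(?:fubar snafu)" + quantifier("*1")
print(bool(re.fullmatch(optional, "")))  # True
```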
Comment
; comment
A semicolon (“;”) starts a comment that continues to the end of the line.
Operator precedence
The following operators have the given precedence from tightest binding to loosest binding:
- Strings, Names formation
- Comment
- Value range
- Repetition
- Grouping, Optional
- Concatenation
- Alternative
Use of the alternative operator with concatenation may be confusing and it is recommended that grouping be used to make explicit concatenation groups.
Core rules
The core rules are defined in the ABNF standard.

Rule | Formal Definition | Meaning |
---|---|---|
ALPHA | %x41-5A / %x61-7A | Upper- and lower-case ASCII letters (A–Z, a–z) |
DIGIT | %x30-39 | Decimal digits (0–9) |
HEXDIG | DIGIT / "A" / "B" / "C" / "D" / "E" / "F" | Hexadecimal digits (0–9, A–F) |
DQUOTE | %x22 | Double Quote |
SP | %x20 | space |
HTAB | %x09 | horizontal tab |
WSP | SP / HTAB | space and horizontal tab |
LWSP | *(WSP / CRLF WSP) | linear white space (past newline) |
VCHAR | %x21-7E | visible (printing) characters |
CHAR | %x01-7F | any ASCII character, excluding NUL |
OCTET | %x00-FF | 8 bits of data |
CTL | %x00-1F / %x7F | controls |
CR | %x0D | carriage return |
LF | %x0A | linefeed |
CRLF | CR LF | Internet standard newline |
BIT | "0" / "1" | binary digit |
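As a rough illustration, several of the core rules in the table can be written as Python regular expressions. The dictionary `CORE_RULES` and the helper `matches` are hypothetical names, and only a subset of the table is covered. HEXDIG is written to accept lowercase letters as well, on the assumption that the quoted strings "A" to "F" are case-insensitive like any other ABNF literal.

```python
import re

# Hand-translated subset of the core rules; the keys mirror the table above.
CORE_RULES = {
    "ALPHA": r"[\x41-\x5A\x61-\x7A]",
    "DIGIT": r"[\x30-\x39]",
    "HEXDIG": r"[0-9A-Fa-f]",   # quoted strings match case-insensitively
    "BIT": r"[01]",
    "WSP": r"[\x20\x09]",
    "VCHAR": r"[\x21-\x7E]",
    "CTL": r"[\x00-\x1F\x7F]",
    "CRLF": r"\x0D\x0A",
}
# LWSP = *(WSP / CRLF WSP)
CORE_RULES["LWSP"] = r"(?:%s|%s%s)*" % (
    CORE_RULES["WSP"], CORE_RULES["CRLF"], CORE_RULES["WSP"])


def matches(rule, text):
    """True if the whole text matches the named core rule."""
    return re.fullmatch(CORE_RULES[rule], text) is not None


print(matches("ALPHA", "q"))        # True
print(matches("VCHAR", " "))        # False: space is %x20, outside %x21-7E
print(matches("LWSP", " \r\n\t"))   # True: WSP, then CRLF followed by WSP
print(matches("LWSP", "\r\n"))      # False: a CRLF must be followed by WSP
```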
Example
The postal address example given in the Backus–Naur Form (BNF) page may be specified as follows: