Parsing table
Encyclopedia
A parsing table is the part of a parser that makes decisions about how to treat input tokens in compiler
development.
that is generated from the context-free grammar
of the language to be parsed. This is possible because the set of viable prefix
es (that is, the sequence of input that can be set aside before a parser needs to either perform a reduction or return an error) of a context-free language
is a regular language
, so the parser can use a simple parse state table to recognize each viable prefix and determine what needs to be done next.
A parsing table is used with a stack and an input buffer. The stack starts out empty, and symbols are shifted onto it one by one from the input buffer with associated states; in each step, the parsing table is consulted to decide what action to take.
More simply, an Action table defines what to do when a terminal symbol is encountered, and a goto table defines what to do when a nonterminal is encountered.
The parsing table consists of two parts, the action table and the goto table. The action table takes the state at the top of the stack and the next symbol in the input buffer (called the "lookahead" symbol) and returns the action to take, and the next state to push onto the stack. The goto table returns the next state for the parser to enter when a reduction exposes a new state on the top of the stack. The goto table is needed to convert the operation of the deterministic finite automaton of the Action table to a pushdown automaton
.
For example, a parsing table might take a current position in state 1 and an input of "+" from the input buffer, and return instructions to shift the current symbol onto the stack and move to state 4. More precisely, the parser would push the token "+" onto the stack, followed by the symbol 4. Or, for an example of the use of a goto table, the current stack might contain "E+E", which is supposed to reduce to "T". After that reduction, looking up "T" in the goto table row corresponding to the state currently at the top of the stack (that is, the state that was under the "E+E" popped off to do the reduction) would tell the parser to push state 2 on top of "T". Action may be a Reduction, a Shift, or Accept or None.
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
development.
Overview
A parsing table is a table describing what action its parser should take when a given input comes while it is in a given state. It is a tabular representation of a pushdown automatonPushdown automaton
In automata theory, a pushdown automaton is a finite automaton that can make use of a stack containing data.- Operation :Pushdown automata differ from finite state machines in two ways:...
that is generated from the context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....
of the language to be parsed. This is possible because the set of viable prefix
Viable prefix
The set of prefixes of right sentential forms that can appear on the stack of a shift-reduce parser are called viable prefixes. An equivalent definition of a viable prefix is that it is a prefix of right sentential form that does not continue past the right end of the rightmost handle of that...
es (that is, the sequence of input that can be set aside before a parser needs to either perform a reduction or return an error) of a context-free language
Context-free language
In formal language theory, a context-free language is a language generated by some context-free grammar. The set of all context-free languages is identical to the set of languages accepted by pushdown automata.-Examples:...
is a regular language
Regular language
In theoretical computer science and formal language theory, a regular language is a formal language that can be expressed using regular expression....
, so the parser can use a simple parse state table to recognize each viable prefix and determine what needs to be done next.
A parsing table is used with a stack and an input buffer. The stack starts out empty, and symbols are shifted onto it one by one from the input buffer with associated states; in each step, the parsing table is consulted to decide what action to take.
More simply, an Action table defines what to do when a terminal symbol is encountered, and a goto table defines what to do when a nonterminal is encountered.
The parsing table consists of two parts, the action table and the goto table. The action table takes the state at the top of the stack and the next symbol in the input buffer (called the "lookahead" symbol) and returns the action to take, and the next state to push onto the stack. The goto table returns the next state for the parser to enter when a reduction exposes a new state on the top of the stack. The goto table is needed to convert the operation of the deterministic finite automaton of the Action table to a pushdown automaton
Pushdown automaton
In automata theory, a pushdown automaton is a finite automaton that can make use of a stack containing data.- Operation :Pushdown automata differ from finite state machines in two ways:...
.
For example, a parsing table might take a current position in state 1 and an input of "+" from the input buffer, and return instructions to shift the current symbol onto the stack and move to state 4. More precisely, the parser would push the token "+" onto the stack, followed by the symbol 4. Or, for an example of the use of a goto table, the current stack might contain "E+E", which is supposed to reduce to "T". After that reduction, looking up "T" in the goto table row corresponding to the state currently at the top of the stack (that is, the state that was under the "E+E" popped off to do the reduction) would tell the parser to push state 2 on top of "T". Action may be a Reduction, a Shift, or Accept or None.
Reduction
- Happens when the parser recognizes a sequence of symbols to represent a single nonterminal, and substitutes that nonterminal for the sequence. In an LR parserLR parserIn computer science, an LR parser is a parser that reads input from Left to right and produces a Rightmost derivation. The term LR parser is also used; where the k refers to the number of unconsumed "look ahead" input symbols that are used in making parsing decisions...
, this means that the parser will pop 2*N entries from the stack (N being the length of the sequence recognized - this number is doubled because with each token pushed to the stack, a state is pushed on top, and removing the token requires that the parser also remove the state) and push the recognized nonterminal in its place. This nonterminal will not have an associated state, so to determine the next parse state, the parser will use the goto table.
- This explanation can be made more precise using the "E+E" example above. Suppose the input buffer is currently empty and the contents of the stack are, from bottom to top:
- 0 E 1 + 5 E 1
- The parser sees that there are no more tokens in the input stream and looks up the empty token in the parsing table. The table contains an instruction, for example, "r4" - this means to use reduction 4 to reduce the contents of the stack. Reduction 4, part of the grammar used to specify the parser's context-free languageContext-free languageIn formal language theory, a context-free language is a language generated by some context-free grammar. The set of all context-free languages is identical to the set of languages accepted by pushdown automata.-Examples:...
, should be something along the lines of:
- T -> E+E
- If the parser were to reduce the "E+E" to a "T", it would pop six symbols from the stack, leaving the stack containing:
- and then push the new "T", followed by the state number found in the goto table at row 0, column T.
Shift
- In this case, the parser makes the choice to shift the next symbol onto the stack. The token is moved from the input buffer to the stack along with an associated state, also specified in the action table entry, which will be the next parser state. The parser then advances to the next token.
Accept
- When a parse is complete and the sequence of tokens in question can be matched to an accept state in the parse table, the parser terminates, and may report success in parsing the input. The input in question is valid for the selected grammar.