SECD machine
Encyclopedia
The SECD machine is a highly influential virtual machine
and abstract machine
intended as a target for functional programming language
compilers. The letters stand for Stack, Environment, Code, Dump, the internal registers of the machine. These registers point to linked list
s in memory.
The machine was the first to be specifically designed to evaluate lambda calculus
expressions. It was originally described by Peter J. Landin
as part of his ISWIM programming language definition in 1963. The description published by Landin was fairly abstract, and left many implementation choices open (like an operational semantics
). Hence the SECD machine is often presented in a more detailed form, such as Peter Henderson's Lispkit Lisp
compiler, which has been distributed since 1980. Since then it has been used as the target for several other experimental compilers.
In 1989 researchers at the University of Calgary
worked on a hardware implementation of the machine.
. Functions take their arguments from the stack. The arguments to built-in instructions are encoded immediately after them in the instruction stream.
Like all internal data-structures, the stack is a list, with the S register pointing at the list's head or beginning. Due to the list structure, the stack need not be a continuous block of memory, so stack space is available as long as there is a single free memory cell. Even when all cells have been used, garbage collection
may yield additional free memory. Obviously, specific implementations of the SECD structure can implement the stack as a canonical stack structure, so improving the overall efficiency of the virtual machine, provided that a strict bound be put on the dimension of the stack.
The C register points at the head of the code or instruction list that will be evaluated. Once the instruction there has been executed, the C is pointed at the next instruction in the list—it is similar to an instruction pointer (or program counter
) in conventional machines, except that subsequent instructions are always specified during execution and are not by default contained in subsequent memory locations, as it is the case with the conventional machines.
The current variable environment is managed by the E register, which points at a list of lists. Each individual list represents one environment level: the parameters of the current function are in the head of the list, variables that are free in the current function, but bound by a surrounding function, are in other elements of E.
The dump, at whose head the D register points, is used as temporary storage for values of the other registers, for example during function calls. It can be likened to the return stack of other machines.
The memory organization of the SECD machine is similar to the model used by most functional language interpreters: a number of memory cells, each of which can hold either an atom (a simple value, for example 13), or represent an empty or non-empty list. In the latter case, the cell holds two pointers to other cells, one representing the first element, the other representing the list except for the first element. The two pointers are traditionally named car and cdr respectively—but the more modern terms head and tail are often used instead. The different types of values that a cell can hold are distinguished by a tag. Often different types of atoms (integers, strings, etc.) are distinguished as well.
So a list holding the numbers 1, 2, and 3, usually written as "(1 2 3)", could be represented as follows:
Address Tag Content (value for integers, car & cdr for lists)
9 [ integer | 2 ]
8 [ integer | 3 ]
7 [ list | 8 | 0 ]
6 [ list | 9 | 7 ]
...
2 [ list | 1 | 6 ]
1 [ integer | 1 ]
0 [ nil ]
The memory cells 3 to 5 do not belong to our list, the cells of which can be distributed randomly over the memory. Cell 2 is the head of the list, it points to cell 1 which holds the first element's value, and the list containing only 2 and 3 (beginning at cell 6). Cell 6 points at a cell holding 2 and at cell 7, which represents the list containing only 3. It does so by pointing at cell 8 containing the value 3, and pointing at an empty list (nil) as cdr. In the SECD machine, cell 0 always implicitly represents the empty list, so no special tag value is needed to signal an empty list (everything needing that can simply point to cell 0).
The principle that the cdr in a list cell must point at another list is just a convention. If both car and cdr point at atoms, that will yield a pair, usually written like "(1 . 2)"
A number of additional instructions for basic functions like car, cdr, list construction, integer addition, I/O, etc. exist. They all take any necessary parameters from the stack.
Virtual machine
A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...
and abstract machine
Abstract machine
An abstract machine, also called an abstract computer, is a theoretical model of a computer hardware or software system used in automata theory...
intended as a target for functional programming language
Functional programming
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
compilers. The letters stand for Stack, Environment, Code, Dump, the internal registers of the machine. These registers point to linked list
Linked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...
s in memory.
The machine was the first to be specifically designed to evaluate lambda calculus
Lambda calculus
In mathematical logic and computer science, lambda calculus, also written as λ-calculus, is a formal system for function definition, function application and recursion. The portion of lambda calculus relevant to computation is now called the untyped lambda calculus...
expressions. It was originally described by Peter J. Landin
Peter J. Landin
Peter John Landin was a British computer scientist. He was one of the first to realize that the lambda calculus could be used to model a programming language, an insight that is essential to development of both functional programming and denotational semantics.- Academic :Landin was born in...
as part of his ISWIM programming language definition in 1963. The description published by Landin was fairly abstract, and left many implementation choices open (like an operational semantics
Operational semantics
In computer science, operational semantics is a way to give meaning to computer programs in a mathematically rigorous way. Operational semantics are classified into two categories: structural operational semantics formally describe how the individual steps of a computation take place in a...
). Hence the SECD machine is often presented in a more detailed form, such as Peter Henderson's Lispkit Lisp
Lispkit Lisp
Lispkit Lisp is a lexically scoped, purely functional subset of Lisp developed as a testbed for functional programming concepts. It was first used for early experimentation with lazy evaluation. An SECD machine-based implementation written in an ALGOL variant was published by the developer Peter...
compiler, which has been distributed since 1980. Since then it has been used as the target for several other experimental compilers.
In 1989 researchers at the University of Calgary
University of Calgary
The University of Calgary is a public research university located in Calgary, Alberta, Canada. Founded in 1966 the U of C is composed of 14 faculties and more than 85 research institutes and centres.More than 25,000 undergraduate and 5,500 graduate students are currently...
worked on a hardware implementation of the machine.
Registers and memory
The SECD machine is stack-basedStack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...
. Functions take their arguments from the stack. The arguments to built-in instructions are encoded immediately after them in the instruction stream.
Like all internal data-structures, the stack is a list, with the S register pointing at the list's head or beginning. Due to the list structure, the stack need not be a continuous block of memory, so stack space is available as long as there is a single free memory cell. Even when all cells have been used, garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...
may yield additional free memory. Obviously, specific implementations of the SECD structure can implement the stack as a canonical stack structure, so improving the overall efficiency of the virtual machine, provided that a strict bound be put on the dimension of the stack.
The C register points at the head of the code or instruction list that will be evaluated. Once the instruction there has been executed, the C is pointed at the next instruction in the list—it is similar to an instruction pointer (or program counter
Program counter
The program counter , commonly called the instruction pointer in Intel x86 microprocessors, and sometimes called the instruction address register, or just part of the instruction sequencer in some computers, is a processor register that indicates where the computer is in its instruction sequence...
) in conventional machines, except that subsequent instructions are always specified during execution and are not by default contained in subsequent memory locations, as it is the case with the conventional machines.
The current variable environment is managed by the E register, which points at a list of lists. Each individual list represents one environment level: the parameters of the current function are in the head of the list, variables that are free in the current function, but bound by a surrounding function, are in other elements of E.
The dump, at whose head the D register points, is used as temporary storage for values of the other registers, for example during function calls. It can be likened to the return stack of other machines.
The memory organization of the SECD machine is similar to the model used by most functional language interpreters: a number of memory cells, each of which can hold either an atom (a simple value, for example 13), or represent an empty or non-empty list. In the latter case, the cell holds two pointers to other cells, one representing the first element, the other representing the list except for the first element. The two pointers are traditionally named car and cdr respectively—but the more modern terms head and tail are often used instead. The different types of values that a cell can hold are distinguished by a tag. Often different types of atoms (integers, strings, etc.) are distinguished as well.
So a list holding the numbers 1, 2, and 3, usually written as "(1 2 3)", could be represented as follows:
Address Tag Content (value for integers, car & cdr for lists)
9 [ integer | 2 ]
8 [ integer | 3 ]
7 [ list | 8 | 0 ]
6 [ list | 9 | 7 ]
...
2 [ list | 1 | 6 ]
1 [ integer | 1 ]
0 [ nil ]
The memory cells 3 to 5 do not belong to our list, the cells of which can be distributed randomly over the memory. Cell 2 is the head of the list, it points to cell 1 which holds the first element's value, and the list containing only 2 and 3 (beginning at cell 6). Cell 6 points at a cell holding 2 and at cell 7, which represents the list containing only 3. It does so by pointing at cell 8 containing the value 3, and pointing at an empty list (nil) as cdr. In the SECD machine, cell 0 always implicitly represents the empty list, so no special tag value is needed to signal an empty list (everything needing that can simply point to cell 0).
The principle that the cdr in a list cell must point at another list is just a convention. If both car and cdr point at atoms, that will yield a pair, usually written like "(1 . 2)"
Instructions
- nil pushes a nil pointer onto the stack
- ldc pushes a constant argument onto the stack
- ld pushes the value of a variable onto the stack. The variable is indicated by the argument, a pair. The pair's car specifies the level, the cdr the position. So "(1 . 3)" gives the current function's (level 1) third parameter.
- sel expects two list arguments, and pops a value from the stack. The first list is executed if the popped value was non-nil, the second list otherwise. Before one of these list pointers is made the new C, a pointer to the instruction following sel is saved on the dump.
- join pops a list reference from the dump and makes this the new value of C. This instruction occurs at the end of both alternatives of a sel.
- ldf takes one list argument representing a function. It constructs a closure (a pair containing the function and the current environment) and pushes that onto the stack.
- ap pops a closure and a list of parameter values from the stack. The closure is applied to the parameters by installing its environment as the current one, pushing the parameter list in front of that, clearing the stack, and setting C to the closure's function pointer. The previous values of S, E, and the next value of C are saved on the dump.
- ret pops one return value from the stack, restores S, E, and C from the dump, and pushes the return value onto the now-current stack.
- dum pushes a "dummy", an empty list, in front of the environment list.
- rap works like ap, only that it replaces an occurrence of a dummy environment with the current one, thus making recursive functions possible
A number of additional instructions for basic functions like car, cdr, list construction, integer addition, I/O, etc. exist. They all take any necessary parameters from the stack.
Further reading
- Danvy, OlivierOlivier DanvyOlivier Danvy is a French computer scientist specializing in programming languages, partial evaluation, and continuations at the University of Aarhus in Denmark.He is notable for the number of scientific papers which acknowledge his help...
. A Rational Deconstruction of Landin's SECD Machine. BRICS research report RS-04-30, 2004. ISSN 0909-0878 - Field, Anthony J. Field and Peter G. Harrison. 1988 Functional Programming. Addison-Wesley. ISBN 0-201-19249-7
- Graham, Brian T. 1992 "The SECD Microprocessor: A Verification Case Study". Springer. ISBN 0792392450
- Henderson, Peter. 1980 Functional Programming: Application and Implementation. Prentice Hall. ISBN 0-13-331579-7
- Kogge, Peter M. The Architecture of Symbolic Computers. ISBN 0-07-035596-7
- Landin, Peter J. 1964. The mechanical evaluation of expressions. Comput. J. 6, 4, 308-320.
- Landin, Peter J. 1966. The next 700 programming languages. Commun. ACM 9, 3, 157-166.