Thunk (functional programming)
Encyclopedia
In computer science
, a thunk (also suspension, suspended computation or delayed computation) is a parameterless
closure
created to prevent the evaluation of an expression until forced at a later time. In lazy
languages thunks are created and forced implicitly. In eager languages thunks can be created explicitly by wrapping an expression in a parameterless anonymous function
and forced by calling it in order to emulate lazy evaluation. Some functional languages require functions to take at least one parameter, requiring the anonymous function to take a dummy parameter, usually of unit type
, instead.
typically use a thunk to refer to an incompletely evaluated expression. In this context, the thunk is simply a placeholder for a part of computation which, when executed, returns either the fully evaluated expression, or returns a tag, which shows that there is a part of the expression which is yet to be instantiated, in which case blind, mindless computation cannot yet be performed (because, for example, some variable in the above-mentioned algebraic formula is not yet mapped to a number — hence the formula does not yet reduce to a concrete number).
), the "computation" performed by the thunk may include a lookup in the run-time context of the program to determine the current binding of a variable.
Thunks were invented by Peter Zilahy Ingerman in 1961. According to the inventor, the name "thunk" came about because it could be optimized by the compiler by "thinking about it", so "thunk", according to its inventor, "is the past tense of 'think' at two in the morning"
An early implementation of thunks for call-by-name was in Algol 60
.
begin
integer idx;
real procedure sum (i, lo, hi, term);
value lo, hi;
integer i, lo, hi;
real term;
comment term is passed by-name, and so is i;
begin
real temp;
temp := 0;
for i := lo step 1 until hi do
temp := temp + term;
sum := temp
end;
print (sum (idx, 1, 100, 1/idx))
end
The above example (see Jensen's Device
) relies on the fact that the actual parameters
begin
integer idx;
real sum;
begin
real temp;
temp := 0;
for idx := 1 step 1 until 100 do
temp := temp + 1/idx;
sum := temp
end;
print (sum)
end
Notice that the expression
using thunks to implement call-by-name would process the original code as if it had been written using function pointers or lambdas (represented below in Algol-like pseudocode):
begin
integer idx;
real procedure sum (i_lvalue, lo, hi, term_rvalue);
value lo, hi;
integer lo, hi;
thunk i_lvalue;
thunk term_rvalue;
begin
real temp;
temp := 0;
for i_lvalue := lo step 1 until hi do
temp := temp + term_rvalue;
sum := temp
end;
procedure lvalue_of_idx ;
begin
lvalue_of_idx := address of idx
end;
procedure rvalue_of_1_over_idx ;
begin
rvalue_of_1_over_idx := 1/idx
end;
print (sum (lvalue_of_idx, 1, 100, rvalue_of_1_over_idx))
end
The procedures
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, a thunk (also suspension, suspended computation or delayed computation) is a parameterless
Parameter (computer science)
In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments...
closure
Closure (computer science)
In computer science, a closure is a function together with a referencing environment for the non-local variables of that function. A closure allows a function to access variables outside its typical scope. Such a function is said to be "closed over" its free variables...
created to prevent the evaluation of an expression until forced at a later time. In lazy
Lazy evaluation
In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...
languages thunks are created and forced implicitly. In eager languages thunks can be created explicitly by wrapping an expression in a parameterless anonymous function
Anonymous function
In programming language theory, an anonymous function is a function defined, and possibly called, without being bound to an identifier. Anonymous functions are convenient to pass as an argument to a higher-order function and are ubiquitous in languages with first-class functions such as Haskell...
and forced by calling it in order to emulate lazy evaluation. Some functional languages require functions to take at least one parameter, requiring the anonymous function to take a dummy parameter, usually of unit type
Unit type
In the area of mathematical logic, and computer science known as type theory, a unit type is a type that allows only one value . The carrier associated with a unit type can be any singleton set. There is an isomorphism between any two such sets, so it is customary to talk about the unit type and...
, instead.
Call by name
Implementations of the call by name and call by need evaluation strategiesEvaluation strategy
In computer science, an evaluation strategy is a set of rules for evaluating expressions in a programming language. Emphasis is typically placed on functions or operators: an evaluation strategy defines when and in what order the arguments to a function are evaluated, when they are substituted...
typically use a thunk to refer to an incompletely evaluated expression. In this context, the thunk is simply a placeholder for a part of computation which, when executed, returns either the fully evaluated expression, or returns a tag, which shows that there is a part of the expression which is yet to be instantiated, in which case blind, mindless computation cannot yet be performed (because, for example, some variable in the above-mentioned algebraic formula is not yet mapped to a number — hence the formula does not yet reduce to a concrete number).
Call by need
In call-by-need, the thunk is replaced by its return value after its first complete execution. In languages with late binding (for example, HaskellHaskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...
), the "computation" performed by the thunk may include a lookup in the run-time context of the program to determine the current binding of a variable.
Thunks were invented by Peter Zilahy Ingerman in 1961. According to the inventor, the name "thunk" came about because it could be optimized by the compiler by "thinking about it", so "thunk", according to its inventor, "is the past tense of 'think' at two in the morning"
An early implementation of thunks for call-by-name was in Algol 60
ALGOL 60
ALGOL 60 is a member of the ALGOL family of computer programming languages. It gave rise to many other programming languages, including BCPL, B, Pascal, Simula, C, and many others. ALGOL 58 introduced code blocks and the begin and end pairs for delimiting them...
.
begin
integer idx;
real procedure sum (i, lo, hi, term);
value lo, hi;
integer i, lo, hi;
real term;
comment term is passed by-name, and so is i;
begin
real temp;
temp := 0;
for i := lo step 1 until hi do
temp := temp + term;
sum := temp
end;
print (sum (idx, 1, 100, 1/idx))
end
The above example (see Jensen's Device
Jensen's Device
Jensen's Device is a computer programming technique devised by Danish computer scientist Jørn Jensen, who worked with Peter Naur at Regnecentralen, particularly on the GIER Algol compiler, one of the earliest correct implementations of ALGOL 60....
) relies on the fact that the actual parameters
idx
and 1/idx
are passed "by name", so that the program is equivalent to the "inlined" versionbegin
integer idx;
real sum;
begin
real temp;
temp := 0;
for idx := 1 step 1 until 100 do
temp := temp + 1/idx;
sum := temp
end;
print (sum)
end
Notice that the expression
1/idx
is not evaluated at the point of the call to sum
; instead, it is evaluated anew each time the formal parameter term
is mentioned in the definition of sum
. A compilerCompiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
using thunks to implement call-by-name would process the original code as if it had been written using function pointers or lambdas (represented below in Algol-like pseudocode):
begin
integer idx;
real procedure sum (i_lvalue, lo, hi, term_rvalue);
value lo, hi;
integer lo, hi;
thunk i_lvalue;
thunk term_rvalue;
begin
real temp;
temp := 0;
for i_lvalue := lo step 1 until hi do
temp := temp + term_rvalue;
sum := temp
end;
procedure lvalue_of_idx ;
begin
lvalue_of_idx := address of idx
end;
procedure rvalue_of_1_over_idx ;
begin
rvalue_of_1_over_idx := 1/idx
end;
print (sum (lvalue_of_idx, 1, 100, rvalue_of_1_over_idx))
end
The procedures
lvalue_of_idx
and rvalue_of_1_over_idx
would be generated automatically by the compiler whenever a call-by-name actual parameter was encountered. These automatically generated procedures would be called thunks.