Scope (programming)
Encyclopedia
In computer programming
, scope is an enclosing context where values and expressions are associated. Various programming language
s have various types of scopes. The type of scope determines what kind of entities it can contain and how it affects them—or semantics
. Typically, scope is used to define the extent of information hiding
—that is, the visibility or accessibility of variables from different parts of the program. Scopes can:
A namespace is a scope that uses the enclosing nature of the scope to group logically related identifiers under a single identifier. Thus, scopes can affect the name resolution
for their contents.
Variables are associated with scopes. Different scoping rules affect how local variables are bound
. This has different consequences depending on whether the language has static (lexical) or dynamic scoping.
and has been picked up in most other languages since then. Static (lexical) scoping was also introduced in LISP 1.5 (via the Funarg device developed by Steve Russell
, working under John McCarthy
). The original Lisp interpreter (1960) and most early Lisps used dynamic scoping, but descendants of dynamically scoped languages often adopt static scoping; Common Lisp
has both dynamic and static scoping while Scheme uses static scoping exclusively. Perl
is another language with dynamic scoping that added static scoping afterwards. Languages like Pascal
and C
have always had lexical scoping, since they are both influenced by the ideas that went into ALGOL 60
(although C did not include lexically nested function
s).
sites is generally answered in one of two ways: lexical scoping and dynamic scoping.
by the language implementation. Because this matching only requires analysis of the static program text, this type of scoping is also called static scoping. Lexical scoping is standard in all ALGOL
-based languages such as Pascal
, Modula2 and Ada
as well as in modern functional languages such as ML and Haskell
. It is also used in the C language
and its syntactic and semantic relatives, although with different kinds of limitations. Static scoping allows the programmer to reason about object references such as parameters, variables, constants, types, functions, etc. as simple name substitutions. This makes it much easier to make modular code and reason about it, since the local naming structure can be understood in isolation. In contrast, dynamic scope forces the programmer to anticipate all possible dynamic contexts in which the module's code may be invoked.
For example, consider the following program fragment in Pascal:
The variable
There could have been another procedure
Correct implementation of static scope in languages with first-class
nested function
s is not trivial, as it requires each function value to carry with it a record of the values of the variables that it depends on (the pair of the function and this environment is called a closure
). Depending on implementation and computer architecture
, variable lookup
may become slightly inefficient when very deeply lexically nested
functions are used, although there are well known techniques to mitigate this. Also, for nested functions that only refer to their own arguments and (immediately) local variables, all relative locations can be known at compile time
. No overhead at all is therefore incurred when using that type of nested function. The same applies to particular parts of a program where nested functions are not used, and, naturally, to programs written in a language where nested functions are not available (such as in the C language).
of bindings. Introducing a local variable with name
leaves the scope. Evaluating
Generally, certain blocks are defined to create bindings whose lifetime is the execution time of the block; this adds some features of static scoping to the dynamic scoping process. However, since a section of code can be called from many different locations and situations, it can be difficult to determine at the outset what bindings will apply when a variable is used (or if one exists at all). This can be beneficial; application of the principle of least knowledge suggests that code avoid depending on the reasons for (or circumstances of) a variable's value, but simply use the value according to the variable's definition. This narrow interpretation of shared data can provide a very flexible system for adapting the behavior of a function to the current state (or policy) of the system. However, this benefit relies on careful documentation of all variables used this way as well as on careful avoidance of assumptions about a variable's behavior, and does not provide any memmer to choose static or dynamic scoping when defining or redefining a variable. Logo
and Emacs lisp
are examples of languages that use dynamic scoping.
Dynamic scoping is fairly easy to implement. To find an identifier's value, the program could traverse the runtime stack, checking each activation record (each function's stack frame) for a value for the identifier. In practice, this is made more efficient via the use of an association list
, which is a stack of name/value pairs. Pairs are pushed onto this stack whenever declarations are made, and popped whenever variables go out of scope. An alternate strategy that is considerably faster is to make use of a central reference table, which associates each name with its current meaning. This avoids a linear search during run-time to find a particular name, but maintaining this table is more complex. Note that both of these strategies assume a last-in-first-out (LIFO) ordering to bindings for any one variable; in practice all bindings are so ordered.
An even simpler implementation is the representation of dynamic variables with simple global variables. The local binding is performed by saving the original value in an anonymous location on the stack that is invisible to the program. When that binding scope terminates, the original value is restored from this location. In fact, dynamic scope originated in this manner. Early implementations of Lisp used this obvious strategy for implementing local variables, and the practice survives in some dialects which are still in use, such as GNU Emacs Lisp. Lexical scope was introduced into Lisp later. This is equivalent to the above central reference table algorithm, except that the central reference table is simply the global variable binding environment, in which the current meaning of the variable is its global value. Maintaining global variables isn't complex. For instance, a symbol object can have a dedicated slot for its global value.
Dynamic scoping provides an excellent abstraction for thread local storage, but if it is used that way it cannot be based on saving and restoring a global variable. A possible implementation strategy is for each variable to have a thread-local key. When the variable is accessed, the thread-local key is used to access the thread-local memory location (by code generated by the compiler, which knows which variables are dynamic and which are lexical). If the thread-local key does not exist for the calling thread, then the global location is used. When a variable is locally bound, the prior value is stored in a hidden location on the stack. The thread-local storage is created under the variable's key, and the new value is stored there. Further nested overrides of the variable within that thread simply save and restore this thread-local location. When the initial, outer-most override's scope terminates, the thread-local key is deleted, exposing the global version of the variable once again to that thread.
With static (lexical) scoping, calling g will return 0 since it has been determined at compile time that the expression x in any invocation of f will yield the global x binding which is unaffected by the introduction of a local variable of the same name in g.
With dynamic scoping, the binding stack for the x identifier will contain two items when f is invoked from g: the global binding to 0, and the binding to 1 introduced in g (which is still present on the stack since the control flow hasn't left g yet). Since evaluating the identifier expression by definition always yields the top binding, the result is 1.
In the language Perl
, variables can be defined with either static or dynamic scoping. Perl's keyword "
The example above uses "
In this alternative, "
In other words, the dynamically-scoped variable $x is resolved in the environment of execution, rather than the environment of definition.
Computer programming
Computer programming is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages. The purpose of programming is to create a program that performs specific operations or exhibits a...
, scope is an enclosing context where values and expressions are associated. Various programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s have various types of scopes. The type of scope determines what kind of entities it can contain and how it affects them—or semantics
Formal semantics of programming languages
In programming language theory, semantics is the field concerned with the rigorous mathematical study of the meaning of programming languages and models of computation...
. Typically, scope is used to define the extent of information hiding
Information hiding
In computer science, information hiding is the principle of segregation of the design decisions in a computer program that are most likely to change, thus protecting other parts of the program from extensive modification if the design decision is changed...
—that is, the visibility or accessibility of variables from different parts of the program. Scopes can:
- contain declarationsDeclaration (computer science)In programming languages, a declaration specifies the identifier, type, and other aspects of language elements such as variables and functions. It is used to announce the existence of the element to the compiler; this is important in many strongly-typed languages that require variables and their...
or definitions of identifierIdentifierAn identifier is a name that identifies either a unique object or a unique class of objects, where the "object" or class may be an idea, physical [countable] object , or physical [noncountable] substance...
s; - contain statementStatement (programming)In computer programming a statement can be thought of as the smallest standalone element of an imperative programming language. A program written in such a language is formed by a sequence of one or more statements. A statement will have internal components .Many languages In computer programming...
s and/or expressionExpression (programming)An expression in a programming language is a combination of explicit values, constants, variables, operators, and functions that are interpreted according to the particular rules of precedence and of association for a particular programming language, which computes and then produces another value...
s which define an executable algorithmAlgorithmIn mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
or part thereof; - nest or be nested.
A namespace is a scope that uses the enclosing nature of the scope to group logically related identifiers under a single identifier. Thus, scopes can affect the name resolution
Name resolution
-In computer languages:Expressions in computer languages can contain identifiers. The semantics of such expressions depend on the entities that the identifiers refer to. The algorithm that determines what an identifier in a given context refers to is part of the language definition.The complexity...
for their contents.
Variables are associated with scopes. Different scoping rules affect how local variables are bound
Name binding
In programming languages, name binding is the association of objects with identifiers. An identifier bound to an object is said to reference that object. Machine languages have no built-in notion of identifiers, but name-object bindings as a service and notation for the programmer is implemented...
. This has different consequences depending on whether the language has static (lexical) or dynamic scoping.
History
Lexical scoping was used for ALGOLALGOL
ALGOL is a family of imperative computer programming languages originally developed in the mid 1950s which greatly influenced many other languages and became the de facto way algorithms were described in textbooks and academic works for almost the next 30 years...
and has been picked up in most other languages since then. Static (lexical) scoping was also introduced in LISP 1.5 (via the Funarg device developed by Steve Russell
Steve Russell
Steve "Slug" Russell is a programmer and computer scientist most famous for creating Spacewar!, one of the earliest videogames, in 1961 with the fellow members of the Tech Model Railroad Club at MIT working on a DEC Digital PDP-1...
, working under John McCarthy
John McCarthy (computer scientist)
John McCarthy was an American computer scientist and cognitive scientist. He coined the term "artificial intelligence" , invented the Lisp programming language and was highly influential in the early development of AI.McCarthy also influenced other areas of computing such as time sharing systems...
). The original Lisp interpreter (1960) and most early Lisps used dynamic scoping, but descendants of dynamically scoped languages often adopt static scoping; Common Lisp
Common Lisp
Common Lisp, commonly abbreviated CL, is a dialect of the Lisp programming language, published in ANSI standard document ANSI INCITS 226-1994 , . From the ANSI Common Lisp standard the Common Lisp HyperSpec has been derived for use with web browsers...
has both dynamic and static scoping while Scheme uses static scoping exclusively. Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...
is another language with dynamic scoping that added static scoping afterwards. Languages like Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...
and C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
have always had lexical scoping, since they are both influenced by the ideas that went into ALGOL 60
ALGOL 60
ALGOL 60 is a member of the ALGOL family of computer programming languages. It gave rise to many other programming languages, including BCPL, B, Pascal, Simula, C, and many others. ALGOL 58 introduced code blocks and the begin and end pairs for delimiting them...
(although C did not include lexically nested function
Nested function
In computer programming, a nested function is a function which is lexically encapsulated within another function. It can only be called by the enclosing function or by functions directly or indirectly nested within the same enclosing function. In other words, the scope of the nested function is...
s).
Example
The following example shows various scopes declared in the language C#:Lexical versus dynamic scoping
One of the basic reasons for scoping is to keep variables in different parts of the program distinct from one another. Since there are only a small number of short variable names, and programmers share habits about the naming of variables (e.g.,i
for an array index), in any program of moderate size the same variable name will be used in multiple different scopes. The question of how to match various variable occurrences to the appropriate bindingName binding
In programming languages, name binding is the association of objects with identifiers. An identifier bound to an object is said to reference that object. Machine languages have no built-in notion of identifiers, but name-object bindings as a service and notation for the programmer is implemented...
sites is generally answered in one of two ways: lexical scoping and dynamic scoping.
Lexical scoping
With lexical scope, a name always refers to its (more or less) local lexical environment. This is a property of the program text and is made independent of the runtime call stackCall stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
by the language implementation. Because this matching only requires analysis of the static program text, this type of scoping is also called static scoping. Lexical scoping is standard in all ALGOL
ALGOL
ALGOL is a family of imperative computer programming languages originally developed in the mid 1950s which greatly influenced many other languages and became the de facto way algorithms were described in textbooks and academic works for almost the next 30 years...
-based languages such as Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...
, Modula2 and Ada
Ada (programming language)
Ada is a structured, statically typed, imperative, wide-spectrum, and object-oriented high-level computer programming language, extended from Pascal and other languages...
as well as in modern functional languages such as ML and Haskell
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...
. It is also used in the C language
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
and its syntactic and semantic relatives, although with different kinds of limitations. Static scoping allows the programmer to reason about object references such as parameters, variables, constants, types, functions, etc. as simple name substitutions. This makes it much easier to make modular code and reason about it, since the local naming structure can be understood in isolation. In contrast, dynamic scope forces the programmer to anticipate all possible dynamic contexts in which the module's code may be invoked.
For example, consider the following program fragment in Pascal:
The variable
I
is visible at all points, because it is never hidden by another variable of the same name. The char
variable K
is visible only in the main program because it is hidden by the real
variable K
visible in procedure B
and C
only. Variable L
is also visible only in procedure B
and C
but it does not hide any other variable. Variable M
is only visible in procedure C
and therefore not accessible either from procedure B
or the main program. Also, procedure C
is visible only in procedure B
and can therefore not be called from the main program.There could have been another procedure
C
declared in the program outside of procedure B
. The place in the program where "C
" is mentioned then determines which of the two procedures named C
it represents, thus precisely analogous with the scope of variables.Correct implementation of static scope in languages with first-class
First-class function
In computer science, a programming language is said to have first-class functions if it treats functions as first-class objects. Specifically, this means that the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning...
nested function
Nested function
In computer programming, a nested function is a function which is lexically encapsulated within another function. It can only be called by the enclosing function or by functions directly or indirectly nested within the same enclosing function. In other words, the scope of the nested function is...
s is not trivial, as it requires each function value to carry with it a record of the values of the variables that it depends on (the pair of the function and this environment is called a closure
Closure (computer science)
In computer science, a closure is a function together with a referencing environment for the non-local variables of that function. A closure allows a function to access variables outside its typical scope. Such a function is said to be "closed over" its free variables...
). Depending on implementation and computer architecture
Computer architecture
In computer science and engineering, computer architecture is the practical art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals and the formal modelling of those systems....
, variable lookup
Lookup
In computing, lookup usually refers to searching a data structure for an item that satisfies some specified property. For example, variable lookup performed by a language interpreter, virtual machine or other similar engine usually consists of performing certain...
may become slightly inefficient when very deeply lexically nested
Nesting (computing)
In computing science and informatics, the word nesting may denote several different constructions and activities where information is organized in layers or objects contain other similar objects. The rather general term is thus used in quite specific ways for various situations and concepts...
functions are used, although there are well known techniques to mitigate this. Also, for nested functions that only refer to their own arguments and (immediately) local variables, all relative locations can be known at compile time
Compile time
In computer science, compile time refers to either the operations performed by a compiler , programming language requirements that must be met by source code for it to be successfully compiled , or properties of the program that can be reasoned about at compile time.The operations performed at...
. No overhead at all is therefore incurred when using that type of nested function. The same applies to particular parts of a program where nested functions are not used, and, naturally, to programs written in a language where nested functions are not available (such as in the C language).
Dynamic scoping
With dynamic scope, each identifier has a global stackStack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...
of bindings. Introducing a local variable with name
x
pushes a binding onto the global x
stack (which may have been empty), which is popped off when the control flowControl flow
In computer science, control flow refers to the order in which the individual statements, instructions, or function calls of an imperative or a declarative program are executed or evaluated....
leaves the scope. Evaluating
x
in any context always yields the top binding. In other words, a global identifier refers to the identifier associated with the most recent environment. Note that this cannot be done at compile-time because the binding stack only exists at run-time, which is why this type of scoping is called dynamic scoping.Generally, certain blocks are defined to create bindings whose lifetime is the execution time of the block; this adds some features of static scoping to the dynamic scoping process. However, since a section of code can be called from many different locations and situations, it can be difficult to determine at the outset what bindings will apply when a variable is used (or if one exists at all). This can be beneficial; application of the principle of least knowledge suggests that code avoid depending on the reasons for (or circumstances of) a variable's value, but simply use the value according to the variable's definition. This narrow interpretation of shared data can provide a very flexible system for adapting the behavior of a function to the current state (or policy) of the system. However, this benefit relies on careful documentation of all variables used this way as well as on careful avoidance of assumptions about a variable's behavior, and does not provide any memmer to choose static or dynamic scoping when defining or redefining a variable. Logo
Logo (programming language)
Logo is a multi-paradigm computer programming language used in education. It is an adaptation and dialect of the Lisp language; some have called it Lisp without the parentheses. It was originally conceived and written as functional programming language, and drove a mechanical turtle as an output...
and Emacs lisp
Emacs Lisp
Emacs Lisp is a dialect of the Lisp programming language used by the GNU Emacs and XEmacs text editors . It is used for implementing most of the editing functionality built into Emacs, the remainder being written in C...
are examples of languages that use dynamic scoping.
Dynamic scoping is fairly easy to implement. To find an identifier's value, the program could traverse the runtime stack, checking each activation record (each function's stack frame) for a value for the identifier. In practice, this is made more efficient via the use of an association list
Association list
In computer programming and particularly in Lisp, an association list, often referred to as an alist, is a linked list in which each list element comprises a key and a value. The association list is said to associate the value with the key...
, which is a stack of name/value pairs. Pairs are pushed onto this stack whenever declarations are made, and popped whenever variables go out of scope. An alternate strategy that is considerably faster is to make use of a central reference table, which associates each name with its current meaning. This avoids a linear search during run-time to find a particular name, but maintaining this table is more complex. Note that both of these strategies assume a last-in-first-out (LIFO) ordering to bindings for any one variable; in practice all bindings are so ordered.
An even simpler implementation is the representation of dynamic variables with simple global variables. The local binding is performed by saving the original value in an anonymous location on the stack that is invisible to the program. When that binding scope terminates, the original value is restored from this location. In fact, dynamic scope originated in this manner. Early implementations of Lisp used this obvious strategy for implementing local variables, and the practice survives in some dialects which are still in use, such as GNU Emacs Lisp. Lexical scope was introduced into Lisp later. This is equivalent to the above central reference table algorithm, except that the central reference table is simply the global variable binding environment, in which the current meaning of the variable is its global value. Maintaining global variables isn't complex. For instance, a symbol object can have a dedicated slot for its global value.
Dynamic scoping provides an excellent abstraction for thread local storage, but if it is used that way it cannot be based on saving and restoring a global variable. A possible implementation strategy is for each variable to have a thread-local key. When the variable is accessed, the thread-local key is used to access the thread-local memory location (by code generated by the compiler, which knows which variables are dynamic and which are lexical). If the thread-local key does not exist for the calling thread, then the global location is used. When a variable is locally bound, the prior value is stored in a hidden location on the stack. The thread-local storage is created under the variable's key, and the new value is stored there. Further nested overrides of the variable within that thread simply save and restore this thread-local location. When the initial, outer-most override's scope terminates, the thread-local key is deleted, exposing the global version of the variable once again to that thread.
Example
This example compares the consequences of using static scope and dynamic scope. Observe the following code, in a C-like language:With static (lexical) scoping, calling g will return 0 since it has been determined at compile time that the expression x in any invocation of f will yield the global x binding which is unaffected by the introduction of a local variable of the same name in g.
With dynamic scoping, the binding stack for the x identifier will contain two items when f is invoked from g: the global binding to 0, and the binding to 1 introduced in g (which is still present on the stack since the control flow hasn't left g yet). Since evaluating the identifier expression by definition always yields the top binding, the result is 1.
In the language Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...
, variables can be defined with either static or dynamic scoping. Perl's keyword "
my
" defines a statically scoped local variable, while the keyword "local
" defines a dynamically scoped local variable. This allows for further clarification with practical examples of each scoping model.The example above uses "
my
" for static scoping of g's local variable $x. As above, calling g returns 0 because f cannot see g's variable $x, so it looks for the global $x.In this alternative, "
local
" is used to make g's $x dynamically-scoped. Now, calling g yields 1 because f sees g's local variable by looking down the execution stack.In other words, the dynamically-scoped variable $x is resolved in the environment of execution, rather than the environment of definition.
See also
- Closure (computer science)Closure (computer science)In computer science, a closure is a function together with a referencing environment for the non-local variables of that function. A closure allows a function to access variables outside its typical scope. Such a function is said to be "closed over" its free variables...
- Global variableGlobal variableIn computer programming, a global variable is a variable that is accessible in every scope . Interaction mechanisms with global variables are called global environment mechanisms...
- Local variableLocal variableIn computer science, a local variable is a variable that is given local scope. Such a variable is accessible only from the function or block in which it is declared. In programming languages with only two levels of visibility, local variables are contrasted with global variables...
- Non-local variableNon-local variableIn programming language theory, a non-local variable is a variable that is not defined in the local scope. While the term can refer to global variables, it is primarily used in the context of nested and anonymous functions where some variables can be neither in the local nor the global scope.-...
- Name bindingName bindingIn programming languages, name binding is the association of objects with identifiers. An identifier bound to an object is said to reference that object. Machine languages have no built-in notion of identifiers, but name-object bindings as a service and notation for the programmer is implemented...
- Name resolutionName resolution-In computer languages:Expressions in computer languages can contain identifiers. The semantics of such expressions depend on the entities that the identifiers refer to. The algorithm that determines what an identifier in a given context refers to is part of the language definition.The complexity...
- Variables (scope and extent)
- Information hidingInformation hidingIn computer science, information hiding is the principle of segregation of the design decisions in a computer program that are most likely to change, thus protecting other parts of the program from extensive modification if the design decision is changed...
Further reading
- Harold Abelson and Gerald Jay SussmanGerald Jay SussmanGerald Jay Sussman is the Panasonic Professor of Electrical Engineering at the Massachusetts Institute of Technology . He received his S.B. and Ph.D. degrees in mathematics from MIT in 1968 and 1973 respectively. He has been involved in artificial intelligence research at MIT since 1964...
. "Lexical addressing". Structure and Interpretation of Computer ProgramsStructure and Interpretation of Computer ProgramsStructure and Interpretation of Computer Programs is a textbook published in 1984 about general computer programming concepts from MIT Press written by Massachusetts Institute of Technology professors Harold Abelson and Gerald Jay Sussman, with Julie Sussman...
. - Scott, M. (2006). Programming Language Pragmatics, 2nd edition. Morgan Kauffman Publishers, San Francisco, CA.