List (computing)
Encyclopedia
In computer science
, a list or sequence is an abstract data structure that implements an ordered collection of values
, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical
concept of a finite sequence, that is, a tuple
. Each instance of a value in the list is usually called an item, entry, or element of the list; if the same value occurs multiple times, each occurrence is considered a distinct item.
The name list is also used for several concrete data structures that can be used to implement abstract lists, especially linked list
s.
The so-called static list structures allow only inspection and enumeration of the values. A mutable or dynamic list may allow items to be inserted, replaced, or deleted during the list's existence.
Many programming language
s provide support for list data types, and have special syntax and semantics for lists and list operations. A list can often be constructed by writing the items in sequence, separated by comma
s, semicolon
s, or space
s, within a pair of delimiters such as parentheses '', brackets, '[]', braces '{}', or angle brackets '<>'. Some languages may allow list types to be indexed or sliced
like array type
s. In object-oriented programming language
s, lists are usually provided as instance
s of subclasses of a generic "list" class. List data types are often implemented using arrays or linked lists of some sort, but other data structures may be more appropriate for some applications. In some contexts, such as in Lisp
programming, the term list may refer specifically to a linked list rather than an array.
In type theory
and functional programming
, abstract lists are usually defined inductively by four operations: nil that yelds the empty list, cons, which adds an item at the beginning of a list, first, that returns the first element of a list, and rest that returns a list minus its first element. Formally, Peano's natural numbers can be defined as abstract lists with elements of unit type
.
:
s (either singly or doubly linked) or as arrays, usually variable length or dynamic array
s.
The standard way of implementing lists, originating with the programming language Lisp, is to have each element of the list contain both its value and a pointer indicating the location of the next element in the list. This results in either a linked list
or a tree, depending on whether the list has nested sublists. Some older Lisp implementations (such as the Lisp implementation of the Symbolics
3600) also supported "compressed lists" (using CDR coding
) which had a special internal representation (invisible to the user). Lists can be manipulated using iteration
or recursion
. The former is often preferred in imperative programming languages
, while the latter is the norm in functional languages.
Lists can be implemented as self-balancing binary search tree
s holding index-value pairs, providing equal-time access to any element (e.g. all residing in the fringe, and internal nodes storing the right-most child's index, used to guide the search), taking the time logarithmic in the list's size, but as long as it doesn't change much will provide the illusion of random access
and enable swap, prefix and append operations in logarithmic time as well.
, but offer the use of associative array
s or some kind of table to emulate lists. For example, Lua
provides tables. Although Lua stores lists that have numerical indices as arrays internally, they still appear as hash tables.
In Lisp
, lists are the fundamental data type and can represent both program code and data. In most dialects, the list of the first three prime numbers could be written as
Because in computing, lists are easier to realize than sets, a finite set in mathematical sense can be realized as a list with additional restrictions, that is, duplicate elements are disallowed and such that order is irrelevant. If the list is sorted, it speeds up determining if a given item is already in the set but in order to ensure the order, it requires more time to add new entry to the list. In efficient implementations, however, sets are implemented using self-balancing binary search tree
s or hash table
s, rather than a list.
list) is defined by the following functions:
with the axioms
for any element e and any list l. It is implicit that
Note that first (nil ) and rest (nil ) are not defined.
These axioms are equivalent to those of the abstract stack data type.
In type theory
, the above definition is more simply regarded as an inductive type defined in terms of constructors: nil and cons. In algebraic terms, this can be represented as the transformation 1 + E × L → L. first and rest are then obtained by pattern matching
on the cons constructor and separately handling the nil case.
where append is defined as:
Alternatively, the monad may be defined in terms of operations return, fmap and join, with:
Note that fmap, join, append and bind are well-defined, since they're applied to progressively deeper arguments at each recursive call.
The list type is an additive monad, with nil as the monadic zero and append as monadic sum.
Lists form a monoid
under the append operation. The identity element of the monoid is the empty list, nil. In fact, this is the free monoid over the set of list elements.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, a list or sequence is an abstract data structure that implements an ordered collection of values
Value (computer science)
In computer science, a value is an expression which cannot be evaluated any further . The members of a type are the values of that type. For example, the expression "1 + 2" is not a value as it can be reduced to the expression "3"...
, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical
Mathematics
Mathematics is the study of quantity, space, structure, and change. Mathematicians seek out patterns and formulate new conjectures. Mathematicians resolve the truth or falsity of conjectures by mathematical proofs, which are arguments sufficient to convince other mathematicians of their validity...
concept of a finite sequence, that is, a tuple
Tuple
In mathematics and computer science, a tuple is an ordered list of elements. In set theory, an n-tuple is a sequence of n elements, where n is a positive integer. There is also one 0-tuple, an empty sequence. An n-tuple is defined inductively using the construction of an ordered pair...
. Each instance of a value in the list is usually called an item, entry, or element of the list; if the same value occurs multiple times, each occurrence is considered a distinct item.
The name list is also used for several concrete data structures that can be used to implement abstract lists, especially linked list
Linked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...
s.
The so-called static list structures allow only inspection and enumeration of the values. A mutable or dynamic list may allow items to be inserted, replaced, or deleted during the list's existence.
Many programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s provide support for list data types, and have special syntax and semantics for lists and list operations. A list can often be constructed by writing the items in sequence, separated by comma
Comma
A comma is a type of punctuation mark . The word comes from the Greek komma , which means something cut off or a short clause.Comma may also refer to:* Comma , a type of interval in music theory...
s, semicolon
Semicolon
The semicolon is a punctuation mark with several uses. The Italian printer Aldus Manutius the Elder established the practice of using the semicolon to separate words of opposed meaning and to indicate interdependent statements. "The first printed semicolon was the work of ... Aldus Manutius"...
s, or space
Space
Space is the boundless, three-dimensional extent in which objects and events occur and have relative position and direction. Physical space is often conceived in three linear dimensions, although modern physicists usually consider it, with time, to be part of a boundless four-dimensional continuum...
s, within a pair of delimiters such as parentheses '', brackets, '[]', braces '{}', or angle brackets '<>'. Some languages may allow list types to be indexed or sliced
Array slicing
In computer programming, array slicing is an operation that extracts certain elements from an array and packages them as another array, possibly with different number of indices and different index ranges. Two common examples are extracting a substring from a string of characters In computer...
like array type
Array data type
In computer science, an array type is a data type that is meant to describe a collection of elements , each selected by one or more indices that can be computed at run time by the program. Such a collection is usually called an array variable, array value, or simply array...
s. In object-oriented programming language
Object-oriented programming language
This is a list of object-oriented programming programming languages.-Languages with object-oriented features:*ABAP*Ada 95*AmigaE*BETA*Blue*Boo*C++*C#*COBOL*Cobra*ColdFusion*Common Lisp*COOL*CorbaScript*Clarion*CLU*Curl*D*Dylan*E*Eiffel...
s, lists are usually provided as instance
Instance (computer science)
In object-oriented programming an instance is an occurrence or a copy of an object, whether currently executing or not. Instances of a class share the same set of attributes, yet will typically differ in what those attributes contain....
s of subclasses of a generic "list" class. List data types are often implemented using arrays or linked lists of some sort, but other data structures may be more appropriate for some applications. In some contexts, such as in Lisp
Lisp programming language
Lisp is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older...
programming, the term list may refer specifically to a linked list rather than an array.
In type theory
Type theory
In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general...
and functional programming
Functional programming
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
, abstract lists are usually defined inductively by four operations: nil that yelds the empty list, cons, which adds an item at the beginning of a list, first, that returns the first element of a list, and rest that returns a list minus its first element. Formally, Peano's natural numbers can be defined as abstract lists with elements of unit type
Unit type
In the area of mathematical logic, and computer science known as type theory, a unit type is a type that allows only one value . The carrier associated with a unit type can be any singleton set. There is an isomorphism between any two such sets, so it is customary to talk about the unit type and...
.
Operations
Implementation of the list data structure may provide some of the following operationsOperation (mathematics)
The general operation as explained on this page should not be confused with the more specific operators on vector spaces. For a notion in elementary mathematics, see arithmetic operation....
:
- a constructorConstructor (computer science)In object-oriented programming, a constructor in a class is a special type of subroutine called at the creation of an object. It prepares the new object for use, often accepting parameters which the constructor uses to set any member variables required when the object is first created...
for creating an empty list; - an operation for testing whether or not a list is empty;
- an operation for prepending an entity to a list
- an operation for appending an entity to a list
- an operation for determining the first component (or the "head") of a list
- an operation for referring to the list consisting of all the components of a list except for its first (this is called the "tail" of the list.)
Characteristics
Lists have the following properties:- The size of lists. It indicates how many elements there are in the list.
- Equality of lists:
- In mathematics, sometimes equality of lists is defined simply in terms of object identityIdentity (object-oriented programming)An identity in object-oriented programming, object-oriented design and object-oriented analysis describes the property of objects that distinguishes them from other objects. This is closely related to the philosophical concept of identity....
: two lists are equal if and only if they are the same object. - In modern programming languageProgramming languageA programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s, equality of lists is normally defined in terms of structural equality of the corresponding entries, except that if the lists are typed, then the list types may also be relevant.
- In mathematics, sometimes equality of lists is defined simply in terms of object identity
- Lists may be typed. This implies that the entries in a list must have types that are compatible with the list's type. It is common that lists are typed when they are implemented using arrays.
- Each element in the list has an index. The first element commonly has index 0 or 1 (or some other predefined integer). Subsequent elements have indices that are 1 higher than the previous element. The last element has index
+ − 1. - It is possible to retrieve the element at a particular index.
- It is possible to traverse the list in the order of increasing index.
- It is possible to change the element at a particular index to a different value, without affecting any other elements.
- It is possible to insert an element at a particular index. The indices of higher elements at that are increased by 1.
- It is possible to remove an element at a particular index. The indices of higher elements at that are decreased by 1.
Implementations
Lists are typically implemented either as linked listLinked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...
s (either singly or doubly linked) or as arrays, usually variable length or dynamic array
Dynamic array
In computer science, a dynamic array, growable array, resizable array, dynamic table, or array list is a random access, variable-size list data structure that allows elements to be added or removed...
s.
The standard way of implementing lists, originating with the programming language Lisp, is to have each element of the list contain both its value and a pointer indicating the location of the next element in the list. This results in either a linked list
Linked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...
or a tree, depending on whether the list has nested sublists. Some older Lisp implementations (such as the Lisp implementation of the Symbolics
Symbolics
Symbolics refers to two companies: now-defunct computer manufacturer Symbolics, Inc., and a privately held company that acquired the assets of the former company and continues to sell and maintain the Open Genera Lisp system and the Macsyma computer algebra system.The symbolics.com domain was...
3600) also supported "compressed lists" (using CDR coding
CDR coding
In computer science CDR coding is a compressed data representation for Lisp linked lists. It was developed and patented by the MIT Artificial Intelligence Laboratory, and implemented in computer hardware in a number of Lisp machines derived from the MIT CADR....
) which had a special internal representation (invisible to the user). Lists can be manipulated using iteration
Iteration
Iteration means the act of repeating a process usually with the aim of approaching a desired goal or target or result. Each repetition of the process is also called an "iteration," and the results of one iteration are used as the starting point for the next iteration.-Mathematics:Iteration in...
or recursion
Recursion
Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
. The former is often preferred in imperative programming languages
Imperative programming
In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state...
, while the latter is the norm in functional languages.
Lists can be implemented as self-balancing binary search tree
Self-balancing binary search tree
In computer science, a self-balancing binary search tree is any node based binary search tree that automatically keeps its height small in the face of arbitrary item insertions and deletions....
s holding index-value pairs, providing equal-time access to any element (e.g. all residing in the fringe, and internal nodes storing the right-most child's index, used to guide the search), taking the time logarithmic in the list's size, but as long as it doesn't change much will provide the illusion of random access
Random access
In computer science, random access is the ability to access an element at an arbitrary position in a sequence in equal time, independent of sequence size. The position is arbitrary in the sense that it is unpredictable, thus the use of the term "random" in "random access"...
and enable swap, prefix and append operations in logarithmic time as well.
Programming language support
Some languages do not offer a list data structureData structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...
, but offer the use of associative array
Associative array
In computer science, an associative array is an abstract data type composed of a collection of pairs, such that each possible key appears at most once in the collection....
s or some kind of table to emulate lists. For example, Lua
Lua programming language
Lua is a lightweight multi-paradigm programming language designed as a scripting language with extensible semantics as a primary goal. Lua has a relatively simple C API compared to other scripting languages.- History :...
provides tables. Although Lua stores lists that have numerical indices as arrays internally, they still appear as hash tables.
In Lisp
Lisp programming language
Lisp is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older...
, lists are the fundamental data type and can represent both program code and data. In most dialects, the list of the first three prime numbers could be written as
(list 2 3 5)
. In several dialects of Lisp, including Scheme, a list is a collection of pairs, consisting of a value and a pointer to the next pair (or null value), making a singly linked list.Applications
As the name implies, lists can be used to store a list of records. The items in a list can be sorted for the purpose of fast search (binary search).Because in computing, lists are easier to realize than sets, a finite set in mathematical sense can be realized as a list with additional restrictions, that is, duplicate elements are disallowed and such that order is irrelevant. If the list is sorted, it speeds up determining if a given item is already in the set but in order to ensure the order, it requires more time to add new entry to the list. In efficient implementations, however, sets are implemented using self-balancing binary search tree
Self-balancing binary search tree
In computer science, a self-balancing binary search tree is any node based binary search tree that automatically keeps its height small in the face of arbitrary item insertions and deletions....
s or hash table
Hash table
In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...
s, rather than a list.
Abstract definition
The abstract list type L with elements of some type E (a monomorphicType polymorphism
In computer science, polymorphism is a programming language feature that allows values of different data types to be handled using a uniform interface. The concept of parametric polymorphism applies to both data types and functions...
list) is defined by the following functions:
- nil: → L
- cons: E × L → L
- first: L → E
- rest: L → L
with the axioms
- first (cons (e, l)) = e
- rest (cons (e, l)) = l
for any element e and any list l. It is implicit that
- cons (e, l) ≠ l
- cons (e, l) ≠ e
- cons (e1, l1) = cons (e2, l2) if e1 = e2 and l1 = l2
Note that first (nil ) and rest (nil ) are not defined.
These axioms are equivalent to those of the abstract stack data type.
In type theory
Type theory
In mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general...
, the above definition is more simply regarded as an inductive type defined in terms of constructors: nil and cons. In algebraic terms, this can be represented as the transformation 1 + E × L → L. first and rest are then obtained by pattern matching
Pattern matching
In computer science, pattern matching is the act of checking some sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact. The patterns generally have the form of either sequences or tree structures...
on the cons constructor and separately handling the nil case.
The list monad
The list type forms a monad with the following functions (using E* rather than L to represent monomorphic lists with elements of type E):where append is defined as:
Alternatively, the monad may be defined in terms of operations return, fmap and join, with:
Note that fmap, join, append and bind are well-defined, since they're applied to progressively deeper arguments at each recursive call.
The list type is an additive monad, with nil as the monadic zero and append as monadic sum.
Lists form a monoid
Monoid
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element. Monoids are studied in semigroup theory as they are naturally semigroups with identity. Monoids occur in several branches of mathematics; for...
under the append operation. The identity element of the monoid is the empty list, nil. In fact, this is the free monoid over the set of list elements.
See also
- Set (computer science)Set (computer science)In computer science, a set is an abstract data structure that can store certain values, without any particular order, and no repeated values. It is a computer implementation of the mathematical concept of a finite set...
- Queue (data structure)