Collection (computing)
Encyclopedia
In computer science
, a collection is a grouping of some variable number of data items (possibly zero) that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. Generally, the data items will be of the same type or, in languages supporting inheritance, derived from some common ancestor type. A table (or array) is usually not considered a collection because it holds a fixed number of items, although tables/arrays commonly play a role in the implementation of collections.
Some different kinds of collections are lists, sets
, bags
(or multisets), trees
and graphs
. An enumerated type
may be either a list or a set.
. If the principal operations are the addition and removal of items at just one end, it will be called a stack
or LIFO. In both cases, items are maintained within the collection in the same order (unless they are removed and re-inserted somewhere else) and so these are special cases of the list collection. Other specialized operations on lists include sorting, where, again, the order of items is of great importance.
, the order of data items is of no consequence, but duplicate items are not permitted. Examples of operations on sets are the addition and removal of items and searching for an item in the set. Some languages support sets directly. In others, sets can be implemented by a hash table
with dummy values; only the keys are used in representing the set.
, is like a set - the order of data items is of no consequence. But in this case, duplicate items are permitted. Examples of operations on bags are the addition and removal of items and determining how many of a particular item are present in the bag. Bags can be transformed into lists by the action of sorting.
or "lookup table" acts like a dictionary, providing a "value" (like a definition) in response to a lookup on a "key" (like a word). The "value" might be a reference to a compound data structure. A hash table
is usually an efficient implementation.
. Tree collections are also used to store data that is presented in a tree-like manner, such as menu systems and files in directories on a data storage system.
, which creates a graph (or mesh) representation of a data network and figures out which associations between switching nodes need to be broken to turn it into a tree and thus prevent data going around in loops.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, a collection is a grouping of some variable number of data items (possibly zero) that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. Generally, the data items will be of the same type or, in languages supporting inheritance, derived from some common ancestor type. A table (or array) is usually not considered a collection because it holds a fixed number of items, although tables/arrays commonly play a role in the implementation of collections.
Some different kinds of collections are lists, sets
Set (computer science)
In computer science, a set is an abstract data structure that can store certain values, without any particular order, and no repeated values. It is a computer implementation of the mathematical concept of a finite set...
, bags
Multiset
In mathematics, the notion of multiset is a generalization of the notion of set in which members are allowed to appear more than once...
(or multisets), trees
Tree (data structure)
In computer science, a tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.Mathematically, it is an ordered directed tree, more specifically an arborescence: an acyclic connected graph where each node has zero or more children nodes and at...
and graphs
Graph (data structure)
In computer science, a graph is an abstract data structure that is meant to implement the graph and hypergraph concepts from mathematics.A graph data structure consists of a finite set of ordered pairs, called edges or arcs, of certain entities called nodes or vertices...
. An enumerated type
Enumerated type
In computer programming, an enumerated type is a data type consisting of a set of named values called elements, members or enumerators of the type. The enumerator names are usually identifiers that behave as constants in the language...
may be either a list or a set.
Lists
In a list, the order of data items is significant. Duplicate data items are permitted. Examples of operations on lists are searching for an item in the list and determining its location (if it is present), removing an item from the list, adding an item at a specific location, etc. If the principal operations on the list are to be the addition of items at one end and the removal of items at the other, it will generally be called a queue or FIFOFIFO
FIFO is an acronym for First In, First Out, an abstraction related to ways of organizing and manipulation of data relative to time and prioritization...
. If the principal operations are the addition and removal of items at just one end, it will be called a stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...
or LIFO. In both cases, items are maintained within the collection in the same order (unless they are removed and re-inserted somewhere else) and so these are special cases of the list collection. Other specialized operations on lists include sorting, where, again, the order of items is of great importance.
Sets
In a setSet (computer science)
In computer science, a set is an abstract data structure that can store certain values, without any particular order, and no repeated values. It is a computer implementation of the mathematical concept of a finite set...
, the order of data items is of no consequence, but duplicate items are not permitted. Examples of operations on sets are the addition and removal of items and searching for an item in the set. Some languages support sets directly. In others, sets can be implemented by a hash table
Hash table
In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...
with dummy values; only the keys are used in representing the set.
Bags
A "bag" or multisetMultiset
In mathematics, the notion of multiset is a generalization of the notion of set in which members are allowed to appear more than once...
, is like a set - the order of data items is of no consequence. But in this case, duplicate items are permitted. Examples of operations on bags are the addition and removal of items and determining how many of a particular item are present in the bag. Bags can be transformed into lists by the action of sorting.
Associative arrays
An associative arrayAssociative array
In computer science, an associative array is an abstract data type composed of a collection of pairs, such that each possible key appears at most once in the collection....
or "lookup table" acts like a dictionary, providing a "value" (like a definition) in response to a lookup on a "key" (like a word). The "value" might be a reference to a compound data structure. A hash table
Hash table
In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...
is usually an efficient implementation.
Trees
In a tree, a 'root' data item has associated with it some number of data items which in turn have associated with them some number of other items in what is frequently viewed as parent-child relationships. Every item (other than the root) has a single parent (the root has no parent) and some number of children, possibly zero. Examples of operations on trees are the addition of data items so as to maintain a specific property of the tree to perform sorting, etc. and traversals to visit data items in a specific sequence. A tree used for sorting operations is usually called a heapHeap (data structure)
In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key ≥ key. This implies that an element with the greatest key is always in the root node, and so such a heap is sometimes called a max-heap...
. Tree collections are also used to store data that is presented in a tree-like manner, such as menu systems and files in directories on a data storage system.
Graphs
In a graph, data items have associations with one or more other data items in the collection and are somewhat like trees without the concept of a root or the parent-child relationship so that all data items are peers. Examples of operations on graphs are traversals and searches which explore the associations of data items looking for some specific property. Graphs are frequently used to model real-world situations and to solve related problems. An example is the Spanning tree protocolSpanning tree protocol
The Spanning Tree Protocol is a network protocol that ensures a loop-free topology for any bridged Ethernet local area network. The basic function of STP is to prevent bridge loops and ensuing broadcast radiation...
, which creates a graph (or mesh) representation of a data network and figures out which associations between switching nodes need to be broken to turn it into a tree and thus prevent data going around in loops.
Abstract concept vs. implementation
As described here, a collection and the various kinds of collections are abstract concepts. There exists in the literature considerable confusion between the abstract concepts of computer science and their specific implementations in various languages or kinds of languages. Assertions that collections, lists, sets, trees, etc. are data structures, abstract data types or classes must be read with this in mind. Collections are first and foremost abstractions that are useful in formulating solutions to computing problems. Viewed in this light, they retain important links to underlying mathematical concepts which can be lost when the focus is on the implementation.External links
- Apache Commons Collections
- Guava
- Mango Java library
- CollectionSpy — A profiler for Java's Collections Framework.
- AS3Commons Collections Framework ActionScript3 implementation of the most common collections.