Index (information technology)
In computer science, an index can be:
- an integer that identifies an array element
- a data structure that enables sublinear-time lookup (an associative array)
Array element identifier
When data objects are stored in an array, individual objects are selected by an index that is usually a non-negative scalar integer. Indices are also called subscripts. An index maps a position in the array to the object stored there.
There are three ways in which the elements of an array can be indexed:
- 0 (zero-based indexing): The first element of the array has a subscript of 0.
- 1 (one-based indexing): The first element of the array has a subscript of 1.
- n (n-based indexing): The base index of an array can be freely chosen. Programming languages that allow n-based indexing usually also allow negative index values, and other scalar data types such as enumerations or characters may be used as an array index.
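The three conventions above differ only in the offset applied to the subscript. The following Python sketch illustrates this with a hypothetical `BasedArray` class (the class and its `base` parameter are invented here for illustration and are not part of any standard library). It assumes integer subscripts only.

```python
class BasedArray:
    """Fixed-size array whose first element has a caller-chosen base index.

    Hypothetical helper for illustration; not a standard library type.
    """

    def __init__(self, values, base=0):
        self._values = list(values)
        self._base = base

    def _offset(self, index):
        # Translate the external subscript into a zero-based list position.
        offset = index - self._base
        if not 0 <= offset < len(self._values):
            raise IndexError(f"index {index} out of range")
        return offset

    def __getitem__(self, index):
        return self._values[self._offset(index)]

    def __setitem__(self, index, value):
        self._values[self._offset(index)] = value


letters = ["a", "b", "c"]
zero_based = BasedArray(letters, base=0)   # valid subscripts: 0, 1, 2
one_based = BasedArray(letters, base=1)    # valid subscripts: 1, 2, 3
n_based = BasedArray(letters, base=-1)     # valid subscripts: -1, 0, 1

# Each expression selects the first element under its own convention.
first_elements = (zero_based[0], one_based[1], n_based[-1])
```

Note that a negative subscript here is an ordinary position relative to the chosen base, which differs from Python's built-in lists, where a negative subscript counts from the end.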
Arrays can have multiple dimensions, so it is common to access an array using multiple indices. For example, a two-dimensional array A with three rows and four columns might provide access to the element at the 2nd row and 4th column by the expression A[1, 3] (when the row index is written first) or A[3, 1] (when the column index is written first), in the case of a zero-based indexing system. Thus two indices are used for a two-dimensional array, three for a three-dimensional array, and n for an n-dimensional array.
Support for fast lookup
Suppose a data store contains N data objects, and it is desired to retrieve one of them based on the value of one of the object's fields. A naive implementation would retrieve and examine each object until a match was found. A successful lookup would retrieve half the objects on average; an unsuccessful lookup would retrieve all of them on every attempt. The number of operations in the worst case is therefore O(N), or linear time. Since data stores commonly contain millions of objects and since lookup is a common operation, it is often desirable to improve on this performance.
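The naive lookup described above can be sketched in Python as follows. The record layout (dictionaries with `id` and `name` fields) and the function name `linear_lookup` are assumptions made for illustration. The function also counts how many objects it examined, to make the linear cost visible.

```python
def linear_lookup(records, field, value):
    """Return (record, examined) for the first record whose field equals value.

    examined is the number of records inspected before returning.
    """
    examined = 0
    for record in records:
        examined += 1
        if record[field] == value:
            return record, examined
    return None, examined


# Hypothetical data store of 1000 objects.
records = [{"id": i, "name": f"user{i}"} for i in range(1000)]

hit, hit_cost = linear_lookup(records, "name", "user499")    # match near the middle
miss, miss_cost = linear_lookup(records, "name", "nobody")   # no match at all
```

With this data, the successful lookup examines 500 records and the unsuccessful one examines all 1000.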
An index is any data structure that improves the performance of lookup. Many different data structures are used for this purpose, and a substantial proportion of the field of computer science is devoted to the design and analysis of index data structures. There are complex design trade-offs involving lookup performance, index size, and index update performance. Many index designs exhibit logarithmic (O(log(N))) lookup performance, and in some applications it is possible to achieve flat (O(1)) performance.
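As a minimal sketch of the two performance classes mentioned above, the Python code below builds two indexes over the same records: a hash-based index using the built-in `dict` (average O(1) lookup) and a sorted index searched with the standard `bisect` module (O(log(N)) lookup). The record layout and function names are assumptions for illustration, and the sketch assumes the indexed field is unique.

```python
from bisect import bisect_left

# Hypothetical data store; names are zero-padded so they sort predictably.
records = [{"id": i, "name": f"user{i:04d}"} for i in range(1000)]

# Hash-based index: maps each name to its position in records.
hash_index = {record["name"]: pos for pos, record in enumerate(records)}

# Sorted index: (key, position) pairs ordered by key, searched by binary search.
sorted_index = sorted((record["name"], pos) for pos, record in enumerate(records))
sorted_keys = [key for key, _ in sorted_index]


def lookup_hash(name):
    """Average O(1): one dictionary probe."""
    pos = hash_index.get(name)
    return None if pos is None else records[pos]


def lookup_sorted(name):
    """O(log N): binary search over the sorted keys."""
    i = bisect_left(sorted_keys, name)
    if i < len(sorted_keys) and sorted_keys[i] == name:
        return records[sorted_index[i][1]]
    return None
```

Both indexes illustrate the trade-offs named in the text: each one occupies memory in addition to the records themselves, and each must be updated whenever a record is added, removed, or renamed.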
Database software generally includes indexing technology to improve performance. See Index (database).
One specific and very common application is in the domain of information retrieval, where the application of a full-text index enables rapid identification of documents based on their textual content.
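One common way to build a full-text index is an inverted index, which maps each word to the documents that contain it. The Python sketch below is a simplified illustration under stated assumptions: words are split on whitespace and lowercased, with no stemming, punctuation handling, or ranking, and the function names and sample documents are invented for the example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so intersections do not modify the index.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical document collection.
documents = {
    1: "an index improves lookup performance",
    2: "a hash table is a common index",
    3: "binary search needs sorted data",
}
index = build_inverted_index(documents)
matches = search(index, "index lookup")
```

A query touches only the entries for its own words, so the cost depends on how many documents contain those words and not on the total size of the collection.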
See also
- Binary search algorithm: fast lookup for sorted lists, sometimes known as the "binary chop" method
- Comparison of programming languages (array)
- Hash table: creating an index for indexed lookup where keys are not sequential
- Indexer
- Inverted index