Column (data store)
Encyclopedia
A column of a distributed data store
is a NoSQL object of the lowest level in a keyspace. It is a tuple
(a key-value pair) consisting of three elements:
, distributed data stores cannot guarantee consistency, as availability
is a more important issue. Therefore, the data store or the application programmer will use the timestamp to find out which of the stored values in the backup nodes are up-to-date.
Some data stores, like Apache Cassandra 0.7, may use the more sophisticated vector clock instead of the timestamp to resolve stale information.
s, a column is a part of a relational table that can be seen in each row of the table. This is not the case in distributed data stores, where the concept of a table only vaguely exists. A column can be part of a ColumnFamily that resembles at most a relational row, but it may appear in one row and not in the others. Also, the number of columns may change from row to row, and new updates to the data store model may also modify the column number. So, all the work of keeping up with changes relies on the application programmer.
notation, three column definitions are given:
Distributed data store
A distributed data store is a blurred concept and means either a distributed database where users store their information on a number of nodes, or a network in which a user stores their information on a number of peer network nodes ....
is a NoSQL object of the lowest level in a keyspace. It is a tuple
Tuple
In mathematics and computer science, a tuple is an ordered list of elements. In set theory, an n-tuple is a sequence of n elements, where n is a positive integer. There is also one 0-tuple, an empty sequence. An n-tuple is defined inductively using the construction of an ordered pair...
(a key-value pair) consisting of three elements:
- Unique name: Used to reference the column
- Value: The content of the column. It can have different types, like
AsciiType
,LongType
,TimeUUIDType
,UTF8Type
among others. - TimestampTimestampA timestamp is a sequence of characters, denoting the date or time at which a certain event occurred. A timestamp is the time at which an event is recorded by a computer, not the time of the event itself...
: The system timestamp used to determine the valid content.
Usage
The column is used as a store for the value and has a timestamp that is used to differentiate the valid content from stale ones. According to the CAP theoremCAP theorem
In theoretical computer science the CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:...
, distributed data stores cannot guarantee consistency, as availability
Availability
In telecommunications and reliability theory, the term availability has the following meanings:* The degree to which a system, subsystem, or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time...
is a more important issue. Therefore, the data store or the application programmer will use the timestamp to find out which of the stored values in the backup nodes are up-to-date.
Some data stores, like Apache Cassandra 0.7, may use the more sophisticated vector clock instead of the timestamp to resolve stale information.
Differences to a relational database
In relational databaseRelational database
A relational database is a database that conforms to relational model theory. The software used in a relational database is called a relational database management system . Colloquial use of the term "relational database" may refer to the RDBMS software, or the relational database itself...
s, a column is a part of a relational table that can be seen in each row of the table. This is not the case in distributed data stores, where the concept of a table only vaguely exists. A column can be part of a ColumnFamily that resembles at most a relational row, but it may appear in one row and not in the others. Also, the number of columns may change from row to row, and new updates to the data store model may also modify the column number. So, all the work of keeping up with changes relies on the application programmer.
Examples
In JSON-likeJSON
JSON , or JavaScript Object Notation, is a lightweight text-based open standard designed for human-readable data interchange. It is derived from the JavaScript scripting language for representing simple data structures and associative arrays, called objects...
notation, three column definitions are given: