Timestamp-based concurrency control
Encyclopedia
In computer science
, in the field of database
s, timestamp-based concurrency control is a non-lock concurrency control
method, used in relational database
s to safely handle transactions, using timestamps.
, or any other strictly totally ordered
sequence: . Every transaction is also given an initially empty set of transactions upon which it depends, , and an initially empty set of old objects which it updated, .
Each object
in the database is given two timestamp fields which are not used other than for concurrency control: is the time at which the value of object was last used by a transaction, is the time at which the value of the object was last updated by a transaction.
For all :
To abort:
Every object in the database has a read timestamp, which is updated whenever the object's data is read, and a write timestamp, which is updated whenever the object's data is changed.
If a transaction wants to read an object,
If a transaction wants to write to an object,
.
Note that timestamp ordering in its basic form does not produce recoverable histories. Consider for example the following history with transactions and :
This could be produced by a TO scheduler, but is not recoverable, as commits even though having read from an uncomitted transaction. To make sure the it produces recoverable histories, a scheduler can keep a list of other transactions each transaction has read from, and not let a transaction commit before this list consisted of only committed transactions. To avoid cascading aborts, the scheduler could tag data written by uncommitted transactions as dirty, and never let a read operation commence on such a data item before it was untagged. To get a strict history, the scheduler should not allow any operations on dirty items.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, in the field of database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...
s, timestamp-based concurrency control is a non-lock concurrency control
Non-lock concurrency control
In Computer Science, in the field of databases, non-lock concurrency control is a concurrency control method used in relational databases without using locking....
method, used in relational database
Relational database
A relational database is a database that conforms to relational model theory. The software used in a relational database is called a relational database management system . Colloquial use of the term "relational database" may refer to the RDBMS software, or the relational database itself...
s to safely handle transactions, using timestamps.
Assumptions
- Every timestamp value is unique and accurately represents an instant in time.
- No two timestamps can be the same.
- A higher-valued timestamp occurs later in time than a lower-valued timestamp.
Generating a Timestamp
A number of different ways have been used to generate timestamp- Use the value of the system's clock at the start of a transaction as the timestamp.
- Use a thread-safe shared counter that is incremental at the start of a transaction as the timestamp.
- A combination of the above two methods.
Formal
Each transaction () is an ordered list of actions (). Before the transaction performs its first action (), it is marked with the current timestampTimestamp
A timestamp is a sequence of characters, denoting the date or time at which a certain event occurred. A timestamp is the time at which an event is recorded by a computer, not the time of the event itself...
, or any other strictly totally ordered
Total order
In set theory, a total order, linear order, simple order, or ordering is a binary relation on some set X. The relation is transitive, antisymmetric, and total...
sequence: . Every transaction is also given an initially empty set of transactions upon which it depends, , and an initially empty set of old objects which it updated, .
Each object
Object (computer science)
In computer science, an object is any entity that can be manipulated by the commands of a programming language, such as a value, variable, function, or data structure...
in the database is given two timestamp fields which are not used other than for concurrency control: is the time at which the value of object was last used by a transaction, is the time at which the value of the object was last updated by a transaction.
For all :
- For each action :
- If wishes to use the value of :
- If then abort (a more recent thread has overwritten the value),
- Otherwise update the set of dependencies and set ;
- If wishes to update the value of :
- If then abort (a more recent thread is already relying on the old value),
- If then skip (the Thomas Write Rule),
- Otherwise store the previous values, , set , and update the value of .
- If wishes to use the value of :
- While there is a transaction in that has not ended: wait
- If there is a transaction in that aborted then abort
- Otherwise: commit.
To abort:
- For each in
- If equals then restore and
Informal
Whenever a transaction starts, it is given a timestamp. This is so we can tell which order that the transactions are supposed to be applied in. So given two transactions that affect the same object, the transaction that has the earlier timestamp is meant to be applied before the other one. However, if the wrong transaction is actually presented first, it is aborted and must be restarted.Every object in the database has a read timestamp, which is updated whenever the object's data is read, and a write timestamp, which is updated whenever the object's data is changed.
If a transaction wants to read an object,
- but the transaction started before the object's write timestamp it means that something changed the object's data after the transaction started. In this case, the transaction is canceled and must be restarted.
- and the transaction started after the object's write timestamp, it means that it is safe to read the object. In this case, if the transaction timestamp is after the object's read timestamp, the read timestamp is set to the transaction timestamp.
If a transaction wants to write to an object,
- but the transaction started before the object's read timestamp it means that something has had a look at the object, and we assume it took a copy of the object's data. So we can't write to the object as that would make any copied data invalid, so the transaction is aborted and must be restarted.
- and the transaction started before the object's write timestamp it means that something has changed the object since we started our transaction. In this case we use the Thomas Write Rule and simply skip our write operation and continue as normal; the transaction does not have to be aborted or restarted
- otherwise, the transaction writes to the object, and the object's write timestamp is set to the transaction's timestamp.
Recoverability
For an explanation of the terms recoverable (RC), avoids cascading aborts (ACA) and strict (ST) see Schedule (computer science)Schedule (computer science)
In the fields of databases and transaction processing , a schedule of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations ordered by time, performed by a set of transactions that are executed together in the system...
.
Note that timestamp ordering in its basic form does not produce recoverable histories. Consider for example the following history with transactions and :
This could be produced by a TO scheduler, but is not recoverable, as commits even though having read from an uncomitted transaction. To make sure the it produces recoverable histories, a scheduler can keep a list of other transactions each transaction has read from, and not let a transaction commit before this list consisted of only committed transactions. To avoid cascading aborts, the scheduler could tag data written by uncommitted transactions as dirty, and never let a read operation commence on such a data item before it was untagged. To get a strict history, the scheduler should not allow any operations on dirty items.