Two-phase locking

In database

Database

A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

s and transaction processing

Transaction processing

In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state...

two-phase locking, (2PL) is a concurrency control

Concurrency control

In information technology and computer science, especially in the fields of computer programming , operating systems , multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.Computer...

method that guarantees serializability

Serializability

In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

.
It is also the name of the resulting set of database transaction

Database transaction

A transaction comprises a unit of work performed within a database management system against a database, and treated in a coherent and reliable way independent of other transactions...

schedule

Schedule (computer science)

In the fields of databases and transaction processing , a schedule of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations ordered by time, performed by a set of transactions that are executed together in the system...

s (histories). The protocol utilizes locks

Lock (computer science)

In computer science, a lock is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. Locks are one way of enforcing concurrency control policies.-Types:...

that block (interpreted as signals to stop) other transactions from accessing the same data during a transaction's life.

By the 2PL protocol locks are applied and removed in two phases:

Expanding phase: locks are acquired and no locks are released.
Shrinking phase: locks are released and no locks are acquired.

Two types of locks are utilized by the basic protocol: Shared and Exclusive locks. Refinements of the basic protocol may utilize more lock types. Using locks that block processes, 2PL may be subject to deadlock

Deadlock

A deadlock is a situation where in two or more competing actions are each waiting for the other to finish, and thus neither ever does. It is often seen in a paradox like the "chicken or the egg"...

s that result from the mutual blocking of two or more transactions.

2PL is a superset

SuperSet

SuperSet Software was a group founded by friends and former Eyring Research Institute co-workers Drew Major, Dale Neibaur, Kyle Powell and later joined by Mark Hurst...

of strong strict two-phase locking (SS2PL), also called rigorousness, which has been widely utilized for concurrency control in general-purpose database systems since the 1970s. SS2PL implementations have many variants. SS2PL was called strict 2PL but this name usage is not recommended now. Now strict 2PL (S2PL) is the intersection of strictness and 2PL, which is different from SS2PL. SS2PL is also a special case

Special case

In logic, especially as applied in mathematics, concept A is a special case or specialization of concept B precisely if every instance of A is also an instance of B, or equivalently, B is a generalization of A. For example, all circles are ellipses ; therefore the circle is a special case of the...

of commitment ordering

Commitment ordering

In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

, and inherits many of CO's useful properties. SS2PL actually comprises only one phase: phase-2 does not exist, and all locks are released only after transaction end. Thus this useful 2PL type is not two-phased at all.

Neither 2PL nor S2PL in their general forms are known to be used in practice. Thus 2PL by itself does not seem to have much practical importance, and whenever 2PL or S2PL utilization has been mentioned in the literature, the intention has been SS2PL. What has made SS2PL so popular (probably the most utilized serializability

Serializability

mechanism) is the effective and efficient locking-based combination of two ingredients (the first does not exist in both general 2PL and S2PL; the second does not exist in general 2PL):

Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

, which provides both serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

, and effective distributed serializability and global serializability
Global serializability
In concurrency control of databases, transaction processing , and other transactional distributed applications, Global serializability is a property of a global schedule of transactions...

(without the need in the not scalable
Scalability
In electronics scalability is the ability of a system, network, or process, to handle growing amount of work in a graceful manner or its ability to be enlarged to accommodate that growth...

distributed lock manager
Distributed lock manager
A distributed lock manager provides distributed software applications with a means to synchronize their accesses to shared resources....

, and while distributed deadlocks are resolved automatically), and
Strictness, which provides cascadelessness (ACA, cascade-less recoverability) and (independently) allows efficient database recovery
Data recovery
Data recovery is the process of salvaging data from damaged, failed, corrupted, or inaccessible secondary storage media when it cannot be accessed normally. Often the data are being salvaged from storage media such as internal or external hard disk drives, solid-state drives , USB flash drive,...

from failure.

Additionally SS2PL is easier, with less overhead to implement than both 2PL and S2PL, provides exactly same locking, but sometimes releases locks later. However, practically (though not simplistically theoretically) such later lock release occurs only slightly later, and this apparent disadvantage is insignificant and disappears next to the advantages of SS2PL.

Thus, the importance of the general Two-phase locking (2PL) is historic only, while Strong strict two-phase locking (SS2PL) is practically the important mechanism and resulting schedule property.

Data-access locks

A lock

Lock (computer science)

is a system object associated with a shared resource such as a data item of an elementary type, a row in a database, or a page of memory. In a database, a lock on a database object (a data-access lock) may need to be acquired by a transaction before accessing the object. Correct use of locks prevents undesired, incorrect or inconsistent operations on shared resources by other concurrent transactions. When a database object with an existing lock acquired by one transaction needs to be accessed by another transaction, the existing lock for the object and the type of the intended access are checked by the system. If the existing lock type does not allow this specific attempted concurrent access type, the transaction attempting access is blocked (according to a predefined agreement/scheme). In practice a lock on an object does not directly block a transaction's operation upon the object, but rather blocks that transaction from acquiring another lock on the same object, needed to be held/owned by the transaction before performing this operation. Thus, with a locking mechanism, needed operation blocking is controlled by a proper lock blocking scheme, which indicates which lock type blocks which lock type.

Two major types of locks are utilized:

Write-lock (exclusive lock) is associated with a database object by a transaction (Terminology: "the transaction locks the object," or "acquires lock for it") before writing (inserting/modifying/deleting) this object.
Read-lock (shared lock) is associated with a database object by a transaction before reading (retrieving the state of) this object.

The common interactions between these lock types are defined by blocking behavior as follows:

An existing write-lock on a database object blocks an intended write upon the same object (already requested/issued) by another transaction by blocking a respective write-lock from being acquired by the other transaction. The second write-lock will be acquired and the requested write of the object will take place (materialize) after the existing write-lock is released.
A write-lock blocks an intended (already requested/issued) read by another transaction by blocking the respective read-lock .
A read-lock blocks an intended write by another transaction by blocking the respective write-lock .
A read-lock does not block an intended read by another transaction. The respective read-lock for the intended read is acquired (shared with the previous read) immediately after the intended read is requested, and then the intended read itself takes place.

Several variations and refinements of these major lock types exist, with respective variations of blocking behavior. If a first lock blocks another lock, the two locks are called incompatible; otherwise the locks are compatible. Often lock types blocking interactions are presented in the technical literature by a Lock compatibility table. The following is an example with the common, major lock types:

Lock compatibility table
Lock type	read-lock	write-lock
read-lock		X
write-lock	X	X

X indicates incompatibility, i.e, a case when a lock of the first type (in left column) on an object blocks a lock of the second type (in top row) from being acquired on the same object (by another transaction). An object typically has a queue of waiting requested (by transactions) operations with respective locks. The first blocked lock for operation in the queue is acquired as soon as the existing blocking lock is removed from the object, and then its respective operation is executed. If a lock for operation in the queue is not blocked by any existing lock (existence of multiple compatible locks on a same object is possible concurrently) it is acquired immediately.

Comment: In some publications the table entries are simply marked "compatible" or "incompatible", or respectively "yes" or "no".

According to the two-phase locking protocol a transaction handles its locks in two distinct, consecutive phases during the transaction's execution:

Expanding phase (aka Growing phase): locks are acquired and no locks are released (the number of locks can only increase).
Shrinking phase: locks are released and no locks are acquired.

The serializability

Serializability

property is guaranteed for a schedule with transactions that obey the protocol. The 2PL schedule class is defined as the class of all the schedules comprising transactions with data access orders that could be generated by the 2PL protocol (or in other words, all the schedules that the 2PL protocol can generate).

Typically, without explicit knowledge in a transaction on end of phase-1, it is safely determined only when a transaction has entered its ready state in all its processes (processing has ended, and it is ready to be committed; no additional data access and locking are needed and can happen). In this case phase-2 can end immediately (no additional processing is needed), and actually no phase-2 is needed. Also, if several processes (two or more) are involved, then a synchronization point (similar to atomic commitment) among them is needed to determine end of phase-1 for all of them (i.e., in the entire distributed transaction), to start releasing locks in phase-2 (otherwise it is very likely that both 2PL and Serializability are quickly violated). Such synchronization point is usually too costly (involving a distributed protocol similar to atomic commitment), and end of phase-1 is usually postponed to be merged with transaction end (atomic commitment protocol for a multi-process transaction), and again phase-2 is not needed. This turns 2PL to SS2PL (see below). All known implementations of 2PL in products are SS2PL based, and whenever 2PL (or Strict 2PL, S2PL) practical utilization has been mentioned in the professional literature, the intention has been SS2PL.

Strict two-phase locking

The strict two-phase locking (S2PL) class of schedules is the intersection of the 2PL class with the class of schedules possessing the Strictness property.

To comply with the S2PL protocol a transaction needs to comply with 2PL, and release its write (exclusive) locks only after it has ended, i.e., being either committed or aborted. On the other hand, read (shared) locks are released regularly during phase 2. Implementing general S2PL requires explicit support of phase-1 end, separate from transaction end, and no such widely utilized product implementation is known.

S2PL is a special case of 2PL, i.e., the S2PL class is a proper subclass of 2PL.

Strong strict two-phase locking

or Rigorousness, or Rigorous scheduling, or Rigorous two-phase locking

To comply with strong strict two-phase locking (SS2PL) the locking protocol releases both write (exclusive) and read (shared) locks applied by a transaction only after the transaction has ended, i.e., only after both completing executing (being ready) and becoming either committed or aborted. This protocol also complies with the S2PL rules. A transaction obeying SS2PL can be viewed as having phase-1 that lasts the transaction's entire execution duration, and no phase-2 (or a degenerate phase-2). Thus, only one phase is actually left, and "two-phase" in the name seems to be still utilized due to the historical development of the concept from 2PL, and 2PL being a super-class. The SS2PL property of a schedule is also called Rigorousness. It is also the name of the class of schedules having this property, and an SS2PL schedule is also called a "rigorous schedule". The term "Rigorousness" is free of the unnecessary legacy of "two-phase," as well as being independent of any (locking) mechanism (in principle other blocking mechanisms can be utilized). The property's respective locking mechanism is sometimes referred to as Rigorous 2PL.

SS2PL is a special case of S2PL, i.e., the SS2PL class of schedules is a proper subclass of S2PL (every SS2PL schedule is also an S2PL schedule, but S2PL schedules exist that are not SS2PL).

SS2PL has been the concurrency control protocol of choice for most database system

Database system

A database system is a term that is typically used to encapsulate the constructs of a data model, database Management system and database....

s and utilized since their early days in the 1970s. It is proven to be an effective mechanism in many situations, and provides besides Serializability

Serializability

also Strictness (a special case of cascadeless Recoverability), which is instrumental for efficient database

Database

recovery

Data recovery

Data recovery is the process of salvaging data from damaged, failed, corrupted, or inaccessible secondary storage media when it cannot be accessed normally. Often the data are being salvaged from storage media such as internal or external hard disk drives, solid-state drives , USB flash drive,...

, and also Commitment ordering

Commitment ordering

(CO) for participating in distributed environments where a CO based distributed serializability and global serializability

Global serializability

In concurrency control of databases, transaction processing , and other transactional distributed applications, Global serializability is a property of a global schedule of transactions...

solutions are employed. Being a subset of CO, an efficient implementation of distributed SS2PL exists without a distributed lock manager

Distributed lock manager

A distributed lock manager provides distributed software applications with a means to synchronize their accesses to shared resources....

(DLM), while distributed deadlocks (see below) are resolved automatically. The fact that SS2PL employed in multi database systems ensures global serializability has been known for years before the discovery of CO, but only with CO came the understanding of the role of an atomic commitment protocol in maintaining global serializability, as well as the observation of automatic distributed deadlock resolution (see a detailed example of Distributed SS2PL). As a matter of fact, SS2PL inheriting properties of Recoverability and CO is more significant than being a subset of 2PL, which by itself in its general form, besides comprising a simple serializability mechanism (however serializability is also implied by CO), in not known to provide SS2PL with any other significant qualities. 2PL in its general form, as well as when combined with Strictness, i.e., Strict 2PL (S2PL), are not known to be utilized in practice. The popular SS2PL does not require to mark "end of phase-1" as 2PL and S2PL do, and thus is simpler to implement. Also unlike the general 2PL, SS2PL provides, as mentioned above, the useful Strictness and Commitment ordering

Commitment ordering

properties.

Many variants of SS2PL exist that utilize various lock types with various semantics in different situations, including cases of lock-type change during a transaction. Notable are variants that use Multiple granularity locking

Multiple granularity locking

In computer science, multiple granularity locking , sometimes called the John Rayner locking method, is a locking method used in database management systems and relational databases....

.

Comments:

SS2PL Vs. S2PL: Both provide Serializability and Strictness. Since S2PL is a super class of SS2PL it may, in principle, provide more concurrency. However no concurrency advantage is typically practically noticed (exactly same locking exists for both, with practically not much earlier lock release for S2PL), and the overhead of dealing with an end-of-phase-1 mechanism in S2PL, separate from transaction-end, is not justified. Also, while SS2PL provides Commitment ordering
Commitment ordering
In concurrency control of databases, transaction processing , and related applications, Commitment ordering is a class of interoperable Serializability techniques, both centralized and distributed. It allows optimistic implementations...

, S2PL does not. This explains the preference of SS2PL over S2PL.
Especially before 1990, but also after, in many articles and books, e.g., (Bernstein et al. 1987, p. 59), the term "Strict 2PL" (S2PL) has been frequently defined by the locking protocol "Release all locks only after transaction end," which is the protocol of SS2PL. Thus, "Strict 2PL" could not be there the name of the intersection of Strictness and 2PL, which is larger than the class generated by the SS2PL protocol. This has caused confusion. With an explicit definition of S2PL as the intersection of Strictness and 2PL, a new name for SS2PL, and an explicit distinction between the classes S2PL and SS2PL, the articles (Breitbart et al. 1991) and (Raz 1992) have intended to clear the confusion: The first using the name "Rigorousness," and the second "SS2PL."
A more general property than SS2PL exists (a schedule super-class), Strict commitment ordering (Strict CO, or SCO), which as well provides both serializability, strictness, and CO, and has similar locking overhead. Unlike SS2PL, SCO does not block upon a read-write conflict (a read-lock does not block acquiring a write-lock; both SCO and SS2PL have the same behavior for write-read and write-write conflicts) at the cost of a possible delayed commit, and upon such conflict type SCO has shorter average transaction completion time and better performance than SS2PL. While SS2PL obeys the lock compatibility table above, SCO has the following table:

{| class="wikitable" style="text-align:center;"

|+Lock compatibility for SCO
|-
! Lock type !! read-lock !! write-lock
|-
! read-lock
| ||
|-
! write-lock
| X || X
|}

Note that though SCO releases all locks at transaction end and complies with the 2PL locking rules, SCO is not a subset of 2PL because of its different lock compatibility table. SCO allows materialized read-write conflicts between two transactions in their phases 1, which 2PL does not allow in phase-1 (see about materialized conflicts in Serializability

Serializability

). On the other hand 2PL allows other materialized conflict types in phase-2 that SCO does not allow at all. Together this implies that the schedule classes 2PL and SCO are incomparable (i.e., no class contains the other class).

Summary - Relationships among classes

Between any two schedule classes (define by their schedules' respective properties) that have common schedules, either one contains the other (strictly contains if they are not equal), or they are incomparable. The containment relationships among the 2PL classes and other major schedule classes are summarized in the following diagram. 2PL and its subclasses are inherently blocking, which means that no optimistic implementations for them exist (and whenever "Optimistic 2PL" is mentioned it refers to a different mechanism with a class that includes also schedules not in the 2PL class).

Deadlocks in 2PL

Locks block data-access operations. Mutual blocking between transactions results in a deadlock
Deadlock
A deadlock is a situation where in two or more competing actions are each waiting for the other to finish, and thus neither ever does. It is often seen in a paradox like the "chicken or the egg"...

, where execution of these transactions is stalled, and no completion can be reached. Thus deadlocks need to be resolved to complete these transactions' executions and release related computing resources. A deadlock is a reflection of a potential cycle in the precedence graph, that would occur without the blocking. A deadlock is resolved by aborting a transaction involved with such potential cycle, and breaking the cycle. It is often detected using a wait-for graph
Wait-For Graph
A Wait-For Graph in computer science is a directed graph used for deadlock detection in operating systems and relational database systems.In Computer Science, a system that allows concurrent operation of multiple processes and locking of resources and which does not provide mechanisms to avoid or...

(a graph of conflicts blocked by locks from being materialized; conflicts not materialized in the database due to blocked operations are not reflected in the precedence graph and do not affect serializability
Serializability
In concurrency control of databases, transaction processing , and various transactional applications , both centralized and distributed, a transaction schedule is serializable, has the serializability property, if its outcome In concurrency control of databases, transaction processing (transaction...

), which indicates which transaction is "waiting for" lock release by which transaction, and a cycle means a deadlock. Aborting one transaction per cycle is sufficient to break the cycle. Transactions aborted due to deadlock resolution are executed again immediately.

In a distributed environment an atomic commitment protocol, typically the Two-phase commit (2PC) protocol, is utilized for atomicity. When recoverable data (data under transaction control) are partitioned among 2PC participants (i.e., each data object is controlled by a single 2PC participant), then distributed (global) deadlocks, deadlocks involving two or more participants in 2PC, are resolved automatically as follows:

When SS2PL is effectively utilized in a distributed environment, then global deadlocks due to locking generate voting-deadlocks in 2PC, and are resolved automatically by 2PC (see Commitment ordering

Commitment ordering

(CO), in Exact characterization of voting-deadlocks by global cycles; No reference except the CO articles is known to notice this). For the general case of 2PL, global deadlocks are similarly resolved automatically by the synchronization point protocol of phase-1 end in a distributed transaction (synchronization point is achieved by "voting" (notifying local phase-1 end), and being propagated to the participants in a distributed transaction the same way as a decision point in atomic commitment; in analogy to decision point in CO, a conflicting operation in 2PL cannot happen before phase-1 end synchronization point, with the same resulting voting-deadlock in the case of a global data-access deadlock; the voting-deadlock (which is also a locking based global deadlock) is automatically resolved by the protocol aborting some transaction involved, with a missing vote, typically using a timeout

Timeout (telecommunication)

In telecommunication and related engineering , the term timeout or time-out has several meanings, including...

).

Comment:

When data are partitioned among the atomic commitment protocol (e.g., 2PC) participants, automatic global deadlock resolution has been overlooked in the database research literature, though deadlocks in such systems has been a quite intensive research area:

For CO and its special case SS2PL, the automatic resolution by the atomic commitment protocol has been noticed only in the CO articles. However, it has been noticed in practice that in many cases global deadlocks are very infrequently detected by the dedicated resolution mechanisms, less than could be expected ("Why do we see so few global deadlocks?"). The reason is probably the deadlocks that are automatically resolved and thus not handled and uncounted by the mechanisms;
For 2PL in general, the automatic resolution by the (mandatory) end-of-phase-one synchronization point protocol (which has same voting mechanism as atomic commitment protocol, and same missing vote handling upon voting deadlock, resulting in global deadlock resolution) has not been mentioned until today (2009). Practically only the special case SS2PL is utilized, where no end-of-phase-one synchronization is needed in addition to atomic commit protocol.

In a distributed environment where recoverable data are not partitioned among atomic commitment protocol participants, no such automatic resolution exists, and distributed deadlocks need to be resolved by dedicated techniques.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.