Thread-safe
Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it manipulates shared data structures only in a manner that guarantees safe execution by multiple threads at the same time. There are various strategies for making thread-safe data structures.
Software libraries can provide certain thread-safety guarantees. For example, concurrent reads might be guaranteed to be thread-safe, but concurrent writes might not be. Whether or not a program using such a library is thread-safe depends on whether it uses the library in a manner consistent with those guarantees.
A key challenge in multi-threaded programming, thread safety was not a concern for most application developers until the 1990s, when operating systems began to expose multiple threads for code execution. Today, a program may execute code on several threads simultaneously in a shared address space where each of those threads has access to virtually all of the memory of every other thread. Thus the flow of control and the sequence of accesses to data often have little relation to what would be reasonably expected by looking at the text of the program, violating the principle of least astonishment. Thread safety is a property that allows code to run in multi-threaded environments by re-establishing some of the correspondences between the actual flow of control and the text of the program, by means of synchronization.
Identification
It is not easy to determine whether or not a piece of code is thread-safe. However, there are several indicators that suggest the need for careful examination to see if it is unsafe:
- accessing global variables or the heap
- allocating/reallocating/freeing resources that have global scope (files, sub-processes, pipes, etc.)
- indirect accesses through handles or pointers
- any visible side-effect (e.g., access to volatile variables in the C programming language)
All this can be subsumed under manipulation of global state, which is not necessarily restricted to a single program/process.
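As a hedged illustration of the first indicator, the sketch below shows a hypothetical C function, `format_id_shared`, that returns a pointer into a `static` buffer. Every caller shares that one buffer, so the function manipulates global state and deserves the careful examination described above. The variant `format_id_local` writes into caller-supplied storage instead. Both names and the `"id-%d"` format are invented for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical example: the result lives in one static buffer shared by
 * every caller, so two threads calling this at once may overwrite each
 * other's text. */
const char *format_id_shared(int id)
{
    static char buffer[32];          /* global state hidden inside the function */
    snprintf(buffer, sizeof buffer, "id-%d", id);
    return buffer;
}

/* Variant without shared state: the caller owns the buffer. */
const char *format_id_local(int id, char *out, size_t out_size)
{
    snprintf(out, out_size, "id-%d", id);
    return out;
}
```

Even in a single thread, a second call to `format_id_shared` replaces the text returned by the first call, which hints at what concurrent callers could do to each other.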
Implementation
There are a few ways to achieve thread safety:
- Re-entrancy: Writing code in such a way that it can be partially executed by a thread, re-executed by the same thread, or simultaneously executed by another thread, and still correctly complete the original execution. This requires saving state information in variables local to each execution, usually on a stack, instead of in static or global variables or other non-local state. There are still some rare cases where non-local state can be used in a reentrant function, if the access is done through atomic operations and the data structure is re-entrancy safe.
- Mutual exclusion or process synchronization: Access to shared data is serialized using mechanisms that ensure only one thread reads or writes the shared data at any time. Great care is required if a piece of code accesses multiple shared pieces of data; problems include race conditions, deadlocks, livelocks and starvation.
- Thread-local storage: Variables are localized so that each thread has its own private copy. These variables retain their values across subroutine and other code boundaries, and are thread-safe since they are local to each thread, even though the code which accesses them might be executed simultaneously by another thread.
- Atomic operations: Shared data are accessed by using atomic operations, which cannot be interrupted by other threads. This usually requires using special machine language instructions, which might be available in a runtime library. Since the operations are atomic, the shared data are always kept in a valid state, no matter what other threads access them. Atomic operations form the basis of many thread locking mechanisms.
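The four approaches can be sketched side by side in C. This is a minimal illustration, assuming POSIX threads and a C11 compiler that provides `<stdatomic.h>` and `_Thread_local`. All function and variable names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Re-entrancy: all state lives in parameters and locals on the stack. */
long sum_to(long n)
{
    long total = 0;
    for (long i = 1; i <= n; ++i)
        total += i;
    return total;
}

/* Mutual exclusion: a lock serializes every access to shared_total. */
static long shared_total = 0;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

void add_with_mutex(long amount)
{
    pthread_mutex_lock(&total_lock);
    shared_total += amount;
    pthread_mutex_unlock(&total_lock);
}

long read_with_mutex(void)
{
    pthread_mutex_lock(&total_lock);
    long value = shared_total;
    pthread_mutex_unlock(&total_lock);
    return value;
}

/* Thread-local storage: each thread gets its own private copy. */
static _Thread_local long calls_in_this_thread = 0;

long count_call_in_this_thread(void)
{
    return ++calls_in_this_thread;
}

/* Atomic operations: the addition is indivisible, so no lock is needed. */
static atomic_long atomic_total = 0;

void add_atomically(long amount)
{
    atomic_fetch_add(&atomic_total, amount);
}

long read_atomically(void)
{
    return atomic_load(&atomic_total);
}
```

The mutex and atomic variants both protect a single shared total. The lock generalizes to updates that span several variables, while the atomic version avoids blocking but only covers one object at a time.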
Examples
In the following piece of C code, the function is thread-safe, but not reentrant:
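A minimal sketch of a function with this property, assuming POSIX threads and using the hypothetical name `increment_counter`, might look like the following. It is an illustration under those assumptions, not necessarily the listing the article originally showed. A default (non-recursive) mutex guards a static counter, so concurrent callers are serialized, but a call that interrupts the function while the mutex is held would wait on that same mutex.

```c
#include <pthread.h>

/* Hypothetical example: thread-safe, but not reentrant. */
int increment_counter(void)
{
    static int counter = 0;
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    pthread_mutex_lock(&mutex);      /* only one thread may pass at a time */
    ++counter;
    int result = counter;            /* copy the value while still holding the lock */
    pthread_mutex_unlock(&mutex);

    return result;
}
```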
In the above, the function can be called by different threads without any problem. But if the function is used in a reentrant interrupt handler and a second interrupt arises inside the function, the second routine will hang forever. As interrupt servicing can disable other interrupts, the whole system could suffer.
Concurrent programming
Note that a piece of code can be thread-safe, and yet unable to run at the same time that some other piece of code is running. A trivial example of that is when that other piece of code restarts the computer.
The following piece of C code presents a less obvious situation where a thread is using a file that another thread or process might delete.
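As a hedged sketch of such a situation, assuming a POSIX system that provides `access()`, the hypothetical helper `open_config_if_present` first checks that a file exists and then opens it. The name, the return convention, and the two-step structure are assumptions made for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the file could be opened, 0 otherwise. */
int open_config_if_present(const char *path)
{
    if (access(path, F_OK) != 0)     /* step 1: check that the file exists */
        return 0;

    /* Another thread or process may delete the file right here. */

    FILE *config = fopen(path, "r"); /* step 2: use the file */
    if (config == NULL)
        return 0;                    /* the file vanished between the two steps */

    fclose(config);
    return 1;
}
```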
In the above, the function is thread-safe, as it can be called from any number of threads and will not fail. But all the calls should be made in a controlled environment. If the code is executed in a multi-process environment, or if the file is stored on a network-shared drive, there is no guarantee that the file won't be deleted.
Difficulties
One approach to making data thread-safe that combines several of the above elements is to make changes atomically to update the shared data. Thus, most of the code is concurrent, and little time is spent serialized.
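One way to read this, sketched under the assumption that C11 `<stdatomic.h>` is available: each thread computes its proposed value privately, then publishes it with a single compare-and-swap and retries only if another thread changed the shared value in the meantime. The names `shared_max`, `update_max`, and `current_max` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int shared_max = 0;

/* Raise shared_max to candidate if candidate is larger.
 * Returns true if this call changed the value. */
bool update_max(int candidate)
{
    int seen = atomic_load(&shared_max);
    while (candidate > seen) {
        /* On failure, seen is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&shared_max, &seen, candidate))
            return true;
    }
    return false;
}

int current_max(void)
{
    return atomic_load(&shared_max);
}
```

Only the compare-and-swap itself is serialized. The comparison and any other preparation run concurrently in every thread.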
See also
- Control flow analysis
- Priority inversion
- Concurrency control
- Exception safety
- Communicating sequential processes
- a technique for analyzing concurrency