Multiprocessing
Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system. The term also refers to the ability of a system to support more than one processor and/or the ability to allocate tasks between them. There are many variations on this basic theme, and the definition of multiprocessing can vary with context, mostly as a function of how CPUs are defined (multiple cores on one die, multiple dies in one package, multiple packages in one system unit, etc.).
Multiprocessing sometimes refers to the execution of multiple concurrent software processes in a system, as opposed to a single process at any one instant. However, the terms multitasking or multiprogramming are more appropriate to describe this concept, which is implemented mostly in software, whereas multiprocessing is more appropriate to describe the use of multiple hardware CPUs. A system can be both multiprocessing and multiprogramming, only one of the two, or neither.
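As a rough illustration of the distinction, a POSIX program can query how many processors are online and launch one worker process per processor; the number of software processes (multiprogramming) is independent of the number of hardware CPUs (multiprocessing). A minimal sketch, assuming a Unix-like system with fork() and the common _SC_NPROCESSORS_ONLN extension to sysconf():

    /* Sketch: count online CPUs and fork one worker process per CPU.
       POSIX/Unix-specific; error handling kept minimal. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void) {
        long ncpus = sysconf(_SC_NPROCESSORS_ONLN);   /* CPUs currently online */
        if (ncpus < 1) ncpus = 1;

        for (long i = 0; i < ncpus; i++) {
            pid_t pid = fork();                       /* create a new process */
            if (pid == 0) {                           /* child: do some work  */
                printf("worker %ld running (pid %d)\n", i, (int)getpid());
                _exit(0);
            }
        }
        while (wait(NULL) > 0)                        /* parent: reap children */
            ;
        printf("launched %ld worker(s) on %ld CPU(s)\n", ncpus, ncpus);
        return 0;
    }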
Processor symmetry
In a multiprocessing system, all CPUs may be equal, or some may be reserved for special purposes. A combination of hardware and operating-system software design considerations determines the symmetry (or lack thereof) in a given system. For example, hardware or software considerations may require that only one CPU respond to all hardware interrupts, whereas all other work in the system may be distributed equally among CPUs; or execution of kernel-mode code may be restricted to only one processor (either a specific processor, or only one processor at a time), whereas user-mode code may be executed in any combination of processors. Multiprocessing systems are often easier to design if such restrictions are imposed, but they tend to be less efficient than systems in which all CPUs are utilized.

Systems that treat all CPUs equally are called symmetric multiprocessing (SMP) systems. In systems where all CPUs are not equal, system resources may be divided in a number of ways, including asymmetric multiprocessing (ASMP), non-uniform memory access (NUMA) multiprocessing, and clustered multiprocessing.
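Operating systems expose such asymmetry to software through mechanisms like processor affinity. A minimal sketch, assuming Linux and its sched_setaffinity() call, of restricting a process to a single designated CPU:

    /* Sketch: pin the calling process to CPU 0 (Linux-specific), so that all
       of its work runs on that one processor. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        cpu_set_t set;
        CPU_ZERO(&set);            /* start with an empty CPU mask */
        CPU_SET(0, &set);          /* allow only CPU 0             */

        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("now restricted to CPU 0\n");
        /* ... work placed here executes only on processor 0 ... */
        return 0;
    }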
Instruction and data streams
In multiprocessing, the processors can be used to execute a single sequence of instructions in multiple contexts (single-instruction, multiple-data or SIMD, often used in vector processing), multiple sequences of instructions in a single context (multiple-instruction, single-data or MISD, used for redundancy in fail-safe systems and sometimes applied to describe pipelined processors or hyper-threading), or multiple sequences of instructions in multiple contexts (multiple-instruction, multiple-data or MIMD).
Processor coupling
Tightly-coupled multiprocessor systems contain multiple CPUs that are connected at the bus level. These CPUs may have access to a central shared memory (SMP or UMA, uniform memory access), or may participate in a memory hierarchy with both local and shared memory (NUMA). The IBM p690 Regatta is an example of a high-end SMP system. Intel Xeon processors dominated the multiprocessor market for business PCs and were the only x86 option until the release of AMD's Opteron range of processors in 2004. Both ranges of processors had their own onboard cache but provided access to shared memory; the Xeon processors via a common pipe and the Opteron processors via independent pathways to the system RAM.
Chip multiprocessors, also known as multi-core computing, involve more than one processor placed on a single chip and can be thought of as the most extreme form of tightly-coupled multiprocessing. Mainframe systems with multiple processors are often tightly-coupled.
Loosely-coupled multiprocessor systems (often referred to as clusters) are based on multiple standalone single- or dual-processor commodity computers interconnected via a high-speed communication system (Gigabit Ethernet is common). A Linux Beowulf cluster is an example of a loosely-coupled system.
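Such clusters are typically programmed with explicit message passing rather than shared memory; MPI is a common choice on Beowulf-style systems. A minimal sketch, assuming an MPI implementation (e.g., Open MPI or MPICH) is installed, compiled with mpicc and launched with mpirun:

    /* Sketch: each process in the cluster job reports its rank. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);                  /* join the parallel job     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's identifier */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */
        printf("process %d of %d reporting\n", rank, size);
        MPI_Finalize();                          /* leave the parallel job    */
        return 0;
    }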
Tightly-coupled systems perform better and are physically smaller than loosely-coupled systems, but have historically required greater initial investments and may depreciate rapidly; nodes in a loosely-coupled system are usually inexpensive commodity computers and can be recycled as independent machines upon retirement from the cluster.
Power consumption is also a consideration. Tightly-coupled systems tend to be much more energy efficient than clusters. This is because considerable economy can be realized by designing components to work together from the beginning in tightly-coupled systems, whereas loosely-coupled systems use components that were not necessarily intended specifically for use in such systems.
SISD multiprocessing
In a single instruction stream, single data stream (SISD) computer, one processor sequentially processes instructions, and each instruction processes one data item. One example is the classic "von Neumann" architecture, as in a RISC uniprocessor.
SIMD multiprocessing
In a single instruction stream, multiple data stream (SIMD) computer, one processor handles a stream of instructions, each of which can perform calculations in parallel on multiple data locations.
SIMD multiprocessing is well suited to parallel or vector processing, in which a very large set of data can be divided into parts that are individually subjected to identical but independent operations. A single instruction stream directs the operation of multiple processing units to perform the same manipulations simultaneously on potentially large amounts of data.
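The single-instruction, multiple-data pattern can be sketched with the vector extensions offered by compilers such as GCC and Clang (a compiler-specific feature, used here only to make the one-operation-on-many-lanes idea visible):

    /* Sketch: one addition operates on four packed floats at once. */
    #include <stdio.h>

    typedef float v4sf __attribute__((vector_size(16)));  /* 4 x 32-bit floats */

    int main(void) {
        v4sf a = {1.0f, 2.0f, 3.0f, 4.0f};
        v4sf b = {10.0f, 20.0f, 30.0f, 40.0f};
        v4sf c = a + b;        /* a single "instruction" over four data lanes */
        for (int i = 0; i < 4; i++)
            printf("%.1f\n", c[i]);
        return 0;
    }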
For certain types of computing applications, this type of architecture can produce enormous increases in performance, in terms of the elapsed time required to complete a given task. However, a drawback to this architecture is that a large part of the system falls idle when programs or system tasks are executed that cannot be divided into units that can be processed in parallel.
Additionally, programs must be carefully and specially written to take maximum advantage of the architecture, and often special optimizing compilers designed to produce code specifically for this environment must be used. Some compilers in this category provide special constructs or extensions to allow programmers to directly specify operations to be performed in parallel (e.g., DO FOR ALL statements in the version of FORTRAN used on the ILLIAC IV, which was a SIMD multiprocessing supercomputer).
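A modern analogue of such constructs is the compiler directive; for example, OpenMP lets the programmer mark a loop whose iterations may be executed in parallel or vectorized, broadly in the spirit of DO FOR ALL. A minimal sketch, assuming a compiler with OpenMP support (e.g., built with -fopenmp):

    /* Sketch: an element-wise operation marked for parallel/SIMD execution.
       Each iteration is independent, so the compiler and runtime may split
       the loop across processors and vector lanes. */
    #include <stdio.h>

    #define N 8

    int main(void) {
        float a[N], b[N], c[N];
        for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 10.0f * i; }

        #pragma omp parallel for simd      /* "do for all i" in parallel */
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        for (int i = 0; i < N; i++)
            printf("%.1f\n", c[i]);
        return 0;
    }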
SIMD multiprocessing finds wide use in certain domains such as computer simulation, but is of little use in general-purpose desktop and business computing environments.
MISD multiprocessing
MISD multiprocessing offers mainly the advantage of redundancy, since multiple processing units perform the same tasks on the same data, reducing the chances of incorrect results if one of the units fails. MISD architectures may involve comparisons between processing units to detect failures. Apart from the redundant and fail-safe character of this type of multiprocessing, it has few advantages and is very expensive; it does not improve performance. It can be implemented in a way that is transparent to software. It is used in array processors and in fault-tolerant machines.

MIMD multiprocessing
MIMD multiprocessing architecture is suitable for a wide variety of tasks in which completely independent and parallel execution of instructions touching different sets of data can be put to productive use. For this reason, and because it is easy to implement, MIMD predominates in multiprocessing.

Processing is divided into multiple threads, each with its own hardware processor state, within a single software-defined process or within multiple processes. Insofar as a system has multiple threads awaiting dispatch (either system or user threads), this architecture makes good use of hardware resources.
MIMD does raise issues of deadlock and resource contention, however, since threads may collide in their access to resources in an unpredictable way that is difficult to manage efficiently. MIMD requires special coding in the operating system of a computer but does not require application changes unless the programs themselves use multiple threads (MIMD is transparent to single-threaded programs under most operating systems, if the programs do not voluntarily relinquish control to the OS). Both system and user software may need to use software constructs such as semaphores (also called locks or gates) to prevent one thread from interfering with another if they should happen to cross paths in referencing the same data. This gating or locking process increases code complexity, lowers performance, and greatly increases the amount of testing required, although not usually enough to negate the advantages of multiprocessing.
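A minimal sketch of such gating, using POSIX threads and an unnamed semaphore as the lock (one of several possible primitives; error handling omitted for brevity):

    /* Sketch: two threads increment a shared counter; a POSIX semaphore used
       as a lock ensures they never update the counter at the same time. */
    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static long counter = 0;
    static sem_t gate;                      /* the "gate" guarding counter */

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            sem_wait(&gate);                /* acquire the gate */
            counter++;                      /* critical section */
            sem_post(&gate);                /* release the gate */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        sem_init(&gate, 0, 1);              /* binary semaphore, initially open */
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld (expected 200000)\n", counter);
        sem_destroy(&gate);
        return 0;
    }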
Similar conflicts can arise at the hardware level between processors (cache contention and corruption, for example), and must usually be resolved in hardware, or with a combination of software and hardware (e.g., cache-clear instructions).
See also
- 3B20C
- Symmetric multiprocessing
- Symmetric multiprocessor
- Asymmetric multiprocessing
- Multi-core (computing)
- BMDFM (Binary Modular Dataflow Machine), an SMP MIMD runtime environment
- Software lockout
- OpenHMPP, an HPC open standard for manycore programming