Scalable parallelism
Software is said to exhibit scalable parallelism if it can make use of additional processors to solve larger problems, i.e., the term refers to software for which Gustafson's law holds.
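For reference, in the usual statement of Gustafson's law, if α denotes the fraction of the scaled execution time spent in the serial part of the program, the speedup attainable on P processors is

    S(P) = P - α(P - 1)

so that, when the problem size grows with the machine, the speedup grows nearly linearly in P.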
Consider a program whose execution time is dominated by one or more loops, each of which updates every element of an array; for example, the following finite difference heat equation stencil calculation:
for t := 0 to T do
    for i := 1 to N-1 do
        new(i) := (A(i-1) + A(i) + A(i) + A(i+1)) * .25   // explicit forward-difference with R = 0.25
    end
    for i := 1 to N-1 do
        A(i) := new(i)
    end
end
In the above code, we can execute all iterations of each "i" loop concurrently, i.e., turn each into a parallel loop. In such cases, it is often possible to make effective use of twice as many processors for a problem of array size 2N as for a problem of array size N.
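A minimal runnable sketch of the above, with each "i" loop expressed as a parallel loop using Java's parallel streams; the names N, T, a, and next are illustrative assumptions, not part of the original pseudocode:

import java.util.stream.IntStream;

public class HeatStencil {
    public static void main(String[] args) {
        final int N = 1_000_000;          // number of grid points (assumed)
        final int T = 100;                // number of time steps (assumed)
        final double[] a    = new double[N + 1];
        final double[] next = new double[N + 1];
        a[0] = 1.0;                       // an example boundary condition

        for (int t = 0; t < T; t++) {
            // first "i" loop: every iteration reads only the old array,
            // so all iterations may run concurrently
            IntStream.range(1, N).parallel()
                     .forEach(i -> next[i] = (a[i - 1] + a[i] + a[i] + a[i + 1]) * 0.25);
            // second "i" loop: embarrassingly parallel copy-back
            IntStream.range(1, N).parallel()
                     .forEach(i -> a[i] = next[i]);
        }
        System.out.println(a[N / 2]);     // print one value so the work is observable
    }
}

Doubling N roughly doubles the work available inside each parallel loop, which is what lets additional processors be used on the larger problem.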
As in this example, scalable parallelism is typically a form of data parallelism. This form of parallelism is often the target of automatic parallelization of loops.
Distributed computing systems and non-uniform memory access architectures are typically the most easily scaled to large numbers of processors, and thus would seem a natural target for software that exhibits scalable parallelism. However, applications with scalable parallelism may not have parallelism of sufficiently coarse grain to run effectively on such systems (unless the software is embarrassingly parallel).
In our example above, the second "i" loop is embarrassingly parallel, but in the first loop each iteration requires results produced by several iterations of the previous time step. Thus, for the first loop, parallelization may involve extensive communication or synchronization among processors, and will yield a net speedup only if such interactions have very low overhead, or if the code can be transformed to resolve this issue (e.g., by a combined scalable locality/scalable parallelism optimization).
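To make the synchronization cost concrete, here is a coarser-grained sketch in which each worker owns a contiguous block of the array and all workers synchronize once per phase of each time step; the names WORKERS, chunk, lo, and hi are assumptions for illustration, not a tuned implementation:

import java.util.concurrent.CyclicBarrier;

public class BlockedStencil {
    static final int N = 1_000_000, T = 100, WORKERS = 4;
    static final double[] a    = new double[N + 1];
    static final double[] next = new double[N + 1];

    public static void main(String[] args) throws InterruptedException {
        a[0] = 1.0;                                   // example boundary condition
        CyclicBarrier barrier = new CyclicBarrier(WORKERS);
        Thread[] threads = new Thread[WORKERS];
        int chunk = (N - 1) / WORKERS + 1;

        for (int w = 0; w < WORKERS; w++) {
            final int lo = 1 + w * chunk;
            final int hi = Math.min(N, lo + chunk);   // this worker owns indices [lo, hi)
            threads[w] = new Thread(() -> {
                try {
                    for (int t = 0; t < T; t++) {
                        for (int i = lo; i < hi; i++)
                            next[i] = (a[i - 1] + a[i] + a[i] + a[i + 1]) * 0.25;
                        barrier.await();              // all blocks computed before anyone copies back
                        for (int i = lo; i < hi; i++)
                            a[i] = next[i];
                        barrier.await();              // neighbours' boundary values ready for the next step
                    }
                } catch (Exception e) { throw new RuntimeException(e); }
            });
            threads[w].start();
        }
        for (Thread th : threads) th.join();
        System.out.println(a[N / 2]);
    }
}

On a distributed-memory system the two barriers would instead become an exchange of block-boundary values between neighbouring processors at every time step, which is why such interactions must have very low overhead for a net speedup.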
Languages
Ateji PX, an extension of Java making scalable parallelism possible on the Java Virtual Machine (JVM).