Dryad (programming)
Encyclopedia
Dryad is an ongoing research project at Microsoft Research
for a general purpose runtime for execution of data parallel applications. An application written for Dryad is modeled as a directed acyclic graph
(DAG). The DAG defines the dataflow
of the application, and the vertices of the graph defines the operations that are to be performed on the data. The "computational vertices" are written using sequential constructs, devoid of any concurrency
or mutual exclusion
semantics. The Dryad runtime parallelizes the dataflow graph by distributing the computational vertices across various execution engines (which can be multiple processor cores on the same computer or different physical computers connected by a network, as in a cluster
). Scheduling of the computational vertices on the available hardware is handled by the Dryad runtime, without any explicit intervention by the developer of the application or administrator of the network. The flow of data between one computational vertex to another is implemented by using communication "channels" between the vertices, which in physical implementation is realized by TCP/IP streams, shared memory
or temporary files. A stream is used at runtime to transport a finite number of structured
Items.
Dryad defines a domain-specific language, which is implemented via a C++
library, that is used to create and model a Dryad execution graph. Computational vertices are written using standard C++ constructs. To make them accessible to the Dryad runtime, they must be encapsulated in a class that inherits
from the
wrappers for the Dryad API can also be written.
There exist several high-level language compilers which use Dryad as a runtime; examples include PSQL, Microsoft Scope and DryadLINQ.
Microsoft Research
Microsoft Research is the research division of Microsoft created in 1991 for developing various computer science ideas and integrating them into Microsoft products. It currently employs Turing Award winners C.A.R. Hoare, Butler Lampson, and Charles P...
for a general purpose runtime for execution of data parallel applications. An application written for Dryad is modeled as a directed acyclic graph
Directed acyclic graph
In mathematics and computer science, a directed acyclic graph , is a directed graph with no directed cycles. That is, it is formed by a collection of vertices and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of...
(DAG). The DAG defines the dataflow
Dataflow
Dataflow is a term used in computing, and may have various shades of meaning. It is closely related to message passing.-Software architecture:...
of the application, and the vertices of the graph defines the operations that are to be performed on the data. The "computational vertices" are written using sequential constructs, devoid of any concurrency
Concurrent computing
Concurrent computing is a form of computing in which programs are designed as collections of interacting computational processes that may be executed in parallel...
or mutual exclusion
Mutual exclusion
Mutual exclusion algorithms are used in concurrent programming to avoid the simultaneous use of a common resource, such as a global variable, by pieces of computer code called critical sections. A critical section is a piece of code in which a process or thread accesses a common resource...
semantics. The Dryad runtime parallelizes the dataflow graph by distributing the computational vertices across various execution engines (which can be multiple processor cores on the same computer or different physical computers connected by a network, as in a cluster
Cluster Computing
Cluster Computing: the Journal of Networks, Software Tools and Applications is a journal for parallel processing, distributed computing systems, and computer communication networks....
). Scheduling of the computational vertices on the available hardware is handled by the Dryad runtime, without any explicit intervention by the developer of the application or administrator of the network. The flow of data between one computational vertex to another is implemented by using communication "channels" between the vertices, which in physical implementation is realized by TCP/IP streams, shared memory
Shared memory
In computing, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Depending on context, programs may run on a single processor or on multiple separate processors...
or temporary files. A stream is used at runtime to transport a finite number of structured
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...
Items.
Dryad defines a domain-specific language, which is implemented via a C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...
library, that is used to create and model a Dryad execution graph. Computational vertices are written using standard C++ constructs. To make them accessible to the Dryad runtime, they must be encapsulated in a class that inherits
Inheritance
Inheritance is the practice of passing on property, titles, debts, rights and obligations upon the death of an individual. It has long played an important role in human societies...
from the
GraphNode
base class. The graph is defined by adding edges; edges are added by using a composition operator (defined by Dryad) that connects two graphs (or two nodes of a graph) with an edge. Managed codeManaged code
Managed code is a term coined by Microsoft to identify computer program code that requires and will only execute under the "management" of a Common Language Runtime virtual machine ....
wrappers for the Dryad API can also be written.
There exist several high-level language compilers which use Dryad as a runtime; examples include PSQL, Microsoft Scope and DryadLINQ.