DIET
DIET is a piece of software for grid computing. As middleware, DIET sits between the operating system (which handles the details of the hardware) and the application software (which deals with the specific computational task at hand). DIET was created in 2000. It was designed for high-performance computing. It is currently developed by INRIA, École Normale Supérieure de Lyon, CNRS, Claude Bernard University Lyon 1 and SysFera. It is open-source software released under the CeCILL license.
Like NetSolve/GridSolve and Ninf, DIET is compliant with the GridRPC standard from the Open Grid Forum.
The aim of the DIET project is to develop a set of tools to build computational servers. The distributed resources are managed in a transparent way through the middleware. It can work with workstations, clusters, Grids and Clouds.
DIET is used to manage the Décrypthon Grid installed by IBM in 6 French universities (Bordeaux 1, Lille 1, Paris 6, ENS Lyon, Crihan in Rouen, Orsay).
Architecture
Usually, GridRPC environments have five different components: clients that submit problems to servers, servers that solve the problems sent by clients, a database that contains information about software and hardware resources, a scheduler that chooses an appropriate server depending on the problem sent and the information contained in the database, and monitors that get information about the status of the computational resources.
DIET's architecture follows a different design. It is composed of:
- a client - the application that uses DIET to solve problems. Clients can connect to DIET from a web page or through an API or compiled program.
- a Master Agent (MA) that receives computation requests from clients. The MA then collects computation abilities from the servers and chooses one based on scheduling criteria. The reference of the chosen server is returned to the client. A client can be connected to an MA by a specific name server or a web page that stores the various MA locations.
- a Local Agent (LA) that aims at transmitting requests and information between MAs and servers. The information stored on an LA is the list of requests and, for each of its subtrees, the number of servers that can solve a given problem and information about the data distributed in this subtree. Depending on the underlying network topology, a hierarchy of LAs may be deployed between an MA and the servers.
- a Server Daemon (SeD) that is the point of entry of a computational server. It manages a processor or a cluster. The information stored on a SeD is the list of the data available on a server (possibly with their distribution and the way to access them), the list of the problems that can be solved on it, and all the information concerning its load (e.g., CPU capacity, available memory).
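The request path through this hierarchy can be illustrated with a small Python sketch. It is a toy model, not DIET's API: the class names, the `collect` and `submit` methods, and the "least-loaded server wins" criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Union


@dataclass
class SeD:
    """Server Daemon: entry point of one computational server (toy model)."""
    name: str
    problems: Set[str]   # problems this server can solve
    cpu_load: float      # illustrative load figure, lower is better

    def collect(self, problem: str) -> List["SeD"]:
        # A SeD answers only if it can solve the requested problem.
        return [self] if problem in self.problems else []


@dataclass
class Agent:
    """Local Agent: relays requests between the level above and its subtree."""
    name: str
    children: List[Union["Agent", SeD]] = field(default_factory=list)

    def collect(self, problem: str) -> List[SeD]:
        # Gather candidate servers from every child subtree.
        found: List[SeD] = []
        for child in self.children:
            found.extend(child.collect(problem))
        return found


class MasterAgent(Agent):
    """Master Agent: receives client requests and returns a chosen server."""

    def submit(self, problem: str) -> Optional[SeD]:
        # Assumed scheduling criterion: pick the least-loaded candidate.
        candidates = self.collect(problem)
        return min(candidates, key=lambda s: s.cpu_load, default=None)


# Hypothetical platform: one MA, two LAs, three SeDs.
sed1 = SeD("sed1", {"dgemm"}, 0.8)
sed2 = SeD("sed2", {"dgemm", "fft"}, 0.2)
sed3 = SeD("sed3", {"fft"}, 0.1)
ma = MasterAgent("MA", [Agent("LA1", [sed1, sed2]), Agent("LA2", [sed3])])

chosen = ma.submit("dgemm")  # the client receives a reference to this server
```

In this sketch the client would then contact `chosen` directly, which mirrors the description above: the MA returns only the reference of the selected server.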
Multi-hierarchy
Two approaches were developed:
- a multi-MA extension was developed by the University of Franche-Comté. In this extension, the Master Agents are connected by a communication graph, and several DIET platforms are shared by interconnecting their respective Master Agents (MAs). Clients request available SeDs from their MA as usual. If the MA finds an available SeD able to resolve the problem, it returns its reference to the client. If it does not find a SeD, it forwards the request to other MAs, which can in turn forward it to others, and so on. When an MA finds a SeD that can resolve the client's request, it returns its reference to the client's MA, which returns the reference to the client. The client can then use that SeD to resolve its problem.
- a P2P multi-MA extension called DIET_j was also designed. The aggregation of different independent DIET hierarchies (a multi-hierarchy architecture) could be managed using the P2P paradigm. This approach was based on the JXTA-J2SE toolbox for the on-demand discovery and connection of MAs. This project is no longer maintained.
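The forwarding behaviour of the multi-MA extension can be sketched as a breadth-first search over the communication graph. This is a minimal illustration, not the extension's actual protocol: the function name `find_sed`, the dictionary layout, and the use of a visited set to stop a request from circulating forever are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def find_sed(start_ma: str,
             neighbours: Dict[str, List[str]],
             local_seds: Dict[str, Dict[str, Set[str]]],
             problem: str) -> Optional[str]:
    """Return the name of a SeD able to solve `problem`, or None.

    neighbours: MA name -> MAs it is connected to in the communication graph
    local_seds: MA name -> {SeD name -> set of problems that SeD can solve}
    """
    visited = {start_ma}
    queue = deque([start_ma])
    while queue:
        ma = queue.popleft()
        # First look in the MA's own hierarchy.
        for sed, problems in local_seds.get(ma, {}).items():
            if problem in problems:
                return sed
        # Otherwise forward the request to MAs that have not seen it yet.
        for other in neighbours.get(ma, []):
            if other not in visited:
                visited.add(other)
                queue.append(other)
    return None


# Hypothetical platform: three MAs in a chain, MA2 has no local servers.
neighbours = {"MA1": ["MA2"], "MA2": ["MA1", "MA3"], "MA3": ["MA2"]}
local_seds = {
    "MA1": {"sedA": {"fft"}},
    "MA2": {},
    "MA3": {"sedC": {"dgemm"}},
}

remote = find_sed("MA1", neighbours, local_seds, "dgemm")  # "sedC"
local = find_sed("MA1", neighbours, local_seds, "fft")     # "sedA"
```

A request for `dgemm` issued at `MA1` is forwarded through `MA2` to `MA3`, and the reference found there travels back to the client's own MA.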
Workflow Management
For workflow management, DIET uses an additional entity called MA DAG. This entity can work in two modes: one in which it defines a complete scheduling of the workflow (ordering and mapping), and one in which it defines only an ordering for the workflow execution. Mapping is then done in the next step by the client, using the Master Agent to find the server where the workflow services should be run.
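The difference between the two modes can be illustrated with a short Python sketch built on the standard library's `graphlib`. It is not the MA DAG implementation: the function names, the dependency dictionary, and the `find_server` callback (a stand-in for asking the Master Agent) are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set, Tuple


def order_workflow(deps: Dict[str, Set[str]]) -> List[str]:
    """Ordering only: return one valid execution order for the DAG.

    deps maps each task to the set of tasks it depends on.
    """
    return list(TopologicalSorter(deps).static_order())


def schedule_workflow(deps: Dict[str, Set[str]],
                      find_server: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Complete scheduling: ordering plus a task-to-server mapping."""
    return [(task, find_server(task)) for task in order_workflow(deps)]


# Hypothetical four-task workflow: split, two independent solves, merge.
deps = {
    "split": set(),
    "solve_a": {"split"},
    "solve_b": {"split"},
    "merge": {"solve_a", "solve_b"},
}

order = order_workflow(deps)

# Stand-in for the mapping step; a real client would query the Master Agent.
servers = {"split": "sed1", "solve_a": "sed2", "solve_b": "sed3", "merge": "sed1"}
plan = schedule_workflow(deps, servers.__getitem__)
```

In the ordering-only mode the client receives something like `order` and performs the mapping itself; in the complete mode it receives something like `plan`.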
Scheduling
DIET provides a degree of control over the scheduling subsystem via plug-in schedulers. When a service request from an application arrives at a SeD, the SeD creates a performance-estimation vector, a collection of performance-estimation values that are pertinent to the scheduling process for that application. The values to be stored in this structure can be either values provided by CoRI (Collectors of Resource Information) or custom values generated by the SeD itself. The design of the estimation vector's subsystem is modular.
CoRI generates a basic set of performance-estimation values, which are stored in the estimation vector and identified by system-defined tags. Static information (such as the number of cores, the total memory, the number of bogomips and the hard drive speed) and dynamic information (such as the predicted time to solve a problem on the given resource and the average CPU load) is thus transferred from the Server Daemon to the scheduler agent to provide pertinent information for better scheduling. As mentioned above, these values are used in conjunction with DIET's application-driven scheduling: the Server Daemon, which has a better understanding of the application's needs, can request a specific scheduling that relies on the information stored in this vector.
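A rough Python sketch of the idea follows. It does not use DIET's real tags or plug-in interface: the tag names (`nb_cores`, `avg_cpu_load`, `predicted_time_s`), the helper functions and the scoring callbacks are invented for illustration.

```python
from typing import Callable, Dict, List, Optional

EstVector = Dict[str, float]  # tag -> performance-estimation value


def build_vector(static: EstVector,
                 dynamic: EstVector,
                 custom: Optional[EstVector] = None) -> EstVector:
    """Merge collector-style values with custom values supplied by the SeD."""
    vector = dict(static)
    vector.update(dynamic)
    vector.update(custom or {})
    return vector


def rank_servers(vectors: Dict[str, EstVector],
                 score: Callable[[EstVector], float]) -> List[str]:
    """Order servers by a pluggable score, lowest (best) first."""
    return sorted(vectors, key=lambda name: score(vectors[name]))


# Hypothetical estimation vectors sent back by two SeDs.
vectors = {
    "sed1": build_vector({"nb_cores": 8}, {"avg_cpu_load": 0.9},
                         {"predicted_time_s": 12.0}),
    "sed2": build_vector({"nb_cores": 4}, {"avg_cpu_load": 0.3},
                         {"predicted_time_s": 30.0}),
}

# A generic policy that only looks at the CPU load.
by_load = rank_servers(vectors, lambda v: v["avg_cpu_load"])

# An application-driven policy that trusts the SeD's own time prediction.
by_time = rank_servers(vectors, lambda v: v["predicted_time_s"])
```

The two policies disagree here: the load-based one prefers `sed2`, while the application-driven one prefers `sed1`, which is the kind of choice a plug-in scheduler lets the application express.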
DIET Data Management
Three different data managers have been integrated into DIET:
- DTM from the University of Franche-Comté (not maintained);
- JuxMEM from the IRISA (not maintained);
- DAGDA from École Normale Supérieure de Lyon.
DIET LRMS Management
Parallel resources are generally accessible through an LRMS (Local Resource Management System), also called a batch system. DIET provides an interface with several existing LRMS to execute jobs: LoadLeveler on IBM resources, OpenPBS, which is a fork of the well-known PBS system, and OAR, developed by IMAG at Grenoble and used on the Grid'5000 research grid. Most of the submitted jobs are parallel jobs, coded using the MPI standard with an implementation such as MPICH or LAM.
Cloud-resource management
A Cloud extension for DIET was created in 2009. DIET is thus able to access Cloud resources through two existing Cloud providers:
- Eucalyptus, which is open-source software developed by the University of California, Santa Barbara.
- Amazon Elastic Compute Cloud, which is a commercial service that is part of Amazon.com's cloud computing services.