Cluster manager
A cluster manager is usually backend software, with a graphical or command-line interface, that runs on one or all nodes of a cluster; in some cases it runs on a separate server or on a cluster of management servers. The cluster manager works together with cluster management agents. These agents run on each node of the cluster to manage and configure a service, a set of services, or the complete cluster server itself (see supercomputing). In some cases the cluster manager is mostly used to dispatch work for the cluster (or cloud) to perform. In that case, part of the cluster manager can be a remote desktop application that is used not for configuration but only to send work to the cluster and retrieve the results. In other cases the cluster exists mainly to provide high availability and load balancing rather than computation or a specific service.
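The division of labour between a manager and its per-node agents can be illustrated with a minimal Python sketch. It is not modelled on any of the products listed in this article: the names `NodeAgent` and `ClusterManager`, the heartbeat timeout, and the "fewest completed jobs" placement rule are all assumptions made for illustration. Each agent reports liveness and runs work on its node; the manager dispatches each job to a healthy node, which touches on both the work-dispatch and the availability and load-balancing roles described above.

```python
import time
from dataclasses import dataclass, field


@dataclass
class NodeAgent:
    """Hypothetical per-node agent: reports liveness and runs dispatched work."""
    name: str
    last_heartbeat: float = field(default_factory=time.monotonic)
    completed: list = field(default_factory=list)

    def heartbeat(self, now=None):
        # Record that this node was alive at time `now` (seconds).
        self.last_heartbeat = time.monotonic() if now is None else now

    def run(self, job):
        # Execute the job (any zero-argument callable) and keep its result.
        result = job()
        self.completed.append(result)
        return result


class ClusterManager:
    """Hypothetical manager: tracks agents and dispatches work to healthy nodes."""

    def __init__(self, agents, timeout=5.0):
        self.agents = list(agents)
        self.timeout = timeout  # assumed: seconds without a heartbeat before a node is skipped

    def healthy(self, now=None):
        # A node counts as healthy if its last heartbeat is recent enough.
        now = time.monotonic() if now is None else now
        return [a for a in self.agents if now - a.last_heartbeat <= self.timeout]

    def dispatch(self, job, now=None):
        # Simple load balancing: pick the healthy node with the fewest completed jobs.
        candidates = self.healthy(now)
        if not candidates:
            raise RuntimeError("no healthy nodes available")
        target = min(candidates, key=lambda a: len(a.completed))
        return target.name, target.run(job)
```

In this sketch, `dispatch` returns the name of the node that ran the job together with the job's result, so a client that only submits work never needs to configure the nodes. A real cluster manager would talk to its agents over the network and use a more careful failure detector; the in-process objects here only show the shape of the interaction.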
Some free and open source software solutions
- Heartbeat, from Linux-HA
- IPVS, from Linux Virtual Server
- Keepalived, keepalived.sourceforge.net
- mysqlBind, from unixservice.com
- Nimbus, from Linux Labs
- oneSIS, from onesis.org
- Piranha, from Red Hat
- Project Kusu, from Platform Computing
- Ultra Monkey, from www.ultramonkey.org
- Surealived, surealived.sf.net
Some proprietary solutions
- Cluster Server, from Microsoft
- Bright Cluster Manager, from Bright Computing
- IBM Tivoli System Automation for Multiplatforms, from IBM
Cluster Management Papers
- Adaptive Control of Extreme-scale Stream Processing Systems. Proceedings of the 26th IEEE International Conference on Distributed Computing Systems.
- Design, implementation, and evaluation of the linear road benchmark on the stream processing core. Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data.
- Parallel Job Scheduling: A Status Report (2004). 10th Workshop on Job Scheduling Strategies for Parallel Processing, New York, NY, June 2004.
- Condor-G: A Computation Management Agent for Multi-Institutional Grids. Springer journal Cluster Computing, Volume 5, Number 3, July 2002.
- From clusters to the fabric: the job management perspective. Proceedings of the 2003 IEEE International Conference on Cluster Computing.
- An Overview of the Galaxy Management Framework for Scalable Enterprise Cluster Computing. IEEE International Conference on Cluster Computing (Cluster'00), 2000.
- Performance and Interoperability Issues in Incorporating Cluster Management Systems within a Wide-Area Network-Computing Environment. ACM/IEEE Supercomputing 2000: High Performance Networking and Computing.
- DIRAC: a scalable lightweight architecture for high throughput computing. Proceedings of the Fifth IEEE/ACM International Workshop on Grid Computing, 2004.
- AgentTeamwork: Coordinating grid-computing jobs with mobile agents. Springer journal Applied Intelligence, Volume 25, Number 2, October 2006.
- Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. UC Berkeley Tech Report, May 2010.
Autonomic Computing
- The Laundromat Model for Autonomic Cluster Computing. IEEE International Conference on Autonomic Computing (ICAC '06), 2006.
- Distributed Stream Management using Utility-Driven Self-Adaptive Middleware. Proceedings of the Second International Conference on Autonomic Computing (2005).
Fault Tolerance
- Fault-tolerance in the Borealis distributed stream processing system. Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data.
- A Global-State-Triggered Fault Injector for Distributed System Evaluation. IEEE Transactions on Parallel and Distributed Systems, July 2004.
- Job-Site Level Fault Tolerance for Cluster and Grid environments. IEEE International Conference on Cluster Computing (Cluster 2005).
- Fault Injection in Distributed Java Applications. 20th International Parallel and Distributed Processing Symposium (IPDPS 2006).
- Load balancing and fault tolerance in workstation clusters migrating groups of communicating processes. ACM SIGOPS Operating Systems Review, October 1995.