OpenSSI
OpenSSI is an open source single-system image clustering system. It allows a collection of computers to be treated as one large system, so that applications running on any one machine can access the resources of all the machines in the cluster.
OpenSSI is based on the Linux operating system and was released as an open source project by Compaq in 2001.
It is the final stage of a long process of development, stretching back to LOCUS, developed in the early 1980s.
Description
OpenSSI allows a cluster of individual computers (nodes) to be treated as one large system. Processes running on any node have full access to the resources of all nodes. Processes can be migrated from node to node automatically to balance system utilization. Inbound network connections can be directed to the least loaded node available.
OpenSSI is designed to be used for both high performance and high availability clusters. It is possible to create an OpenSSI cluster with no single point of failure. For example, the file system can be mirrored between two nodes, so that if one node crashes, the process accessing the file will fail over to the other node. Alternatively, the cluster can be designed so that every node has direct access to the file system.
Single Process space
OpenSSI provides a single process space: every process is visible from every node and can be managed from any node using the normal Linux commands (ps, kill, renice and so on). The Linux /proc virtual filesystem shows all running processes on all nodes.
The single process space is implemented using the VPROC abstraction invented by Locus for the OSF/1 AD operating system.
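Because the process space is exposed through /proc, ordinary tools that enumerate /proc would see processes from every node. The following is a minimal sketch of such an enumeration in Python; it uses only the standard convention that process directories in /proc are named by PID. The function name visible_pids and the proc_root parameter are illustrative and not part of OpenSSI.

```python
import os


def visible_pids(proc_root="/proc"):
    """Return the sorted PIDs that appear as numeric directories under proc_root."""
    pids = []
    for entry in os.listdir(proc_root):
        # Process directories in /proc are named by their decimal PID;
        # other entries (such as "self" or "meminfo") are skipped.
        if entry.isdecimal() and os.path.isdir(os.path.join(proc_root, entry)):
            pids.append(int(entry))
    return sorted(pids)
```

On a single machine this lists local processes only; on a cluster with a single process space the same code would, under the behaviour described above, return processes from all nodes.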
Migration
OpenSSI allows migration of running processes between nodes. Migrated processes continue to have access to any open files, IPC objects or network connections.
Processes can be migrated manually, either by the process calling the special OpenSSI migrate(2) system call, or by writing a node number to a special file in the process's /proc directory.
Processes may also, if the user wants, be migrated automatically in order to balance load across the cluster. OpenSSI uses an algorithm developed by the MOSIX project for determining the load on each node.
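The /proc-based form of manual migration can be sketched as a plain file write. The text above does not name the special file, so the sketch below takes it as a required parameter (control_file) rather than guessing; the function name request_migration and the check that node numbers are positive are likewise assumptions made for illustration.

```python
import os


def request_migration(pid, node, control_file, proc_root="/proc"):
    """Write `node` into a per-process control file to request a migration.

    `control_file` is the name of the special file in the process's /proc
    directory; it is a caller-supplied assumption, not a documented name.
    """
    if node < 1:
        # Assumption: nodes are numbered from 1 (as in /dev/1, /dev/2, ...).
        raise ValueError("node number must be positive")
    path = os.path.join(proc_root, str(pid), control_file)
    with open(path, "w") as f:
        f.write(f"{node}\n")
    return path
```

On a system without OpenSSI the write fails or has no effect, since the control file does not exist there.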
Single root
OpenSSI provides a single root for the cluster: from any node the same files and directories are available. OpenSSI uses several mechanisms to provide the single root: CFS (the OpenSSI Cluster File System), SAN cluster filesystems and parallel mounts of network file systems.
OpenSSI uses the context dependent symbolic link (CDSL) feature, inspired by HP's TruCluster system, to allow access to node-specific files in a manner transparent to applications that are not cluster-aware. A CDSL may point to different files on each node in the cluster.
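The idea of a CDSL can be illustrated as a link target containing a placeholder that is expanded differently on each node. The sketch below is a model of that idea only: the placeholder syntax "{nodenum}", the directory layout in the example and the function name resolve_cdsl are all hypothetical and are not taken from the OpenSSI sources.

```python
def resolve_cdsl(link_target, node, placeholder="{nodenum}"):
    """Expand a node placeholder in a symlink target for the given node.

    The placeholder syntax is an assumption chosen for illustration.
    """
    return link_target.replace(placeholder, str(node))


# The same link resolves to a different file on each node.
target = "/cluster/node{nodenum}/etc/hostname"  # hypothetical layout
on_node_1 = resolve_cdsl(target, 1)
on_node_2 = resolve_cdsl(target, 2)
```

An application that opens the link sees only an ordinary path, which is why it needs no knowledge of the cluster.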
CFS
CFS, the OpenSSI Cluster File System, provides transparent inter-node access to an underlying real file system on one node.
CFS is stacked on top of the real file system and co-ordinates accesses from different nodes using a token mechanism. One node has physical access to the underlying file system and performs all read and write operations. At any one time, one node owns a token representing a part of the underlying file; owning the token implies that that part of the file is in the cache of the owning node. If another node tries to access that part of the file, the token is stolen and the cache contents are copied to the stealing node. The OpenSSI CFS implementation is remarkably similar to that used by HP TruCluster.
CFS is also used to co-ordinate access to shared memory segments.
CFS can be used in a fault tolerant system by using shared disk subsystems (dual ported SCSI or SAN), or by using DRBD. If the node that is currently directly accessing the file system crashes, the CFS mount fails over to the other node that is directly connected to the disk, and the cluster then accesses the file system via that node.
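The token mechanism described above can be modelled in a few lines. The following is a toy, single-process simulation and not the CFS implementation: the class name TokenManager, the use of a (file, block) tuple as the region key and the load_from_disk callback are assumptions made for the sketch.

```python
class TokenManager:
    """Toy model of per-region tokens: at most one owner node per file region."""

    def __init__(self):
        self.owner = {}  # region -> node currently holding the token
        self.cache = {}  # (node, region) -> cached contents

    def access(self, node, region, load_from_disk):
        """Return the contents of `region` as seen by `node`, moving the token if needed."""
        current = self.owner.get(region)
        if current is None:
            # Nobody holds the token yet: fetch the data and grant the token.
            self.cache[(node, region)] = load_from_disk(region)
        elif current != node:
            # Steal the token: the old owner's cached contents move to the new owner.
            self.cache[(node, region)] = self.cache.pop((current, region))
        self.owner[region] = node
        return self.cache[(node, region)]

    def write(self, node, region, data, load_from_disk):
        """Acquire the token for `region`, then update the owner's cached copy."""
        self.access(node, region, load_from_disk)
        self.cache[(node, region)] = data
```

In this model a node that writes a region and then loses the token hands its modified cache contents to the stealing node, which is the behaviour the paragraph above describes.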
SAN clustered file systems
OpenSSI can use SAN based clustered file systems for its root, provided they provide a POSIX compatible file system interface. Currently Lustre and GFS have been tested.
With a clustered file system, each node mounts the file system in parallel and access to the files goes directly from the node to the file system.
NFS
OpenSSI mounts NFS file systems in parallel on each node. Every node accesses the NFS server directly.
Single I/O space
OpenSSI provides cluster-wide access to all I/O devices on the system, with some limitations: it is not possible for a node to mount a block device from another node.
The udev device manager is used to manage the /dev directory. Each node runs its own copy of udev to create the appropriate device nodes in a subdirectory of /dev: /dev/1 for node 1, /dev/2 for node 2 and so on.
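The per-node /dev layout amounts to a simple path convention. As a small illustration, assuming the /dev/N scheme described above, a device path for a given node could be built as follows; the helper name node_device_path and the device name in the example are illustrative only.

```python
from pathlib import PurePosixPath


def node_device_path(node, device, dev_root="/dev"):
    """Build the path of `device` inside the per-node subdirectory of /dev."""
    return str(PurePosixPath(dev_root) / str(node) / device)


# Example: the device "sda1" as created by udev on node 2.
example = node_device_path(2, "sda1")
```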
Single IPC space
OpenSSI provides internode access to all the standard Linux inter-process communication mechanisms: shared memory, semaphores, SYSV message queues, pipes and Unix domain sockets.
To implement cluster-wide shared memory (distributed shared memory), OpenSSI uses the CFS token system. At any one time a memory segment may be readable by one or more nodes, or writable by one node. If a node without write access to a segment tries to write, the segment is marked unreadable on all other nodes and writable on the current node. If a node without read access tries to read a segment, the current value is copied from a node where it is valid, and if the segment was writable there it is marked readable.
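The access rule for shared segments (many readers or a single writer) can be written down as a small state machine. The following is a toy model in Python under the rules stated above, not the OpenSSI code; the class name SharedSegment and its attributes are assumptions made for the sketch.

```python
class SharedSegment:
    """Toy model of a segment that is readable by many nodes or writable by one."""

    def __init__(self, initial, home_node):
        self.copies = {home_node: initial}  # node -> locally valid value
        self.writer = None                  # node with write access, if any

    def read(self, node):
        if node not in self.copies:
            # Copy the current value from any node that holds a valid copy.
            source = next(iter(self.copies))
            self.copies[node] = self.copies[source]
            # A writable segment becomes read-only once it is shared.
            self.writer = None
        return self.copies[node]

    def write(self, node, value):
        # Invalidate every other copy and make this node the single writer.
        self.copies = {node: value}
        self.writer = node
```

The model keeps one invariant: whenever a writer exists, it holds the only valid copy, so no node can read a stale value.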
Cluster IP address
OpenSSI uses LVS (Linux Virtual Server) to provide fault-tolerant, load balanced IP services. Inbound network connections are received by a director node, which redirects them to the least loaded server node. (A node may be both a director and a server.) In the event of director node failure, another director node takes over and the system continues to accept inbound connections.
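The two decisions described above, choosing the least loaded server and replacing a failed director, can be sketched as plain functions. This is a model of the behaviour only and does not use or represent the LVS interfaces; the function names, the load values and the rule that ties go to the lowest node number are assumptions.

```python
def pick_server(loads):
    """Return the server node with the lowest load; ties go to the lowest node number."""
    if not loads:
        raise ValueError("no server nodes available")
    # min() returns the first minimal element, so iterating in sorted
    # node order makes the tie-break deterministic.
    return min(sorted(loads), key=lambda node: loads[node])


def active_director(directors, alive):
    """Return the first director, in priority order, that is still alive."""
    for node in directors:
        if node in alive:
            return node
    raise RuntimeError("no director node available")
```

With directors [1, 2], node 1 accepts connections while it is alive; if it fails, node 2 takes over and continues to hand connections to whichever server pick_server selects.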
Distributions
The OpenSSI software is available for various Linux distributions. The OpenSSI kernel is distribution independent, but various distribution specific Linux user level systems need to be modified, for example the init process and the system startup scripts.
Currently the supported distributions are:
- Fedora Core 3
- Debian Sarge
Work is in progress to port OpenSSI to Debian Etch and Lenny.
History
The origins of OpenSSI date back to the early 1980s, when the LOCUS distributed operating system was developed at UCLA. The team that developed LOCUS went on to form the Locus Computing Corporation and produced various versions of the LOCUS technology under several names, culminating in the development of the UnixWare NonStop Clusters product at Tandem Computers, which had by that time acquired the LOCUS team and rights to the technology. NonStop Clusters for UnixWare was commercialized by SCO as an add-on for UnixWare. When SCO stopped selling NonStop Clusters, the former Locus team, now working for Compaq (which had acquired Tandem in the interim), ported the NonStop Clusters code to Linux and released it as open source. The team at Compaq continued to develop the system, now called OpenSSI, for some time after HP acquired Compaq. OpenSSI is currently developed by an independent team.