DRBD
DRBD is a distributed storage system for the GNU/Linux platform. It consists of a kernel module, several userspace management applications and some shell scripts, and is normally used on high availability (HA) clusters. DRBD bears similarities to RAID 1, except that it runs over a network.

The name DRBD refers both to the software (the kernel module and associated userspace tools) and to the specific logical block devices managed by that software. The terms "DRBD device" and "DRBD block device" are also often used for the latter.

It is free software released under the terms of the GNU General Public License version 2.

DRBD is part of the Lisog open source stack initiative.

Mode of operation

DRBD layers logical block devices (conventionally named /dev/drbdX, where X is the device minor number) over existing local block devices on participating cluster nodes. Writes to the primary node are transferred to the lower-level block device and simultaneously propagated to the secondary node. The secondary node then transfers data to its corresponding lower-level block device. All read I/O is performed locally.
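
The write path described above can be illustrated with a toy model. This is a minimal sketch, not DRBD's implementation: the class names `Node` and `ReplicatedDevice`, the node names, and the 4-byte block size are all assumptions made for illustration, and the "network transfer" is an ordinary in-process method call.

```python
class Node:
    """One cluster node; its lower-level block device is modelled as a bytearray."""

    def __init__(self, name, num_blocks, block_size=4):
        self.name = name
        self.block_size = block_size
        self.backing = bytearray(num_blocks * block_size)

    def write_local(self, block, data):
        # Write exactly one block to the local lower-level device.
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        start = block * self.block_size
        self.backing[start:start + self.block_size] = data

    def read_local(self, block):
        start = block * self.block_size
        return bytes(self.backing[start:start + self.block_size])


class ReplicatedDevice:
    """Hypothetical stand-in for a replicated device layered over two nodes."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block, data):
        # The write goes to the primary's local device and is also
        # propagated to the secondary (a plain call stands in for the network).
        self.primary.write_local(block, data)
        self.secondary.write_local(block, data)

    def read(self, block):
        # Reads are served from the local (primary) device only.
        return self.primary.read_local(block)


primary = Node("alpha", num_blocks=8)
secondary = Node("bravo", num_blocks=8)
dev = ReplicatedDevice(primary, secondary)

dev.write(2, b"DRBD")
print(dev.read(2))                           # b'DRBD'
print(primary.backing == secondary.backing)  # True
```

After the write, both backing stores hold identical data, while the read never touches the secondary.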

Should the primary node fail, a cluster management process promotes the secondary node to a primary state. This transition may require a subsequent verification of the integrity of the file system stacked on top of DRBD, by way of a filesystem check or a journal replay. When the failed ex-primary node returns, the system may (or may not) raise it to primary level again, after device data resynchronization. DRBD's synchronization algorithm is efficient in the sense that only those blocks that were changed during the outage must be resynchronized, rather than the device in its entirety.
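
The idea of resynchronizing only changed blocks can be sketched as follows. This is an illustrative model under stated assumptions: the set `dirty` stands in for whatever change tracking DRBD actually uses (the text above does not describe it), and `write_block`, `resync`, and the 4-byte block size are hypothetical names and values.

```python
BLOCK_SIZE = 4  # assumed block size, for illustration only


def write_block(device, block, data):
    """Write one block into a bytearray that models a block device."""
    start = block * BLOCK_SIZE
    device[start:start + BLOCK_SIZE] = data


def resync(source, target, dirty_blocks):
    """Copy only the blocks marked dirty from source to target.

    Returns the number of blocks copied and clears the dirty set.
    """
    copied = 0
    for block in sorted(dirty_blocks):
        start = block * BLOCK_SIZE
        target[start:start + BLOCK_SIZE] = source[start:start + BLOCK_SIZE]
        copied += 1
    dirty_blocks.clear()
    return copied


# Two devices of 16 blocks each, identical before the outage.
survivor = bytearray(16 * BLOCK_SIZE)
returning = bytearray(16 * BLOCK_SIZE)
dirty = set()

# During the outage only the surviving node receives writes;
# each written block number is recorded as dirty.
for block, data in [(3, b"aaaa"), (7, b"bbbb"), (3, b"cccc")]:
    write_block(survivor, block, data)
    dirty.add(block)

blocks_copied = resync(survivor, returning, dirty)
print(blocks_copied)          # 2
print(survivor == returning)  # True
```

Three writes touched two distinct blocks, so two of the sixteen blocks are copied rather than the whole device.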

DRBD is often deployed together with the Heartbeat cluster manager, although it also integrates with other cluster management frameworks. It integrates with virtualization solutions such as Xen, and may be used both below and on top of the Linux LVM stack.

DRBD version 8, released in January 2007, introduced support for load-balancing configurations, allowing both nodes to access a particular DRBD device in read/write mode with shared storage semantics. Such a configuration requires the use of a distributed lock manager.
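
Why concurrent writers need a lock manager can be shown with a loose, single-machine analogy. In this sketch two threads play the role of the two nodes and Python's `threading.Lock` stands in for a distributed lock manager; it is not how a real distributed lock manager works, and the names `shared_block` and `node_worker` are hypothetical.

```python
import threading

# A single shared "block" that both nodes update with read-modify-write cycles.
shared_block = {"counter": 0}

# Stand-in for a distributed lock manager (assumption: one in-process lock).
lock = threading.Lock()


def node_worker(updates):
    """Simulate one node performing read-modify-write updates on shared data."""
    for _ in range(updates):
        with lock:  # serialize access so no update is lost
            value = shared_block["counter"]
            shared_block["counter"] = value + 1


threads = [threading.Thread(target=node_worker, args=(10000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_block["counter"])  # 20000
```

With the lock held around each read-modify-write cycle, every update is preserved; without such coordination, interleaved updates from the two writers could overwrite each other.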

Advantages over shared cluster storage

Conventional computer cluster systems typically use some sort of shared storage for data being used by cluster resources. This approach has a number of disadvantages, which DRBD may help offset:
  • Shared storage resources usually introduce a single point of failure in the cluster setup: while each of the cluster nodes may fail without causing service interruption, storage failure almost inevitably causes service downtime. DRBD avoids this single point of failure because the cluster resource data is replicated rather than shared.
  • Shared storage resources are particularly sensitive to split brain situations, where both cluster nodes are still alive but lose all network connectivity between them. In such a scenario, each cluster node will assume that it is the only surviving node in the cluster, and take over all cluster resources. This may lead to potentially disastrous results when both nodes, for example, mount and write to file systems concurrently. Cluster administrators must thus carefully implement node fencing policies to avoid this. DRBD substantially mitigates this problem by keeping two replicated sets of data instead of one shared set.
  • Shared storage resources must typically be addressed over a SAN or NAS, which creates some overhead in read I/O. In DRBD that overhead is greatly reduced as all read operations are carried out locally.
  • Shared storage systems are usually expensive and consume more rack space (2U or more) and power. DRBD allows an HA setup to be built with only two machines.

Applications

Operating within the Linux kernel's block layer, DRBD is essentially workload agnostic. A DRBD device can be used as the basis of:
  • a conventional file system (this is the canonical example),
  • a shared disk file system such as GFS or OCFS2,
  • another logical block device (as used in LVM, for example),
  • any application requiring direct access to a block device.


DRBD-based clusters are often employed for adding synchronous replication and high availability to file servers, relational databases (such as MySQL), and many other workloads.

Inclusion in Linux kernel

DRBD's authors originally submitted the software to the Linux kernel community in July 2007, for possible future inclusion of DRBD in the "vanilla" (standard, unmodified) Linux kernel. After a lengthy review and several discussions, Linus Torvalds agreed to have DRBD as part of the official Linux kernel. DRBD was merged on 8 December 2009 during the "merge window" for Linux kernel version 2.6.33.

See also

  • Network block device (NBD), a device node whose content is provided by a remote machine
  • Highly Available STorage (HAST), which provides DRBD-like functionality in FreeBSD

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.