RAID 5 write hole
The RAID 5 write hole is a known data corruption issue in RAID 5, a redundant hard disk configuration. It is caused by non-atomic writes of data and parity.
Issue
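Imagine the following RAID 5 configuration with two data devices and a parity device. RAID 5 actually rotates the parity block across the devices from stripe to stripe; a fixed parity column is used here for simplicity. Each row is one stripe, and the parity bit is 1 when the sum of the data bits is odd:
Device 1 | Device 2 | Parity device |
---|---|---|
1 | 0* | 1 (stripe sum is odd) |
0* | 0 | 0 (stripe sum is even) |
1 | 1 | 0 (stripe sum is even) |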
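The parity column of this table can be reproduced with a short sketch. It assumes single-bit blocks and XOR (even) parity; the function name `parity` is illustrative and not part of any RAID implementation.

```python
from functools import reduce
from operator import xor


def parity(data_bits):
    """Return the XOR of the data bits: 1 if their sum is odd, else 0."""
    return reduce(xor, data_bits, 0)


# Data bits (Device 1, Device 2) of the three stripes in the table.
stripes = [(1, 0), (0, 0), (1, 1)]
parities = [parity(s) for s in stripes]
print(parities)  # [1, 0, 0]
```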
Suppose the array is being written to when an adverse event occurs, such as a power outage or sudden disk failure, and that there are outstanding writes to the blocks marked with an asterisk (*) above. RAID 5 requires every data write to be accompanied by a matching parity update. Thus the following changes would be made:
Device 1 | Device 2 | Parity device |
---|---|---|
1 | 1 | 0 (stripe sum is even) |
1 | 0 | 1 (stripe sum is odd) |
1 | 1 | 0 (stripe sum is even) |
However, the write is interrupted partway through, so the parity is incorrect. This may affect a single stripe or several, depending on the implementation of the RAID driver and the underlying hardware (for example, because of out-of-order caching):
Device 1 | Device 2 | Parity device |
---|---|---|
1 | 1 | 1 (stripe sum is odd - WRONG) |
0 | 0 | 1 (stripe sum is odd - WRONG) |
1 | 1 | 0 (stripe sum is even) |
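The interrupted state can be simulated with a minimal sketch, assuming single-bit blocks, XOR parity, and a data write and parity write issued as two separate steps per stripe. The helper `inconsistent_stripes` is hypothetical; it plays the role of a parity check over the whole array.

```python
def parity(data_bits):
    """XOR of the data bits: 1 if their sum is odd, else 0."""
    p = 0
    for bit in data_bits:
        p ^= bit
    return p


def inconsistent_stripes(array):
    """Return indices of stripes whose stored parity does not match the data."""
    return [i for i, (*data, stored) in enumerate(array) if parity(data) != stored]


# Each stripe is [device 1, device 2, parity], starting from a consistent state.
array = [[1, 0, 1], [0, 0, 0], [1, 1, 0]]

# Stripe 0: the data block on device 2 is written, then the parity write is lost.
array[0][1] = 1
# Stripe 1: the parity block is written, then the data write to device 1 is lost.
array[1][2] = 1

bad = inconsistent_stripes(array)
print(bad)  # [0, 1]
```

Either ordering of the two writes leaves a window in which the stripe is inconsistent, which is why the update would have to be atomic to close the hole.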
The error may remain undetected indefinitely: normal reads return the data blocks without consulting parity, and application-layer redundancy or other practices may mask it. The main problem arises when a disk in the array fails. The missing blocks of the affected stripes are then reconstructed from the incorrect parity, so the array can be rebuilt with incorrect information. The flipped bits can be anywhere on the virtual RAID device: in a completely unrelated file, in the filesystem metadata, and so on. This is known as the "RAID 5 write hole".
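The effect of a rebuild on an inconsistent stripe can be sketched as follows, assuming single-bit blocks and XOR reconstruction (a lost block is the XOR of all surviving blocks in its stripe). The function `rebuild` and the stripe layout are illustrative only.

```python
def rebuild(stripe, failed):
    """Reconstruct the block at index `failed` as the XOR of the surviving blocks."""
    value = 0
    for i, bit in enumerate(stripe):
        if i != failed:
            value ^= bit
    return value


# One stripe as [device 1, device 2, parity] after an interrupted write:
# device 2 holds the new data (1), but the parity is stale (1 instead of 0).
stripe = [1, 1, 1]

# Device 1 was never the target of a write and held 1, yet it is rebuilt as 0.
rebuilt_device_1 = rebuild(stripe, failed=0)
print(rebuilt_device_1)  # 0
```

In this sketch the corrupted block belongs to a device that was not being written at all, which is how the damage can land in unrelated data.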
Potential causes
- Loss of power: power is lost partway through the update, so the data write and the parity update cannot both complete.
- Disk failure: a disk fails partway through the update, so the data write and the parity update cannot both complete even though power is available.
Mitigation
- A battery-backed cache on a dedicated hardware RAID controller, and battery-backed hard disks (perhaps via the RAID controller), can mitigate issues with loss of power by allowing the disk to finish writing. This works because, with a hardware RAID controller, parity is not handled at the OS level. However, this can be significantly expensive.
- An uninterruptible power supply for the entire computer will also prevent issues with loss of power. However, the computer must detect when it is on battery power (by interfacing with the UPS, which must have such a feature) and shut down before the power is lost. This can also be expensive.
- A filesystem that is fully resistant to bit flips can prevent data corruption (if the write hole produces sufficiently random bit-flip errors).
- Using a RAID type whose design does not leave data and parity inconsistent after an interrupted write can avoid this problem.