How to add a Hot Spare Volume to an existing mdadm software RAID array
Having a Hot Spare significantly improves the resilience of our data in a RAID array. When a single disk fails, the Hot Spare automatically takes the place of the faulty drive and the array starts rebuilding onto it. This buys us time to swap the faulty drive for a new one without leaving the array degraded in the meantime.
In the scenario below we have an example CentOS 7 system installed on top of RAID 1 (mirror) using mdadm software RAID. The array was created by the Anaconda installer during OS installation; for more details refer to:
CentOS 7 Installation with LVM RAID 1 – Mirroring.
The setup consists of two physical disks, /dev/sda and /dev/sdb, and two MD arrays: /dev/md127 (for the standard /boot partition) and /dev/md126 (for the LVM Physical Volume):
    [root@compute ~]# cat /proc/mdstat
    Personalities : [raid1]
    md126 : active raid1 sda2[0] sdb2[1]
          142191616 blocks super 1.2 [2/2] [UU]
          bitmap: 1/2 pages [4KB], 65536KB chunk

    md127 : active raid1 sda1[0] sdb1[1]
          1047552 blocks super 1.2 [2/2] [UU]
          bitmap: 0/1 pages [0KB], 65536KB chunk

    unused devices: <none>

    [root@compute ~]# lsblk
    NAME              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sda                 8:0    0 136.8G  0 disk
    ├─sda1              8:1    0     1G  0 part
    │ └─md127           9:127  0  1023M  0 raid1 /boot
    └─sda2              8:2    0 135.7G  0 part
      └─md126           9:126  0 135.6G  0 raid1
        ├─centos-root 253:0    0    50G  0 lvm   /
        └─centos-swap 253:1    0    16G  0 lvm   [SWAP]
    sdb                 8:16   0 136.8G  0 disk
    ├─sdb1              8:17   0     1G  0 part
    │ └─md127           9:127  0  1023M  0 raid1 /boot
    └─sdb2              8:18   0 135.7G  0 part
      └─md126           9:126  0 135.6G  0 raid1
        ├─centos-root 253:0    0    50G  0 lvm   /
        └─centos-swap 253:1    0    16G  0 lvm   [SWAP]
    sdc                 8:32   0 136.8G  0 disk
    sdd                 8:48   0 232.9G  0 disk
Now we want to add /dev/sdc as a Hot Spare volume.
1. Copy the partition table from an active disk in the RAID array (e.g. /dev/sda) to the spare disk
[root@compute ~]# sfdisk -d /dev/sda | sfdisk /dev/sdc
The spare disk /dev/sdc should now have the same partition table as /dev/sda and /dev/sdb:
    [root@compute ~]# lsblk
    NAME              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sda                 8:0    0 136.8G  0 disk
    ├─sda1              8:1    0     1G  0 part
    │ └─md127           9:127  0  1023M  0 raid1 /boot
    └─sda2              8:2    0 135.7G  0 part
      └─md126           9:126  0 135.6G  0 raid1
        ├─centos-root 253:0    0    50G  0 lvm   /
        └─centos-swap 253:1    0    16G  0 lvm   [SWAP]
    sdb                 8:16   0 136.8G  0 disk
    ├─sdb1              8:17   0     1G  0 part
    │ └─md127           9:127  0  1023M  0 raid1 /boot
    └─sdb2              8:18   0 135.7G  0 part
      └─md126           9:126  0 135.6G  0 raid1
        ├─centos-root 253:0    0    50G  0 lvm   /
        └─centos-swap 253:1    0    16G  0 lvm   [SWAP]
    sdc                 8:32   0 136.8G  0 disk
    ├─sdc1              8:33   0     1G  0 part
    └─sdc2              8:34   0 135.7G  0 part
    sdd                 8:48   0 232.9G  0 disk
2. Add the spare partitions /dev/sdc1 and /dev/sdc2 to the corresponding arrays /dev/md127 and /dev/md126
    [root@compute ~]# mdadm --add /dev/md127 /dev/sdc1
    mdadm: added /dev/sdc1
    [root@compute ~]# mdadm --add /dev/md126 /dev/sdc2
    mdadm: added /dev/sdc2
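Steps 1 and 2 can be wrapped in a small script. The following is a minimal sketch rather than part of the original procedure: the function name `add_hot_spare` and the `DRY_RUN` variable are made up for illustration, and the mapping of partition 1 to /dev/md127 and partition 2 to /dev/md126 is simply the layout of this example system. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper wrapping steps 1 and 2 of this article.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them as root.
# Assumes /dev/sdX-style disk names, where partitions are named <disk>1 and <disk>2.
DRY_RUN="${DRY_RUN:-1}"

add_hot_spare() {
    local src="$1" spare="$2"

    if [ "$DRY_RUN" = "1" ]; then
        echo "sfdisk -d $src | sfdisk $spare"
        echo "mdadm --add /dev/md127 ${spare}1"
        echo "mdadm --add /dev/md126 ${spare}2"
        return 0
    fi

    # Step 1: copy the partition table from an active member to the spare disk.
    # Step 2: add each new partition to the array built from the same partition number.
    sfdisk -d "$src" | sfdisk "$spare" &&
        mdadm --add /dev/md127 "${spare}1" &&
        mdadm --add /dev/md126 "${spare}2"
}

add_hot_spare /dev/sda /dev/sdc
```

Reviewing the printed commands before rerunning with `DRY_RUN=0` is a cheap safeguard, since `sfdisk` overwrites the partition table of the target disk.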
3. Verify the configuration
The Hot Spare partitions /dev/sdc1 and /dev/sdc2 are now listed in the corresponding arrays with the (S) mark:
    [root@compute ~]# cat /proc/mdstat
    Personalities : [raid1]
    md126 : active raid1 sdc2[2](S) sda2[0] sdb2[1]
          142191616 blocks super 1.2 [2/2] [UU]
          bitmap: 0/2 pages [0KB], 65536KB chunk

    md127 : active raid1 sdc1[2](S) sda1[0] sdb1[1]
          1047552 blocks super 1.2 [2/2] [UU]
          bitmap: 0/1 pages [0KB], 65536KB chunk

    unused devices: <none>
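The same check can be scripted by looking for the (S) flag in /proc/mdstat. This is a rough sketch: the function name `list_spares` is an assumption for illustration, and it only looks at arrays reported as active.

```shell
#!/usr/bin/env bash
# List hot spares as "<array> <device>" pairs by scanning for the "(S)" flag.
# Reads /proc/mdstat by default; pass a file name to check saved output instead.
list_spares() {
    awk '
        # Array lines look like: md126 : active raid1 sdc2[2](S) sda2[0] sdb2[1]
        $2 == ":" && $3 == "active" {
            for (i = 5; i <= NF; i++) {
                if ($i ~ /\(S\)$/) {
                    dev = $i
                    sub(/\[.*/, "", dev)   # drop the "[2](S)" suffix
                    print $1, dev
                }
            }
        }
    ' "${1:-/proc/mdstat}"
}
```

On the system above, calling `list_spares` with no arguments should print `md126 sdc2` and `md127 sdc1`; empty output means no array currently has a spare.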
Spare volumes can also be verified using the mdadm command:
    [root@compute ~]# mdadm --detail /dev/md127 | head -n 20
    /dev/md127:
               Version : 1.2
         Creation Time : Mon Dec 28 23:04:15 2020
            Raid Level : raid1
            Array Size : 1047552 (1023.00 MiB 1072.69 MB)
         Used Dev Size : 1047552 (1023.00 MiB 1072.69 MB)
          Raid Devices : 2
         Total Devices : 3
           Persistence : Superblock is persistent

         Intent Bitmap : Internal

           Update Time : Tue Dec 29 19:30:42 2020
                 State : clean
        Active Devices : 2
       Working Devices : 3
        Failed Devices : 0
         Spare Devices : 1

    Consistency Policy : bitmap

    [root@compute ~]# mdadm --detail /dev/md126 | head -n 20
    /dev/md126:
               Version : 1.2
         Creation Time : Mon Dec 28 23:04:23 2020
            Raid Level : raid1
            Array Size : 142191616 (135.60 GiB 145.60 GB)
         Used Dev Size : 142191616 (135.60 GiB 145.60 GB)
          Raid Devices : 2
         Total Devices : 3
           Persistence : Superblock is persistent

         Intent Bitmap : Internal

           Update Time : Tue Dec 29 23:10:20 2020
                 State : clean
        Active Devices : 2
       Working Devices : 3
        Failed Devices : 0
         Spare Devices : 1

    Consistency Policy : bitmap
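For monitoring scripts it can be handy to pull a single field out of the `mdadm --detail` output shown above. The sketch below assumes the "Label : value" format of that output; the helper names `detail_field` and `has_spare` are invented for illustration.

```shell
#!/usr/bin/env bash
# Print the value of one "Label : value" field from `mdadm --detail` output on stdin.
detail_field() {
    awk -F ' : ' -v key="$1" '
        { label = $1; gsub(/^[ \t]+|[ \t]+$/, "", label) }   # trim the padded label
        label == key { print $2; exit }
    '
}

# Succeeds only if the given array reports at least one spare device.
# Needs root and a real array, e.g.: has_spare /dev/md127 && echo "spare present"
has_spare() {
    local n
    n="$(mdadm --detail "$1" | detail_field "Spare Devices")"
    [ "${n:-0}" -ge 1 ]
}
```

With the output above, `mdadm --detail /dev/md127 | detail_field "Spare Devices"` should print `1`.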
RAID
Redundant Array of Independent Disks (RAID) is a storage technology that combines multiple disk drive components (typically disk drives or partitions thereof) into a logical unit. Depending on the RAID implementation, this logical unit can be a file system or an additional transparent layer that can hold several partitions. Data is distributed across the drives in one of several ways called RAID levels, depending on the level of redundancy and performance required. The RAID level chosen can thus prevent data loss in the event of a hard disk failure, increase performance or be a combination of both.
This article explains how to create/manage a software RAID array using mdadm.
RAID levels
Despite redundancy implied by most RAID levels, RAID does not guarantee that data is safe. A RAID will not protect data if there is a fire, the computer is stolen or multiple hard drives fail at once. Furthermore, installing a system with RAID is a complex process that may destroy data.
Standard RAID levels
There are many different levels of RAID; listed below are the most common.
RAID 0
Uses striping to combine disks. Even though it does not provide redundancy, it is still considered RAID. It does, however, provide a big speed benefit. If the speed increase is worth the possibility of data loss (for swap partition for example), choose this RAID level. On a server, RAID 1 and RAID 5 arrays are more appropriate. The size of a RAID 0 array block device is the size of the smallest component partition times the number of component partitions.
RAID 1
The most straightforward RAID level: straight mirroring. As with other RAID levels, it only makes sense if the partitions are on different physical disk drives. If one of those drives fails, the block device provided by the RAID array will continue to function as normal. The example will be using RAID 1 for everything except swap and temporary data. Please note that with a software implementation, the RAID 1 level is the only option for the boot partition, because bootloaders reading the boot partition do not understand RAID, but a RAID 1 component partition can be read as a normal partition. The size of a RAID 1 array block device is the size of the smallest component partition.
RAID 5
Requires 3 or more physical drives, and provides the redundancy of RAID 1 combined with the speed and size benefits of RAID 0. RAID 5 uses striping, like RAID 0, but also stores parity blocks distributed across each member disk. In the event of a failed disk, these parity blocks are used to reconstruct the data on a replacement disk. RAID 5 can withstand the loss of one member disk.
Note: RAID 5 is a common choice due to its combination of speed and data redundancy. The caveat is that if one drive fails and a second drive fails before the first has been replaced and the array rebuilt, all data will be lost. Furthermore, with modern disk sizes and expected unrecoverable read error (URE) rates on consumer disks, the rebuild of a 4 TiB array is expected (i.e. higher than 50% chance) to have at least one URE. Because of this, RAID 5 is no longer advised by the storage industry.
RAID 6
Requires 4 or more physical drives, and provides the benefits of RAID 5 but with security against two drive failures. RAID 6 also uses striping, like RAID 5, but stores two distinct parity blocks distributed across each member disk. In the event of a failed disk, these parity blocks are used to reconstruct the data on a replacement disk. RAID 6 can withstand the loss of two member disks. The robustness against unrecoverable read errors is somewhat better, because the array still has parity blocks when rebuilding from a single failed drive. However, given the overhead, RAID 6 is costly and in most settings RAID 10 in far2 layout (see below) provides better speed benefits and robustness, and is therefore preferred.
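The size rules for the standard levels can be captured in a few lines of shell arithmetic. This is only a sketch: the RAID 0 and RAID 1 formulas come from the descriptions above, while the RAID 5 and RAID 6 formulas (one or two members' worth of space lost to parity) are the usual textbook figures and are not stated in this article. The function name `raid_capacity` is made up for illustration.

```shell
#!/usr/bin/env bash
# Usable capacity of an array, given the RAID level, the number of member
# partitions and the size of the smallest member (in any unit).
raid_capacity() {
    local level="$1" n="$2" smallest="$3"
    case "$level" in
        0) echo $(( n * smallest )) ;;          # striping, no redundancy
        1) echo "$smallest" ;;                  # every member holds a full copy
        5) echo $(( (n - 1) * smallest )) ;;    # one member's worth of parity
        6) echo $(( (n - 2) * smallest )) ;;    # two members' worth of parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

raid_capacity 1 2 135    # two mirrored 135 GiB partitions: prints 135
raid_capacity 5 4 1000   # four 1000 GB members: prints 3000
```

The function does not check the minimum number of drives for each level, so nonsensical inputs such as `raid_capacity 6 2 100` are left to the caller to avoid.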
Nested RAID levels
RAID 1+0
RAID1+0 is a nested RAID that combines two of the standard levels of RAID to gain performance and additional redundancy. It is commonly referred to as RAID10, however, Linux MD RAID10 is slightly different from simple RAID layering, see below.
RAID 10
RAID10 under Linux is built on the concepts of RAID1+0, however, it implements this as a single layer, with multiple possible layouts.
The near X layout on Y disks repeats each chunk X times on Y/2 stripes, but does not need X to divide Y evenly. The chunks are placed on almost the same location on each disk they are mirrored on, hence the name. It can work with any number of disks, starting at 2. Near 2 on 2 disks is equivalent to RAID1, near 2 on 4 disks to RAID1+0.
The far X layout on Y disks is designed to offer striped read performance on a mirrored array. It accomplishes this by dividing each disk in two sections, say front and back, and what is written to disk 1 front is mirrored in disk 2 back, and vice versa. This has the effect of being able to stripe sequential reads, which is where RAID0 and RAID5 get their performance from. The drawback is that sequential writing has a very slight performance penalty because of the distance the disk needs to seek to the other section of the disk to store the mirror. RAID10 in far 2 layout is, however, preferable to layered RAID1+0 and RAID5 whenever read speeds are of concern and availability / redundancy is crucial. However, it is still not a substitute for backups. See the Wikipedia page for more information.
Warning: mdadm cannot reshape arrays in far X layouts which means once the array is created, you will not be able to mdadm --grow it. For example, if you have a 4x1TB RAID10 array and you want to switch to 2TB disks, your usable capacity will remain 2TB. For such use cases, stick to near X layouts.
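To make the near X layout described above more concrete, the sketch below prints which chunk number lands on which disk. It is a simplified model that assumes the commonly documented placement (the X copies of a chunk are written to consecutive disk slots, wrapping to the next row), not code taken from mdadm; the function name `near_layout` is made up for illustration.

```shell
#!/usr/bin/env bash
# Print chunk numbers per disk for a "near" layout: one output line per row,
# one column per disk. Arguments: copies (X), disks (Y), rows to print.
near_layout() {
    local copies="$1" disks="$2" rows="$3" slot=0 row=0 col line
    while [ "$row" -lt "$rows" ]; do
        line="" col=0
        while [ "$col" -lt "$disks" ]; do
            # Slots are filled left to right; every chunk occupies `copies` slots.
            line="${line:+$line }$(( slot / copies ))"
            slot=$(( slot + 1 ))
            col=$(( col + 1 ))
        done
        echo "$line"
        row=$(( row + 1 ))
    done
}

near_layout 2 4 2   # near 2 on 4 disks
```

For near 2 on 4 disks this prints `0 0 1 1` and `2 2 3 3`, i.e. two mirrored pairs striped together as in RAID1+0, while `near_layout 2 3 2` prints `0 0 1` and `1 2 2`, showing how an odd number of disks is handled.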
RAID level comparison
RAID level | Data redundancy | Physical drive utilization | Read performance | Write performance | Min drives |
---|---|---|---|---|---|
0 | No | 100% | nX |
Best; on par with RAID0 but redundant