Linux LVM with SSD

SSD cache device to a software RAID5 using LVM2


In this article we show how to set up a software RAID5 with an NVMe SSD cache using LVM2.

The goal:
Cache a RAID5 consisting of three 8T hard drives with a single 1T NVMe SSD drive. Only reads are accelerated, i.e. write-through mode is enabled (writes always reach the slow RAID5, so a cache failure cannot lose data).
Our setup:
Our setup:

  • 1 NVMe SSD disk, Samsung 1T. It will be used as a write-through cache device (you may use write-back instead, but only if you do not care about the data when the cache device fails)!
  • 3 hard disk drives, 8T each, grouped in a RAID5 for redundancy (a quick device check follows this list).
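
Before you start, it is worth confirming that the kernel actually sees all four devices. A minimal check with lsblk (the device names and sizes are assumptions matching our setup; yours may differ):

lsblk -o NAME,SIZE,TYPE,MODEL
# expect the three 8T drives (sda, sdb, sdc) and the 1T NVMe device (nvme0n1)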

STEP 1) Install lvm2 and enable the lvm2 service

Only this step differs between Linux distributions. We include instructions for three of them:
Ubuntu 16+:

sudo apt update && sudo apt upgrade -y
sudo apt install lvm2 -y
sudo systemctl enable lvm2-lvmetad
sudo systemctl start lvm2-lvmetad

CentOS 7:

yum update
yum install -y lvm2
systemctl enable lvm2-lvmetad
systemctl start lvm2-lvmetad

Gentoo:

emerge --sync
emerge -v sys-fs/lvm2
/etc/init.d/lvm start
rc-update add lvm default
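
Regardless of the distribution, a quick way to verify LVM2 is installed is to ask for its version. Note that on newer distributions the lvm2-lvmetad service has been deprecated and may not exist at all, so treat the status check below as optional:

lvm version
systemctl status lvm2-lvmetad    # optional; absent on recent LVM2 releases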

STEP 2) Add the four partitions to LVM2.

Three partitions come from the hard drives and one from the NVMe SSD (the cache device). We created a single partition on the NVMe SSD occupying 100% of the space (you may use only 90% of the space for better SSD endurance and, in many cases, performance).
The devices are “/dev/sda5”, “/dev/sdb5”, “/dev/sdc5” (in case you wonder why we use /dev/sd[X]5: the first four partitions are occupied by the grub, boot, swap and root partitions of our CentOS 7 installation) and “/dev/nvme0n1p1”:

[root@srv ~]# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA HGST HUH721008AL (scsi)
Disk /dev/sda: 8002GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name     Flags
 4      1049kB  2097kB  1049kB                        bios_grub
 1      2097kB  34.4GB  34.4GB                        raid
 2      34.4GB  34.9GB  537MB                         raid
 3      34.9GB  88.6GB  53.7GB                        raid
 5      88.6GB  8002GB  7913GB               primary  raid

(parted) q
[root@srv ~]# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA HGST HUH721008AL (scsi)
Disk /dev/sdb: 8002GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name     Flags
 4      1049kB  2097kB  1049kB                        bios_grub
 1      2097kB  34.4GB  34.4GB                        raid
 2      34.4GB  34.9GB  537MB                         raid
 3      34.9GB  88.6GB  53.7GB                        raid
 5      88.6GB  8002GB  7913GB               primary  raid

(parted) q
[root@srv ~]# parted /dev/sdc
GNU Parted 3.2
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA HGST HUH721008AL (scsi)
Disk /dev/sdc: 8002GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name     Flags
 4      1049kB  2097kB  1049kB  xfs                   bios_grub
 1      2097kB  34.4GB  34.4GB                        raid
 2      34.4GB  34.9GB  537MB                         raid
 3      34.9GB  88.6GB  53.7GB                        raid
 5      88.6GB  8002GB  7913GB               primary  raid

(parted) q
[root@srv ~]# parted /dev/nvme0n1
GNU Parted 3.2
Using /dev/nvme0n1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: NVMe Device (nvme)
Disk /dev/nvme0n1: 1024GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  1024GB  1024GB               primary

(parted) q
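
If you still need to create the single partition on the NVMe device, a minimal sketch with parted is shown below (an assumption, not taken from the output above; it wipes the device, and you may change 100% to 90% to over-provision the SSD):

parted --script /dev/nvme0n1 mklabel gpt
parted --script /dev/nvme0n1 mkpart primary 0% 100%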

Add partitions to the LVM2 (as physical volumes) and create an LVM Volume Group.

[root@srv ~]# pvcreate /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/nvme0n1p1
  Physical volume "/dev/sda5" successfully created.
  Physical volume "/dev/sdb5" successfully created.
  Physical volume "/dev/sdc5" successfully created.
  Physical volume "/dev/nvme0n1p1" successfully created.
[root@srv ~]# pvdisplay
  "/dev/nvme0n1p1" is a new physical volume of "

You may add all the devices (i.e. the partitions) in a single pvcreate command. pvdisplay displays meta information for the physical volumes (the partitions we’ve just added).
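
A more compact overview than pvdisplay is available with pvs; the column list below is only a suggestion:

pvs -o pv_name,vg_name,pv_size,pv_free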

And then create the LVM Volume Group device. The four physical volumes must be in the same group.

[root@logs ~]# vgcreate VG_storage /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/nvme0n1p1
  Volume group "VG_storage" successfully created
[root@logs ~]# vgdisplay
  --- Volume group ---
  VG Name               VG_storage
  System ID
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               22.52 TiB
  PE Size               4.00 MiB
  Total PE              5903990
  Alloc PE / Size       0 / 0
  Free  PE / Size       5903990 / 22.52 TiB
  VG UUID               PSqwF4-3WvJ-0EEX-Lb2x-MiAG-25Q0-p2a7Ap

The volume group was created successfully and you may verify it with vgdisplay.
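
As with the physical volumes, vgs gives a shorter summary than vgdisplay (again, the column list is only a suggestion):

vgs -o vg_name,pv_count,lv_count,vg_size,vg_free VG_storage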

STEP 3) Create the RAID5 device.

First, create the RAID5 device using the three slow hard disk drives and their partitions “/dev/sda5”, “/dev/sdb5” and “/dev/sdc5”. We want all the available space on the slow disks in one logical storage device, so we use “100%FREE”. The logical volume is named “lv_slow”, hinting that it consists of slow disks.

[root@srv ~]# lvcreate --type raid5 -l 100%FREE -I 512 -n lv_slow VG_storage /dev/sda5 /dev/sdb5 /dev/sdc5
  Logical volume "lv_slow" created.
[root@srv ~]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/VG_storage/lv_slow
  LV Name                lv_slow
  VG Name                VG_storage
  LV UUID                5gdDBR-1h7N-WA6j-20Dn-IUQR-Ry61-dcQdoG
  LV Write Access        read/write
  LV Creation host, time logs.example.com, 2019-11-23 09:35:40 +0000
  LV Status              available
  # open                 0
  LV Size                14.39 TiB
  Current LE             3773198
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     6144
  Block device           253:6

“-I 512” sets the RAID5 stripe (chunk) size to 512 KiB.
lvdisplay shows meta information for the newly created logical volume. Because it is a RAID5, the usable space is “three disks minus one”, i.e. 14.39 TiB (out of the 22.52 TiB in the volume group, which also includes the NVMe SSD).
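
The RAID5 starts an initial resync in the background right after creation. Its progress can be followed with lvs (a sketch; the report field names assume a reasonably recent LVM2):

lvs -a -o lv_name,segtype,sync_percent VG_storage
# sync_percent corresponds to the Cpy%Sync column shown later in this article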

STEP 4) Create the cache pool logical device and then convert the slow logical volume to use the newly created cache pool logical device.

First, create the cache pool logical volume with the name “lv_cache” (hinting it is the fast SSD device). Again, we use 100% of the available space on the physical volume (the NVMe partition we created).

[root@CentOS-82-64-minimal ~]# lvcreate --type cache-pool -l 100%FREE -c 1M --cachemode writethrough -n lv_cache VG_storage /dev/nvme0n1p1
  Logical volume "lv_cache" created.
[root@CentOS-82-64-minimal ~]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/VG_storage/lv_slow
  LV Name                lv_slow
  VG Name                VG_storage
  LV UUID                5gdDBR-1h7N-WA6j-20Dn-IUQR-Ry61-dcQdoG
  LV Write Access        read/write
  LV Creation host, time logs.example.com, 2019-11-23 09:35:40 +0000
  LV Status              available
  # open                 0
  LV Size                14.39 TiB
  Current LE             3773198
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     6144
  Block device           253:6

  --- Logical volume ---
  LV Path                /dev/VG_storage/lv_cache
  LV Name                lv_cache
  VG Name                VG_storage
  LV UUID                m3h1Gq-8Yd7-WrAd-KqkJ-ljlM-z1zB-7J0Pqi
  LV Write Access        read/write
  LV Creation host, time logs.example.com, 2019-11-23 09:40:40 +0000
  LV Pool metadata       lv_cache_cmeta
  LV Pool data           lv_cache_cdata
  LV Status              NOT available
  LV Size                953.77 GiB
  Current LE             244166
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

Verify with “lvdisplay” that the cache pool is created. We set two important parameters: the chunk size (“-c 1M”) and the cache mode (“--cachemode writethrough”). Write-through protects your data from a cache device failure. If the data is not that important (for example, on a proxy cache server), you may replace “writethrough” with “writeback” in the above command.
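
For reference, the write-back variant of the same command would look like the sketch below. Remember that with write-back a failing SSD can take recently written data with it, despite the RAID5 underneath:

# WARNING: with writeback, dirty data may exist only on the SSD until it is flushed
lvcreate --type cache-pool -l 100%FREE -c 1M --cachemode writeback -n lv_cache VG_storage /dev/nvme0n1p1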

And now convert the slow device (logical volume lv_slow) so that it uses the cache device (logical volume lv_cache):

[root@srv ~]# lvconvert --type cache --cachemode writethrough --cachepool VG_storage/lv_cache VG_storage/lv_slow
Do you want wipe existing metadata of cache pool VG_storage/lv_cache? [y/n]: y
  Logical volume VG_storage/lv_slow is now cached.
[root@srv ~]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/VG_storage/lv_slow
  LV Name                lv_slow
  VG Name                VG_storage
  LV UUID                5gdDBR-1h7N-WA6j-20Dn-IUQR-Ry61-dcQdoG
  LV Write Access        read/write
  LV Creation host, time logs.example.com, 2019-11-23 09:35:40 +0000
  LV Cache pool name     lv_cache
  LV Cache origin name   lv_slow_corig
  LV Status              available
  # open                 0
  LV Size                14.39 TiB
  Cache used blocks      0.01%
  Cache metadata blocks  16.06%
  Cache dirty blocks     0.00%
  Cache read hits/misses 0 / 48
  Cache wrt hits/misses  0 / 0
  Cache demotions        0
  Cache promotions       3
  Current LE             3773198
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:6

Note there is still only one logical volume named “lv_slow”, but you can see there is an additional logical device “inside” it: “lv_cache”. The properties we set earlier when creating lv_cache (chunk size and write-through mode) are preserved for the new cached lv_slow device. If you use write-back on creation, the command warns that write-back breaks the data redundancy of the RAID5. Be careful with such setups: if write-back is enabled and the cache device (the SSD) fails, you might lose all your data!
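
If you ever need to remove or replace the SSD, detach the cache first. A sketch of the two usual options (availability depends on your LVM2 version):

# keep the cache pool so it can be re-attached later
lvconvert --splitcache VG_storage/lv_slow
# or drop the cache pool entirely
lvconvert --uncache VG_storage/lv_slow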

STEP 5) Format and use the volume

Format the volume and do not forget to include it in /etc/fstab so that it is mounted automatically on boot.

[root@srv ~]# mkfs.ext4 /dev/VG_storage/lv_slow
mke2fs 1.44.3 (10-July-2018)
Discarding device blocks: done
Creating filesystem with 3863754752 4k blocks and 482971648 inodes
Filesystem UUID: cbf0e33c-8b89-4b7b-b7dd-1a9429db3987
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000, 550731776, 644972544, 1934917632,
        2560000000, 3855122432

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

[root@srv ~]# blkid | grep lv_slow
/dev/mapper/VG_storage-lv_slow_corig_rimage_0: UUID="cbf0e33c-8b89-4b7b-b7dd-1a9429db3987" TYPE="ext4"
/dev/mapper/VG_storage-lv_slow: UUID="cbf0e33c-8b89-4b7b-b7dd-1a9429db3987" TYPE="ext4"

And add it to the /etc/fstab:

UUID=cbf0e33c-8b89-4b7b-b7dd-1a9429db3987 /mnt/storage ext4 defaults,discard,noatime 1 3
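
The mount point must exist before the mount command in the next step will work. A minimal sketch, assuming the /mnt/storage path used above:

mkdir -p /mnt/storage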

Then just execute the mount command with “/mnt/storage” and you are ready to use your RAID5 with an SSD cache device:

[root@static ~]# mount /mnt/storage
[root@logs ~]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
devtmpfs                         32G     0   32G   0% /dev
tmpfs                            32G     0   32G   0% /dev/shm
tmpfs                            32G  804K   32G   1% /run
tmpfs                            32G     0   32G   0% /sys/fs/cgroup
/dev/md2                         49G  1.4G   46G   3% /
/dev/md1                        487M   98M  364M  22% /boot
/dev/mapper/VG_storage-lv_slow   15T   21M   15T   1% /mnt/storage
tmpfs                           6.3G     0  6.3G   0% /run/user/0

Additional LVM information with lvs

After about a day the initial sync is finished (RAID5 needs an initial resync); the column Cpy%Sync shows 100.00.

[root@logs ~]# lvs -a
  LV                       VG         Attr       LSize   Pool       Origin          Data%  Meta%  Move Log Cpy%Sync Convert
  [lv_cache]               VG_storage Cwi---C--- 953.77g                             2.40   16.07              0.00
  [lv_cache_cdata]         VG_storage Cwi-ao---- 953.77g
  [lv_cache_cmeta]         VG_storage ewi-ao----  48.00m
  lv_slow                  VG_storage Cwi-aoC---  14.39t [lv_cache] [lv_slow_corig]  2.40   16.07              0.00
  [lv_slow_corig]          VG_storage rwi-aoC---  14.39t                                                       100.00
  [lv_slow_corig_rimage_0] VG_storage iwi-aor--- 
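
To keep an eye on how well the cache performs over time, the cache counters can be queried directly. A sketch (these report fields require a reasonably recent LVM2):

lvs -o lv_name,cache_read_hits,cache_read_misses,cache_write_hits,cache_write_misses,cache_dirty_blocks VG_storage/lv_slow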

