RAID advice with an SSD and two HDDs
I have a new machine with one 128GB SSD and two 1TB HDDs. The OS is on the SSD, and my initial thought was to put the two HDDs in RAID 1 for user data. After some more thought I came up with two other setups, and now I'm in doubt 🙂 Can someone advise what would be the best setup?

1. Single SSD, and the two HDDs in RAID 1 (original thought).
2. Create 2 partitions on each HDD (128GB and 872GB). Put the two 872GB partitions in RAID 1, and create another RAID 1 with the SSD and one of the 128GB HDD partitions.
3. Create 2 partitions on each HDD (750GB and 250GB). Put the two 750GB partitions in RAID 1, and use the two 250GB partitions as backup space, making automatic snapshots of the SSD to (one of) these partitions.

I think the 2 main questions are:

- Is it advisable to create a RAID array with only part of a drive and actively use the other part of that drive, or should you always use the full disk?
- Is it advisable to create a RAID 1 array with an SSD and an HDD, or will that blow the whole speed advantage of the SSD?
5 Answers
I would like to add a different recommendation to the pool of possible solutions. I recommend basing your setup on btrfs' subvolume and snapshot abilities, in combination with a btrbk cron job.
Setting this up is not necessarily trivial, but how hard it is largely depends on your skill set and previous experience. There is a lot of literature on the net that will help you. In the end you are rewarded with a very flexible and fast way to back up your SSD regularly, with plenty of options to tailor the solution to your needs.
Note of caution: no form of RAID can or should replace regular backups. (Luckily btrbk can easily be extended to external drives or SSH-reachable hosts; see its manual.)
General setup
The general idea of my proposal is to use the SSD as your btrfs system drive, containing your root and related subvolumes, and the two HDDs in a btrfs raid1 as your data and backup drive.
Together with btrbk, this allows you to perform automated incremental backups of your system SSD to your backup HDDs. And since the HDDs are set up as a mirror, all your backups will be mirrored as well.
Furthermore, btrfs' ability to send and receive subvolumes (which is what btrbk uses to make backups) allows you to freely move your data and backups between your system and data drives. This lets you change which data is stored on the fast SSD, while always maintaining versioned and mirrored backups of all your data.
Setup btrfs
To get started, you will either need to reinstall Ubuntu onto the SSD and select btrfs as your root file system, or convert your existing installation's file system to btrfs. Both ways are described on Ubuntu's community help page about btrfs, which is a good read in general if you are just getting started with btrfs on Ubuntu.
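If you go the conversion route, the in-place tool is btrfs-convert from btrfs-progs. A minimal sketch, assuming the root partition is /dev/sda2 (an example name), run from live media while the partition is unmounted, with backups at hand:

sudo btrfs-convert /dev/sda2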
Next you need to turn the HDDs into a btrfs raid1 with the following command, where /dev/sdx and /dev/sdy are the two drives (all data on these drives will be lost!):
mkfs.btrfs -d raid1 /dev/sdx /dev/sdy
If you are new to snapshots or btrfs, I would recommend taking this moment to familiarize yourself with the differences between folders, subvolumes and snapshots, and to try out some of the commands before any actual data is written to your raid1.
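For example, you could temporarily mount the fresh raid1 and play with the basic commands (the /mnt paths here are only examples):

sudo mount /dev/sdx /mnt
sudo btrfs subvolume create /mnt/test                    # create a subvolume
sudo btrfs subvolume snapshot /mnt/test /mnt/test-snap   # writable snapshot
sudo btrfs subvolume snapshot -r /mnt/test /mnt/test-ro  # read-only snapshot
sudo btrfs subvolume list /mnt                           # list subvolumes and snapshots
sudo btrfs subvolume delete /mnt/test-snap               # delete a snapshot again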
There are many ways you can organize your data and you can find some examples in the btrfs kernel wiki’s sysadmin guide.
One way to do it is to mount your btrfs root (subvolid 0 or 5) somewhere and use that mount point to manage your subvolumes and snapshots, and to store all of your data in appropriate subvolumes, which you then mount to convenient locations in your file system. That way you can snapshot, move, recover and replace any data at will.
For your concrete example, that could mean the following (all commands should be run as root/with sudo):
- Mount your system btrfs root ( subvolid=0 ) to /btrfs/system
- Mount your data btrfs raid1 root ( subvolid=0 ) to /btrfs/data
Instead of mounting these volumes by hand, add them to your fstab (/etc/fstab) before mounting, so they are also mounted at boot. I would recommend mounting them by their UUID, which you can retrieve by running sudo btrfs filesystem show.
UUID=<system-uuid>  /btrfs/system  btrfs  defaults,subvolid=0  0  0
UUID=<data-uuid>    /btrfs/data    btrfs  defaults,subvolid=0  0  0
sudo mkdir /btrfs
sudo mkdir /btrfs/data
sudo mount /btrfs/data
sudo mkdir /btrfs/system
sudo mount /btrfs/system
Now you can add any additional subvolumes you might want to each of the btrfs file systems. Ubuntu normally creates a subvolume for your root / (subvol=@) and your home directory /home (subvol=@home) by default. It is common to turn /var or /tmp into their own subvolumes, or to create application-specific subvolumes, e.g. for /var/www/.
Personally I prefer to keep all my subvolumes at the btrfs root and then mount them to their specific locations using mount and fstab entries.
For example, to create a subvolume for your music collection on the HDD raid1, I would do the following:
btrfs subvolume create /btrfs/data/@music
I would then mount it to /music with the following fstab entry:
UUID=<data-uuid>  /music  btrfs  defaults,subvol=@music  0  0
Setup of btrbk
Next, you will need to set up btrbk for the subvolumes you want to snapshot and back up onto the HDD raid.
As a simple example of how to back up @ and @home regularly, while keeping a regularly spaced history of your backups, you could write the following to /etc/btrbk.conf:
# The long timestamp format is recommended for more than one snapshot a day
timestamp_format       long

# Set time spacing of snapshots kept on the SSD
snapshot_preserve_min  2d
snapshot_preserve      7d 4w 3m

# Set time spacing of snapshots kept on the HDD raid
target_preserve_min    no
target_preserve        8w *m

# Snapshot directory, relative to the volume directory (create it first)
snapshot_dir           snapshots

volume /btrfs/system
  subvolume @
    target send-receive /btrfs/data/backup/
  subvolume @home
    target send-receive /btrfs/data/backup/
Please read the btrbk documentation for any of the details. It will also explain how to recover your data from a snapshot.
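As a rough sketch of what a restore of @home could look like (the snapshot name below is invented for illustration; list the real ones with sudo btrfs subvolume list /btrfs/data):

sudo btrfs send /btrfs/data/backup/@home.20170101T0000 | sudo btrfs receive /btrfs/system
sudo mv /btrfs/system/@home /btrfs/system/@home.old
sudo btrfs subvolume snapshot /btrfs/system/@home.20170101T0000 /btrfs/system/@home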
Lastly, you will have to add btrbk to root's crontab with sudo crontab -e. E.g., to run your btrbk snapshots and backups every day at noon, add the following line:
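0 12 * * * btrbk run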
Other considerations
Swap
While in general there is less and less need for swap space in modern personal computer systems that have at least 8 GB of RAM, there are still use cases where it can help you out, especially when located on an SSD, where the performance hit of swapping is not as noticeable. It is therefore still generally recommended to set up a swap file or partition.
That being said, btrfs does not support swap files (swap file support only arrived later, in Linux 5.0, and with restrictions). That means you will have to allocate some of your SSD space to a separate swap partition if you want to be able to use swap on your system at all.
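A minimal sketch, assuming you left a spare partition on the SSD for this purpose (/dev/sda3 is just an example name):

sudo mkswap /dev/sda3    # prints the swap partition's UUID
sudo swapon /dev/sda3

To make it permanent, add a matching fstab entry:

UUID=<swap-uuid>  none  swap  sw  0  0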
btrfs SSD detection
Btrfs automatically detects whether a mounted file system is located on an SSD, and in that case enables SSD-specific allocation optimizations (the ssd mount option).
These optimizations are no longer necessary, as modern SSDs wear-level themselves, and at the same time they can cause fragmentation of free space. I would therefore personally advise you to mount your SSD with the nossd option.
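Applied to the system fstab entry from above, that would look like:

UUID=<system-uuid>  /btrfs/system  btrfs  defaults,nossd,subvolid=0  0  0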
More details can be found in the btrfs kernel wiki.
File system compression
Btrfs supports transparent file compression. By adding the compress option to your mount flags, it will be enabled for all newly written files.
For example, to enable compression for the @music subvolume I used as an example earlier, I would change my fstab entry to:
UUID=<data-uuid>  /music  btrfs  defaults,compress,subvol=@music  0  0
To apply this change, do not forget to remount (unmount and mount again) the affected subvolume.
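For the @music example:

sudo umount /music
sudo mount /music

Note that files written before the change remain uncompressed until they are rewritten (or defragmented with btrfs filesystem defragment -c).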
With most recent computers, the overhead of compressing files as they are written and decompressing them as they are read is often negligible. Writing a large but well-compressible file to a slow disk might even be faster with compression enabled.
If you are worried about speed, you can also use the faster but less efficient LZO compression with compress=lzo .
More details once again can be found in the btrfs kernel wiki.
Location of home folder
Ubuntu puts your home folder into the @home subvolume on the system drive by default. Whether you move home to your HDD mirror is up to your personal preference. You can also keep it on the SSD and include it in your btrbk backup, as I showed above.
Time spacing of backups
While my example sets the same retention times for all subvolumes, btrbk allows you to set these times for each subvolume individually.
You could also run btrbk with different configs (see the -c option) at different intervals to gain even more control over when each subvolume is snapshotted and/or backed up.
Since snapshots are created quickly, thanks to btrfs' copy-on-write nature, you could even have btrbk create a snapshot every hour but only transfer them to the backup disk once a day, for folders that change often, like your home folder.
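A sketch of such a split schedule in root's crontab, using btrbk's snapshot and resume subcommands:

# take snapshots every hour, transfer pending backups once a day at noon
0 * * * * btrbk snapshot
0 12 * * * btrbk resume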
Detecting bit rot
To make use of btrfs' bit rot/data corruption detection (and automatic repair, in the case of your raid1), you should make sure to run a btrfs scrub at regular intervals, e.g. as a cron job, with:
btrfs scrub start /btrfs/system
btrfs scrub start /btrfs/data
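For example, root crontab entries that scrub both file systems on the 1st of every month at 03:00:

0 3 1 * * btrfs scrub start /btrfs/system
0 3 1 * * btrfs scrub start /btrfs/data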
Linux RAID Performance On NVMe M.2 SSDs With EXT4, Btrfs, F2FS
For boosting the I/O performance of the AMD EPYC 7601 Tyan server, I decided to play around with a Linux RAID setup this weekend using two NVMe M.2 SSDs. This is our first time running Linux RAID benchmarks of NVMe M.2 SSDs; this comparison tests EXT4 and F2FS with MDADM soft RAID, as well as Btrfs using its built-in native RAID capabilities, for some interesting weekend benchmarks.
The Tyan Transport SX TN70A-B8026 2U server arrived with a Samsung 850 PRO SATA 3.0 SSD, but to push things further I picked up two additional Corsair MP500 NVMe SSDs. I have put a half dozen or so Corsair MP500 NVMe drives in different benchmarking systems in the past few months, and when it comes to budget-friendly but performant NVMe drives, the MP500 has been among the best at this point. The 120GB version can be found for about $100 USD, which is sufficient capacity for these benchmark systems and delivers a nice boost in I/O performance over SATA SSDs.
The TN70A-B8026 offers 24 x 2.5-inch NVMe hot-swap bays for really maximizing I/O ability, made possible by the 128 PCI-E 3.0 lanes of EPYC. Unfortunately I'm still working on acquiring some enterprise-grade 2.5-inch NVMe drives (or a number of M.2 to 2.5-inch adapters), but for the time being the Corsair MP500s in RAID should offer a nice I/O performance boost. The CSSD-F120GBMP500 is rated for maximum sequential reads up to 3,000 MB/s, maximum sequential writes up to 2,400 MB/s, random reads up to 150K IOPS, random writes up to 90K IOPS, and 175TBW endurance.
With the two Corsair Force MP500 drives installed in this AMD EPYC server running Ubuntu x86_64 with the Linux 4.13 kernel, I ran a variety of different benchmarks for reference purposes:
— Samsung 850 256GB EXT4
— Force MP500 120GB EXT4
— Force MP500 120GB F2FS
— Force MP500 120GB Btrfs
— Force MP500 120GB Btrfs + LZO
— 2 x 120GB Force MP500 EXT4 RAID0
— 2 x 120GB Force MP500 EXT4 RAID1
— 2 x 120GB Force MP500 F2FS RAID0
— 2 x 120GB Force MP500 F2FS RAID1
— 2 x 120GB Force MP500 Btrfs RAID1
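For reference, arrays like those above are typically created along these lines; the commands below are a sketch with assumed device names, not the exact ones used for this article:

# MDADM soft RAID1 (use --level=0 for RAID0), with EXT4 or F2FS created on top
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
sudo mkfs.ext4 /dev/md0
# Btrfs native RAID1 across the same two drives
sudo mkfs.btrfs -m raid1 -d raid1 /dev/nvme0n1 /dev/nvme1n1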
A few notes about the selection: I didn't run any OpenZFS benchmarks for this testing, as I will save those for their own article, ZFS being an out-of-tree Linux file system; here I am sticking to the interesting file systems in Linux 4.13. You will notice the RAID0 tests are missing for Btrfs: Btrfs RAID0 on Linux 4.13 ended up being very unstable and would result in the system crashing. This came as a surprise, as Btrfs RAID0/RAID1 has usually been very solid (normally it's Btrfs RAID5/RAID6 that can be risky), at least with SATA SSDs, but that was not the case today. I also did a Btrfs NVMe run with native LZO compression enabled, for reference.
Each of these file systems was tested out of the box with its stock mount options on Linux 4.13, except where otherwise noted (e.g. LZO). All of these Linux I/O benchmarks were carried out in a fully automated and reproducible manner using the Phoronix Test Suite.
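For those looking to reproduce the runs, a disk benchmark via the Phoronix Test Suite looks roughly like this (the test profile name is just an example):

phoronix-test-suite benchmark pts/fio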