Which filesystems offer snapshot functionality for users to recover data?
I’m working on a project that will teach Linux to youth. Knowing they will have a tendency to delete or corrupt items in their home directories, we are looking for a good snapshot option. We will not have access to the fancy tools available from major storage vendors and are hoping to find a solution at the filesystem level. I’ve read a lot about btrfs but have little experience with it. I have some experience with LVM but I’m unfamiliar with its snapshotting feature. Does either of these, or another filesystem, have the option to create snapshots either on demand or on a schedule, and then make those snapshots always available without root, for example in a .snapshot folder in each home folder? Ideally this solution allows a user to self-restore on demand within, say, a 24 to 48 hour window. We will have another backup process for the system and more global backups, but we do not want that process to be used by students who just make ‘mistakes’.
LVM creates a block device snapshot. With most modern filesystems, the filesystem on the volume is synced just before the snapshot is created. To recover data from the snapshot, you have to mount it. But be aware that every snapshot takes up space: it stores the differences between the snapshot state and the current state. That’s why keeping a snapshot around as a backup is not a good idea.
@Hub Thank you for the feedback. Perhaps ‘backup’ is the wrong word, as this will not be a primary form of backup or storage. We recognize that space will be required for diffs or whatever mechanism the filesystem/volume manager uses. The goal, though, is to provide a temporary snapshot so students who make mistakes can quickly revert and retrieve previous information. Despite our training, they will not follow the best practice of saving their own backups before making changes, so we want to help them overcome this.
I don’t have experience of this kind, but I think you need something like this: en.wikipedia.org/wiki/Versioning_file_system LVM isn’t suitable because you would have to mount the snapshot and copy every file (or keep track of changed files) to restore. LVM has no built-in ‘restore’.
4 Answers
Filesystems
On Linux, btrfs is the simplest option for snapshots within a filesystem. It is reasonably stable and complete as long as you don’t use the RAID features. It does have some fsck and repair tools.
ZFS is another option with good snapshot support, and is now cross-platform for Linux, FreeBSD, etc. It’s actively developed by the OpenZFS project, and is more complete than btrfs.
LVM
This LVM answer has some details on the pros and cons of using LVM snapshots, and some btrfs/ZFS links. With some filesystems (ext4 and XFS), LVM will take care of freezing the FS before it takes the snapshot, but LVM snapshots can have performance problems and still have some bugs.
I don’t think LVM is a great solution for this ‘quick snapshot of user data’ application — it’s still weaker than btrfs or ZFS in 2022.
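To make the extra steps concrete, here is a rough sketch of what recovering a file from an LVM snapshot involves: create the snapshot, mount it read-only, copy what you need, then unmount and remove it. The volume group vg0, the logical volume home, the 1G snapshot size and the mount point are hypothetical names chosen for illustration, not something from this thread. The commands are only printed unless DRY_RUN=0 is set, since they need root and a real volume group.

```shell
#!/bin/sh
# Sketch of the LVM snapshot workflow. All names (vg0, home, /mnt/home-snap)
# are illustrative assumptions.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# lvm_snapshot_open VG LV SIZE MOUNTPOINT
# Creates a snapshot named LV-snap and mounts it read-only.
lvm_snapshot_open() {
    vg=$1 lv=$2 size=$3 mnt=$4
    run lvcreate --snapshot --size "$size" --name "${lv}-snap" "/dev/$vg/$lv"
    run mkdir -p "$mnt"
    run mount -o ro "/dev/$vg/${lv}-snap" "$mnt"
}

# lvm_snapshot_close VG LV MOUNTPOINT
# Unmounts the snapshot and removes it so it stops consuming space.
lvm_snapshot_close() {
    vg=$1 lv=$2 mnt=$3
    run umount "$mnt"
    run lvremove --yes "/dev/$vg/${lv}-snap"
}
```

Calling lvm_snapshot_open vg0 home 1G /mnt/home-snap prints the three commands in order. Every step needs root, which is part of why this is awkward for student self-service. On XFS the mount would probably also need the nouuid option, because the snapshot carries the same filesystem UUID as the origin.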
rsnapshot
You may also want to look at rsnapshot, which is a user-space tool that creates snapshots using any filesystem, without using LVM.
Since rsnapshot uses rsync and stores the snapshots under a series of directories, using hard links between different snapshots if a file has not changed, it can run surprisingly quickly even on reasonably large sets of files.
It is used a lot for backups but can also be used for this sort of user-data snapshot requirement, and with a little setup can enable anyone to restore their snapshotted files, by using read-only NFS or Samba — see this HOWTO section on restoring files. Files can be restored with standard Linux tools as rsnapshot mirrors the source directory into each snapshot directory.
rsnapshot is quite flexible using its standard features, and since it’s written in Perl it’s quite easy to customise it, e.g. if you want to provide on-demand snapshots. The main drawbacks compared to filesystem snapshots are speed and disk space — each file that changes results in a new copy in the snapshot, whereas filesystem snapshots only copy new blocks in the file.
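The hard-link idea is easy to try by hand. The following is a minimal sketch of the technique, not rsnapshot’s actual implementation: the function name link_snapshot and its arguments are made up for illustration, it compares file contents with cmp (rsync normally compares size and modification time), and it ignores symlinks, ownership and file names containing newlines.

```shell
#!/bin/sh
# link_snapshot SRC PREV NEW
# Build snapshot directory NEW from SRC. Files identical to the copy in the
# previous snapshot PREV are hard-linked; changed or new files are copied.
link_snapshot() {
    local src=$1 prev=$2 new=$3
    mkdir -p "$new"
    # Recreate the directory tree first.
    (cd "$src" && find . -type d) | while IFS= read -r d; do
        mkdir -p "$new/$d"
    done
    # Then link or copy each regular file.
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        if [ -f "$prev/$f" ] && cmp -s "$src/$f" "$prev/$f"; then
            ln "$prev/$f" "$new/$f"      # unchanged: no extra data blocks used
        else
            cp -p "$src/$f" "$new/$f"    # changed or new: full copy
        fi
    done
}
```

For example, link_snapshot /home/alice /snapshots/hourly.1 /snapshots/hourly.0 (hypothetical paths) gives a complete, browsable tree in hourly.0 while only changed files take up new space. It also shows the drawback mentioned above: a one-byte change to a large file costs a whole new copy.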
Selection of linux filesystem for snapshots — For file backups on a VM
I’ve used ZFS in the past, in the form of NexentaStor, and that was pretty slick. Besides the RAID management, the snapshots taken were automatically available to me under "/primary_volume/.zfs/snapshot_name", and it was easy to go and grab a file from X days ago.
Am I looking at a BTRFS implementation, or perhaps an LVM implementation here? Or are there other packaged, ready-to-fly solutions that will fill this void for me?
1 Answer
It sounds like you’ve got all the basic options, but there is another option I think you should consider — more on that in a bit. You’ve got the two common enough filesystems that support snapshots (btrfs and ZFS) and device-mapper/LVM snapshots.
- btrfs snapshots work similarly to the ZFS ones you’re already familiar with; you run btrfs subvolume snapshot -r /mountpoint/data "/mountpoint/snapshots/$(date -Is)" or similar to make one, and it is then visible under /mountpoint/snapshots/ with that timestamp as its name. You can also snapshot the root of the filesystem ( /mountpoint ), which works properly. My experience with btrfs is that it is stable with this usage. It also supports trim, which (if everything else supports it; I’ve personally never used Hyper-V so can’t say) will allow used but now freed space to be returned to your hypervisor’s thin pool.
- LVM (device-mapper) snapshots are different: they snapshot the block device. Traditional LVM snapshots cause performance loss (due to copy on write), which may or may not be a problem for backup use. There are also thin pool snapshots, which are newer and avoid that problem. Since they operate at the block device level, making a snapshot creates a new block device, which you’ll then have to mount to access the snapshotted files.
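With both methods you can keep snapshots as long as desired (disk space permitting), remove them in any order, etc. I’d also suggest considering rsync --inplace to reduce the snapshot size: by default rsync writes a changed file out as a whole new copy, whereas --inplace overwrites only the parts that changed, so fewer blocks differ from the snapshot. As for choosing between them, I think they’ll all work fine and you should probably pick whatever you/your team is familiar with.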
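Here is a rough sketch of how the btrfs variant could be scripted for a cron job or systemd timer: one function takes a read-only snapshot, another removes snapshots older than a cutoff. The function names, the paths in the usage note and the 48 hour retention are my own assumptions, and I use a timestamp format without colons instead of date -Is so that names sort and compare cleanly. It relies on GNU date for the "hours ago" syntax. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: timestamped read-only btrfs snapshots with age-based pruning.
# Names and paths are illustrative assumptions.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# take_snapshot SUBVOLUME SNAPSHOT_DIR
take_snapshot() {
    local src=$1 dest=$2 name
    name=$(date +%Y-%m-%dT%H%M%S)
    run btrfs subvolume snapshot -r "$src" "$dest/$name"
}

# prune_snapshots SNAPSHOT_DIR KEEP_HOURS
# Deletes snapshots whose timestamp name sorts before the cutoff.
prune_snapshots() {
    local dest=$1 keep_hours=$2 cutoff snap name
    cutoff=$(date -d "$keep_hours hours ago" +%Y-%m-%dT%H%M%S)   # GNU date
    for snap in "$dest"/*/; do
        [ -d "$snap" ] || continue
        name=$(basename "$snap")
        # Skip anything that is not named like one of our snapshots.
        case $name in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9]) ;;
            *) continue ;;
        esac
        # Fixed-width timestamps compare correctly as plain strings.
        if [ "$name" \< "$cutoff" ]; then
            run btrfs subvolume delete "$dest/$name"
        fi
    done
}
```

A scheduled job could then call take_snapshot /mountpoint/data /mountpoint/snapshots followed by prune_snapshots /mountpoint/snapshots 48. Because the snapshots are read-only subvolumes in an ordinary directory, users can copy files back out with cp and no root access, as long as the directory permissions allow it.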
The other approach: You’re currently writing your own backup system. A lot of backup systems already exist, including ones intended to do space-efficient backups to a hard disk like this. Examples include BackupPC, Bacula/Bareos (more focused on tape, but does disk too), BorgBackup, restic, ZBackup, a bunch more. I’d recommend taking a look at the Arch Wiki’s list of synchronization and backup programs.