best filesystem for millions of files [closed]
Which Linux filesystem/setup would you choose for the best speed in the following scenario: a few million files; ~3 MB file size on average; random access to files; a frequent need to list all the files; constant writing of new files; constant reading of old files?
Millions of files in one directory with random access sounds to me like a better fit for a database. A simple SQL database with a primary key and a blob, or a key-value store, would do basically the same job as your filesystem, but databases are optimized for that workload.
What really counts is how you organize your files.
If you plan to have a single big directory with ~10M files, any filesystem will suffer, although XFS and ZFS will manage even this worst case quite well.
The recommended approach is to organize your files in multiple, smaller directories with reasonable file counts (~32K each) to avoid different but related issues (e.g. ls was once very slow on big directories).
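The multi-directory layout can be sketched as follows. This is only an illustration, not something from the question: the function names, the choice of SHA-256, and the two-level, two-hex-character fan-out are all assumptions. With that fan-out there are 256 * 256 = 65,536 leaf directories, so 10M files average roughly 150 per directory, far below the ~32K mark.

```python
import hashlib
from pathlib import Path


def sharded_path(root: Path, name: str, levels: int = 2, width: int = 2) -> Path:
    """Map a file name to root/ab/cd/name, where ab and cd come from a hash of the name.

    Hypothetical helper: hashing spreads files evenly regardless of naming patterns.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return root.joinpath(*parts, name)


def store(root: Path, name: str, data: bytes) -> Path:
    """Write data under its sharded location, creating directories as needed."""
    path = sharded_path(root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

Because the location is derived from the name alone, random access needs no lookup table: calling sharded_path again with the same name returns the same path.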
If splitting the files across directories is not possible, I would go with XFS or ZFS, but only after simulating the intended load on a test setup (note: even EXT4 will be fine performance-wise, but you can run into its inode limit).
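To see how close a filesystem is to its inode limit, something along these lines may help. It is a minimal sketch that assumes a Unix-like system (os.statvfs is not available on Windows); the helper name is made up for illustration. Filesystems that allocate inodes dynamically may report a total of zero, so the example treats that case separately.

```python
import os


def inode_usage(path: str = "/") -> tuple[int, int]:
    """Return (total_inodes, free_inodes) for the filesystem that holds path."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


total, free = inode_usage("/")
if total == 0:
    # Assumption: a zero total means the filesystem has no fixed inode table.
    print("this filesystem does not report a fixed inode count")
else:
    used = total - free
    print(f"{used} of {total} inodes in use ({100 * used / total:.1f}%)")
```

On EXT4 the inode count is chosen when the filesystem is created, so it is worth running this check on the test setup before loading millions of files.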
Your workload is almost the worst possible for a general-purpose file system: millions of files, frequent enumeration, lots of reads and writes, and enormous metadata I/O. With a large number of files, it is rarely the bandwidth of transferring the files themselves that is the problem, but rather the number of IOPS needed to query directory entries and inodes repeatedly.
Test this workload synthetically, on storage with realistic production scale and IOPS levels, while monitoring the application to be sure it performs acceptably. Be sure to match the folder structure: 300 files per directory is very different from 3,000,000 files per directory. Try a couple of different file systems; on Linux, for example, XFS and EXT4.
Possibly you will need very fast SSD storage and lots of RAM to make this perform adequately.
Maybe you have a support contract with your OS vendor where you can have a performance specialist look at it.
If getting acceptable performance demands it, consider application changes. Consider storing and querying the file lists from a database rather than the file system. Many databases might be able to return a few million results faster than a file system constrained by POSIX in general and Linux VFS in particular.
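One way such an application change could look is sketched below, assuming SQLite is an acceptable database and that the application can record every write and delete itself. The table layout and function names are invented for illustration. The files stay on disk and only the list lives in the database, so listing everything becomes one indexed query instead of a directory walk.

```python
import sqlite3
from typing import Iterator


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the index; the schema is a hypothetical minimum."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY,"
        " size INTEGER NOT NULL,"
        " mtime REAL NOT NULL)"
    )
    return conn


def record_file(conn: sqlite3.Connection, path: str, size: int, mtime: float) -> None:
    """Call after every successful write so the index matches the disk."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO files (path, size, mtime) VALUES (?, ?, ?)",
            (path, size, mtime),
        )


def forget_file(conn: sqlite3.Connection, path: str) -> None:
    """Call after every delete."""
    with conn:
        conn.execute("DELETE FROM files WHERE path = ?", (path,))


def list_files(conn: sqlite3.Connection) -> Iterator[str]:
    """Stream every known path in sorted order without touching the file tree."""
    for (path,) in conn.execute("SELECT path FROM files ORDER BY path"):
        yield path
```

The trade-off is that the index can drift from the disk if anything writes files behind the application's back, so an occasional reconciliation scan is probably still needed.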
What Are the Best Linux Filesystems in 2021?
When formatting a hard disk to install your Linux system, you have to decide on the best Linux filesystem to use. In 2021, the most popular option is EXT4. Is it the best one, though, and if you have alternatives, should you choose them? Let’s see the (possible) options.
EXT4
The “Fourth Extended Filesystem” is fully backward-compatible with EXT2 and EXT3 and is considered the standard for most Linux distributions, remaining as popular as its predecessors.
It is one of the safest and most stable available options today since it supports journaling, preventing (as much as possible) the loss of data after a system crash or loss of power.
Two of its most important features are “extents” and “delayed allocation,” which smartly manage how the data is written on the storage medium to improve performance and reduce fragmentation.
BtrFS
The “b-tree file system” was initially designed by Oracle and has kept increasing in popularity, to the point many consider it the true successor to the EXT dynasty.
Btrfs comes with advanced features, such as automatic defragmentation and transparent compression. It follows a copy-on-write approach, saving new iterations of data and metadata instead of affecting the existing ones (“shadowing”). This also allows for snapshots of different states of the filesystem as well as easy replication, migration, and incremental backups. Online and offline filesystem checks further reduce the possibility of data loss.
BtrFS natively supports RAID, but it doesn’t follow the approach of typical software RAID striping or hardware block mirroring. Instead, it ensures that each block on one device has a copy on another and keeps CRCs for all data. Thus, in case of a failure, it can use the checksums to detect corrupted or missing data and the surviving copy to reconstruct it.
It’s worth noting that BtrFS is also “SSD-friendly” since it automatically disables its features that are useful for mechanical HDDs but could wear out SSDs.
XFS
XFS was created by Silicon Graphics almost three decades ago for their graphics workstations specializing in rendering 3D graphics.
That’s why XFS remains one of the best options for systems that are constantly reading and writing data. Thanks to the use of “allocation groups” – parts of the filesystem that contain their own inodes and free space – multiple threads can read and write data in parallel. Support for delayed allocation, dynamically allocated inodes, and advanced read-ahead algorithms helps it achieve excellent performance, especially on large-scale storage pools up to hundreds of TB in size.
Its journaling covers only metadata, not file contents, though, and it is arguably more prone to data loss than more modern alternatives. It also doesn’t scale down well for more typical day-to-day and mostly single-threaded scenarios, like when deleting a bunch of photos from your “Pictures” folder. In other words, it’s great if you’re setting up your own datacenter but maybe not for typical personal use.
F2FS
One of the (relatively) newer filesystems, “Flash-Friendly File System” is one of the best options for use with flash-based storage.
Initially created for that purpose by Samsung, F2FS splits the storage medium into zones, which are divided into sections, which are in turn made up of segments, and it tries to spread writes across many of them instead of reusing the same ones. Combined with its support for TRIM/FITRIM, this makes it friendlier to flash-based media that comes with a finite number of writes.
There’s no point in deep-diving into F2FS’s features since it doesn’t excel in anything compared to all alternatives as far as speed or data security goes, nor in using it with typical media, where every other filesystem would come with a better feature-set. The story changes if talking about flash-based storage, though, for which it was explicitly created.
OpenZFS/ZFS
OpenZFS is a fork of the Zettabyte File System (ZFS) that initially appeared on Sun’s Solaris. Up to 2010, ZFS could be used on Linux primarily through FUSE, due to licensing issues. It was after 2010 that its development started opening up, and in 2016 Ubuntu supported, by default, its open-source version. Since then, when people refer to “ZFS,” they’re usually talking about its open variant instead of Solaris ZFS – that also keeps evolving but on a parallel path.
ZFS differs from all alternatives in that it combines the filesystem with a volume manager. Because of that, it doesn’t just manage files and directories but also the physical media on which they reside. Thanks to this, every storage device can be assigned to a pool that is treated as a single resource. If you’re ever out of space, you can add new storage to this pool to expand it, letting ZFS take care of the details.
By managing the media itself, ZFS also excels in its support for RAID. You can set up striped and mirrored pools (the equivalents of RAID 0 and RAID 1) but also use its own approach, “RAIDZ,” which plays the role of RAID 5. Unlike typical RAID arrays, RAIDZ uses variable-width stripes across the drives it includes, increasing its tolerance to data loss after a power failure.
ZFS also follows a copy-on-write approach, where instead of overwriting existing data in place, it writes changed blocks to a new location and keeps the old ones for as long as something still references them. This allows for transparent, smart storage of multiple versions of data, without taking up a lot of space, that can work as backups or snapshots. The user can return to previous states of the filesystem, reverting changes, or do the opposite: pull all changes into clones of existing data.
Those are some of the features that help it nearly eliminate any possibility of data loss – at least, in theory.
JFS
The Journaling File System by IBM was one of the first filesystems that supported journaling, leading to reduced chances of data loss. It uses extents like many other modern alternatives and allocation groups like XFS, aiming to offer high read/write performance.
By not prioritizing a single feature, it’s a great all-arounder under different workloads for different needs. Unfortunately, this also means that it doesn’t excel in anything. Plus, it has some problems that many people would consider a negative when choosing a filesystem for their storage. For example, it can delay updating its journal indefinitely, increasing the chances of data loss and almost nullifying the fact that it’s a journaling file system. It’s better at parallel writes that are of most use to servers and large databases but performs worse than EXT4 in more popular desktop usage scenarios.
Those are probably the reasons why it’s not as popular as other filesystems, which can either perform faster or be better shielded against data loss.
Which Should You Use?
There’s a reason EXT4 is the default choice for most Linux distributions. It’s tried, tested, stable, performs great, and is widely supported. If you are looking for stability, EXT4 is the best Linux filesystem for you.
If you aren’t afraid of having to deal with a somewhat less mature ecosystem, though, BtrFS may be the better option for you.
For server use where you want to eliminate almost entirely any possibility of data loss and stability is the name of the game, you may want to look into ZFS. To really take advantage of it, though, prepare for a lot of reading. Thankfully, we can help with its initial setup.
For flash media, F2FS is the best option by default.
Whichever file system you choose, remember to fully erase your HDD beforehand if you want to render its existing content almost unrecoverable.
What is the most high-performance Linux filesystem for storing a lot of small files (HDD, not SSD)?
I have a directory tree that contains many small files and a small number of larger files. The average size of a file is about 1 kilobyte. There are 210158 files and directories in the tree (this number was obtained by running find | wc -l). A small percentage of the files gets added, deleted, or rewritten several times per week. This applies to the small files as well as to the (small number of) larger files. The filesystems that I tried (ext4, btrfs) have some problems with the positioning of files on disk. Over a longer span of time, the physical positions of files on the disk (rotating media, not a solid-state disk) become more randomly distributed. The negative consequence of this random distribution is that the filesystem gets slower (for example, 4 times slower than a fresh filesystem). Is there a Linux filesystem (or a method of filesystem maintenance) that does not suffer from this performance degradation and is able to maintain a stable performance profile on rotating media? The filesystem may run on FUSE, but it needs to be reliable.
If you know which files are going to be big and rarely changing, and which are going to be small and frequently changing, you might want to create two filesystems with different options, each better suited to its scenario. If you need them to be accessible as if they were part of the same structure, you can do some tricks with mount and symlinks.
I am quite surprised to hear that btrfs (with its copy-on-write feature) has become sluggish for you over time. I would be curious to see your results; sharing them might help us find new directions for performance tuning with it.
There is a new animal around, ZFS on Linux, available in native and FUSE implementations, in case you want to have a look.
I tried ZFS on Linux once, and it was quite unstable. I managed to completely lock up the filesystem quite often. The box would keep working, but any access to the FS would hang.