Linux file system compression

Carles Mateo


Creating a compressed filesystem with Linux and ZFS (using just files)

It is often very convenient to have a compressed filesystem, that is, a filesystem that compresses data in real time.

This not only reduces the space used, it can also increase the effective IO performance. Say you write a 1GB log file to disk and it takes 5 seconds: that is 200MB/s. If writing the same 1GB file takes 0.5 seconds instead, you get 2000MB/s, or 2GB/s. The trick is that only 100MB were really written, because the data was compressed (10:1 in this example) before being written to the disk.

This also works for reading. 100MB are read from disk and then uncompressed in memory (in chunks, not everything is loaded at once). Assuming the same speed for reading and writing (usual for sequential access on SAS drives), we read from disk for 0.5 seconds instead of 5. Let's imagine we spend 0.2 seconds of CPU time decompressing. That's it: 0.7 seconds versus 5 seconds.
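The arithmetic above can be sketched as a small script. The numbers are the illustrative ones from this post (1GB file, 200MB/s disk, 10:1 compression, 0.2 seconds of CPU); they are assumptions, not measurements, and the variable names are mine.

```shell
#!/bin/sh
# Effective throughput when data is compressed before it reaches the disk.
# All figures are the post's illustrative numbers, not measurements.
LC_ALL=C; export LC_ALL   # force "." as the decimal separator in awk

logical_mb=1000      # size of the file as the application sees it
disk_mb_per_s=200    # raw sequential speed of the disk
ratio=10             # assumed compression ratio (1000MB stored as 100MB)
cpu_s=0.2            # assumed CPU time spent decompressing on reads

# Without compression every logical byte goes to the disk.
plain_s=$(awk -v l="$logical_mb" -v d="$disk_mb_per_s" \
    'BEGIN { printf "%.1f", l / d }')

# With compression only logical/ratio bytes go to the disk.
disk_s=$(awk -v l="$logical_mb" -v d="$disk_mb_per_s" -v r="$ratio" \
    'BEGIN { printf "%.1f", l / r / d }')

# The read path adds the decompression CPU time on top of the disk time.
read_s=$(awk -v t="$disk_s" -v c="$cpu_s" 'BEGIN { printf "%.1f", t + c }')

# Throughput as seen by the application writing the file.
effective=$(awk -v l="$logical_mb" -v t="$disk_s" \
    'BEGIN { printf "%.0f", l / t }')

echo "uncompressed:     ${plain_s}s"
echo "compressed write: ${disk_s}s (${effective} MB/s effective)"
echo "compressed read:  ${read_s}s including decompression"
```

Changing `ratio` shows how quickly the benefit shrinks for data that compresses poorly.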

So, assuming you have ZFS installed on your desktop computer, the following instructions will let you create a compressed ZFS filesystem and mount it.

ZFS can create pools from disks, partitions or other block devices such as loop devices, and also from regular files.

# Create the File that will hold the Filesystem, 1GB

root@xeon:/home/carles# dd if=/dev/zero of=/home/carles/compressedfile.000 bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.621923 s, 1.7 GB/s

zpool create compressedpool /home/carles/compressedfile.000

# If the pool was not mounted automatically, set the mountpoint

zfs set mountpoint=/compressedpool compressedpool

# Set the compression. LZ4 is fast and well balanced

zfs set compression=lz4 compressedpool

# Copy in a very compressible file of around 1GB. Don’t use a file of just 0s, as ZFS optimizes that case 🙂

# I copied real logs myself

ls -al --block-size=M *.log
-rw------- 1 carles carles 1329M Sep 26 14:34 messages.log
root@xeon:/home/carles# cp messages.log /compressedpool/

Even though the pool is only 1GB, we managed to copy a 1329MB (about 1.3GB) file into it.

When we check, only 142MB are really being used, thanks to the compression.

root@xeon:/home/carles# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
compressedpool   142M   738M   141M  /compressedpool
root@xeon:/home/carles# df /compressedpool
Filesystem     1K-blocks   Used Available Use% Mounted on
compressedpool    899584 144000    755584  17% /compressedpool
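As a quick sanity check, the ratio implied by the listing can be worked out from the two sizes shown above. This is only a sketch: the sizes are copied by hand from the output, and the `zfs get` call at the end runs only if a pool named compressedpool is actually imported on the machine.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # force "." as the decimal separator in awk

logical_mib=1329   # size of messages.log as reported by ls
used_mib=142       # USED column reported by zfs list

ratio=$(awk -v l="$logical_mib" -v u="$used_mib" \
    'BEGIN { printf "%.1f", l / u }')
echo "compression ratio from the listing: ${ratio}x"

# ZFS also reports its own figure through the compressratio property.
# This part is skipped when zfs or the pool is not available.
if command -v zfs >/dev/null 2>&1 && zfs list compressedpool >/dev/null 2>&1; then
    zfs get -H -o value compressratio compressedpool
fi
```

The figure ZFS reports may differ a little from the hand calculation, since USED also accounts for metadata.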

By default zpool import only looks for pools on block devices, so in order to import your file-based pool after a reboot, or after running zpool export compressedpool, you must specify the directory that holds the files:

zpool import -d /home/carles compressedpool

You can also create a pool using several files located on different hard drives. That way you can build a mirror, RAIDZ1, RAIDZ2 or RAIDZ3 and, just as with a pool based on drives, not lose any data if you lose a physical drive.

If you use one file on each of several hard drives, you are aggregating their bandwidth.
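A minimal sketch of a mirrored, file-backed pool follows. The mount points /mnt/disk1 and /mnt/disk2, the file names and the pool name mirrorpool are all hypothetical; adapt them to wherever your drives are mounted. The script only prints the commands unless it is run with DRY_RUN=0, since creating pools needs root.

```shell
#!/bin/sh
# Sketch: a mirrored pool backed by one file on each of two drives.
# Paths and pool name are hypothetical examples.
DRY_RUN=${DRY_RUN:-1}   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

file_a=/mnt/disk1/mirror.000
file_b=/mnt/disk2/mirror.000

# Create two backing files of the same size, one per physical drive.
run dd if=/dev/zero of="$file_a" bs=1M count=1024
run dd if=/dev/zero of="$file_b" bs=1M count=1024

# "mirror" keeps a full copy of the data on each file.
# raidz, raidz2 or raidz3 can be used instead when more files are listed.
run zpool create mirrorpool mirror "$file_a" "$file_b"
run zfs set compression=lz4 mirrorpool
```

Leaving out the `mirror` keyword stripes the data across both files instead, which aggregates the bandwidth of the two drives but gives no redundancy.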

You can also do this in your instances or VMs: create one 1GB file and create a pool on it for compressed logs or compressed core dumps. If you need more space later, you can add another file to the pool. You don’t need any redundancy; just create a pool with mountpoint /var/log or /var/core and grow it as you need.
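The grow-as-you-go idea could look roughly like this. The directory /var/lib/zfs-files, the file names and the pool name logpool are hypothetical, and the commands do not migrate whatever is already in /var/log, so treat it as a sketch. As written it only prints the commands; run it with DRY_RUN=0 as root to execute them.

```shell
#!/bin/sh
# Sketch: a file-backed, compressed pool for logs that is grown later.
# Directory, file names and pool name are hypothetical examples.
DRY_RUN=${DRY_RUN:-1}   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backing_dir=/var/lib/zfs-files

# Initial pool: a single 1GB file, no redundancy.
run dd if=/dev/zero of="$backing_dir/logpool.000" bs=1M count=1024
run zpool create logpool "$backing_dir/logpool.000"
run zfs set compression=lz4 logpool
run zfs set mountpoint=/var/log logpool

# Later, when more space is needed: add a second file to the same pool.
run dd if=/dev/zero of="$backing_dir/logpool.001" bs=1M count=1024
run zpool add logpool "$backing_dir/logpool.001"
```

Keep in mind that a pool grown this way depends on every one of its files: losing one of them loses the pool.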

Logs and core dumps compress very well. For example, a 54MB core dump shrinks to around 645KB if you compress it with a tool like bzip2. ZFS lets you choose among several compression algorithms, so expect huge space savings for logs and core dumps.
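To get a feeling for how well log-like data compresses before committing to an algorithm, a rough experiment with the usual command line tools is enough. The sample below is synthetic and far more repetitive than real logs, so take the numbers it prints as an upper bound; the log line format is made up, and tools that are not installed are simply skipped.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

sample=$(mktemp)

# Generate synthetic, repetitive log lines (made-up format).
awk 'BEGIN {
    for (i = 0; i < 20000; i++)
        printf "Sep 26 14:34:%02d xeon kernel: [%d] eth0: link up, request %d served\n", i % 60, i, i * 7
}' > "$sample"

orig_bytes=$(wc -c < "$sample" | tr -d ' ')
echo "original: $orig_bytes bytes"

# Compress to stdout with each available tool and count the bytes.
for tool in gzip bzip2 xz; do
    if command -v "$tool" >/dev/null 2>&1; then
        packed_bytes=$("$tool" -c "$sample" | wc -c | tr -d ' ')
        echo "$tool: $packed_bytes bytes"
    fi
done

rm -f "$sample"
```

Inside ZFS the equivalent knob is the compression property, for example zfs set compression=gzip compressedpool, which trades more CPU time for a better ratio than lz4.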

This entry was posted in Performance, Storage and tagged Compression, LZ4, ZFS on 2018-09-26 by Carles Mateo.
