- Learn How to Generate and Verify Files with MD5 Checksum in Linux
- What is Checksum? How to Check if a File was Modified Using the cksum Command in Linux
- How to Check if a File Was Modified by Checking Modification Time
- How to Check if a File Was Modified by Checking File Size
- What is Checksum in Linux?
- How to Find the Checksum in Linux using cksum
- Syntax of cksum
- How to Use cksum
- Conclusion
- How to Verify Checksum on Linux [Beginner Guide]
- How is a Checksum generated?
- Installing GtkHash on Ubuntu
- Using GtkHash
- Verify checksums via Linux command line
- Generating and Verifying SHA256 Checksum with sha256sum
- How accurately does this work?
Learn How to Generate and Verify Files with MD5 Checksum in Linux
A checksum is a small value computed from a block of data that can be used later to detect errors introduced during storage or transmission. MD5 (Message Digest 5) sums can be used as checksums to verify files or strings on a Linux file system.
An MD5 sum is a 128-bit value, usually written as a string of 32 hexadecimal characters (numerals and letters), that results from running the MD5 algorithm against a specific file. The MD5 algorithm is a popular hash function that generates a 128-bit message digest referred to as a hash value. When you generate one for a particular file, the result is exactly the same on any machine, no matter how many times it is generated.
It is normally very difficult to find two distinct files that result in the same hash value. Therefore, you can use md5sum to check digital data integrity by determining whether a file or ISO you downloaded is a bit-for-bit copy of the remote file or ISO.
In Linux, the md5sum program computes and checks the MD5 hash values of files. It is part of the GNU Core Utilities package and therefore comes pre-installed on most, if not all, Linux distributions.
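As a rough illustration of the 128-bit digest described above, the following sketch hashes a short string and counts the hexadecimal characters in the result. The string is an arbitrary example chosen for this sketch, not something from the original walkthrough.

```shell
# Hash an arbitrary example string; printf '%s' avoids adding a trailing newline.
digest=$(printf '%s' 'hello world' | md5sum | awk '{print $1}')

# 128 bits written in hexadecimal take 32 characters (4 bits per character).
echo "digest: $digest"
echo "length: ${#digest}"
```

The awk step keeps only the first column, because md5sum also prints a file name (or "-" for standard input) after the hash.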
Take a look at the contents of /etc/group saved as groups.csv below.
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:syslog,aaronkilik
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:
fax:x:21:
voice:x:22:
cdrom:x:24:aaronkilik
floppy:x:25:
tape:x:26:
sudo:x:27:aaronkilik
audio:x:29:pulse
dip:x:30:aaronkilik
The md5sum command below generates a hash value for the file:
$ md5sum groups.csv
bc527343c7ffc103111f3a694b004e2f  groups.csv
If you alter the contents of the file by removing the first line, root:x:0:, and then run the command a second time, observe the hash value:
$ md5sum groups.csv
46798b5cfca45c46a84b7419f8b74735  groups.csv
You will notice that the hash value has changed, indicating that the contents of the file were altered.
Now, put back the first line of the file, root:x:0:, save a copy of it as groups_list.txt, and run the command below to generate the hash value of the copy:
$ md5sum groups_list.txt
bc527343c7ffc103111f3a694b004e2f  groups_list.txt
From the output above, the hash value is still the same even though the file has a different name, because it holds the original content.
Important: MD5 sums depend only on the file content, not on the file name.
The file groups_list.txt is a duplicate of groups.csv, so try generating the hash values of both files at the same time, as follows.
You will see that they have equal hash values because they have exactly the same content.
$ md5sum groups_list.txt groups.csv
bc527343c7ffc103111f3a694b004e2f  groups_list.txt
bc527343c7ffc103111f3a694b004e2f  groups.csv
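The same comparison can be scripted. The sketch below is only an illustration: it works in a throwaway directory and uses a shortened two-line stand-in for the group file, so its hash will not match the values shown above.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'root:x:0:\ndaemon:x:1:\n' > "$workdir/groups.csv"
cp "$workdir/groups.csv" "$workdir/groups_list.txt"

# Feed each file on standard input so only the hash is compared, never the name.
sum_csv=$(md5sum < "$workdir/groups.csv" | awk '{print $1}')
sum_txt=$(md5sum < "$workdir/groups_list.txt" | awk '{print $1}')

if [ "$sum_csv" = "$sum_txt" ]; then
    verdict="identical content"
else
    verdict="different content"
fi
echo "$verdict"
rm -r "$workdir"
```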
You can redirect the hash values of one or more files into a text file to store them or share them with others. For the two files above, you can issue the command below to redirect the generated hash values into a text file for later use:
$ md5sum groups_list.txt groups.csv > myfiles.md5
To check that the files have not been modified since you created the checksums, run the next command. You should see the name of each file along with “OK”.
The -c or --check option tells the md5sum command to read MD5 sums from the given file and check them.
$ md5sum -c myfiles.md5
groups_list.txt: OK
groups.csv: OK
Remember that after creating the checksum file, you cannot rename the files, or else you get a “No such file or directory” error when you try to verify the files under their new names.
$ mv groups_list.txt new.txt
$ mv groups.csv file.txt
$ md5sum -c myfiles.md5
md5sum: groups_list.txt: No such file or directory
groups_list.txt: FAILED open or read
md5sum: groups.csv: No such file or directory
groups.csv: FAILED open or read
md5sum: WARNING: 2 listed files could not be read
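In a script it is often more convenient to act on the exit status of the check than to read its output. The sketch below assumes GNU md5sum, whose --status option suppresses output and reports the result only through the exit code. The file names a.txt and sums.md5 are hypothetical.

```shell
# Throwaway directory with one hypothetical file and its checksum list.
workdir=$(mktemp -d)
printf 'first version\n' > "$workdir/a.txt"
(cd "$workdir" && md5sum a.txt > sums.md5)

# --status prints nothing; exit code 0 means every listed file matched.
if (cd "$workdir" && md5sum --status -c sums.md5); then
    before="OK"
else
    before="FAILED"
fi

# Change the file, then check again against the stored checksum.
printf 'tampered\n' >> "$workdir/a.txt"
if (cd "$workdir" && md5sum --status -c sums.md5); then
    after="OK"
else
    after="FAILED"
fi

echo "before change: $before, after change: $after"
rm -r "$workdir"
```

The subshells change into the directory first because the names stored in the checksum file are resolved relative to the current directory.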
The concept also works for strings. In the commands below, -n tells echo not to output the trailing newline:
$ echo -n "Tecmint How-Tos" | md5sum -
afc7cb02baab440a6e64de1a5b0d0f1b  -
$ echo -n "Tecmint How-To" | md5sum -
65136cb527bff5ed8615bd1959b0a248  -
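The trailing newline matters because it is part of the hashed data. As a small sketch (using printf instead of echo -n, since printf behaves the same across shells, and an arbitrary example string), the two digests below differ only because of that one extra byte:

```shell
# Same text, hashed once without and once with a trailing newline.
without_newline=$(printf '%s' 'hello world' | md5sum | awk '{print $1}')
with_newline=$(printf '%s\n' 'hello world' | md5sum | awk '{print $1}')

echo "without newline: $without_newline"
echo "with newline:    $with_newline"
```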
In this guide, I showed you how to generate hash values for files and how to create a checksum file for later verification of file integrity in Linux. Although security vulnerabilities in the MD5 algorithm have been detected, MD5 hashes still remain useful, especially if you trust the party that creates them.
Verifying files is therefore an important aspect of file handling on your systems to avoid downloading, storing or sharing corrupted files.
What is Checksum? How to Check if a File was Modified Using the cksum Command in Linux
Zaira Hira
When you are working with files on the command line, you might need to check their modification time and content integrity.
Linux has a powerful command line which allows you to explore multiple aspects of files and filesystems.
If you need to check whether a file was modified, you can follow these two approaches:
How to Check if a File Was Modified by Checking Modification Time
When a file is edited, its modification timestamp is updated.
We can view the last modified time of a file using a long listing (ls -l).
In the output below, we can see that the file was modified on Jul 19 13:22.
zaira@Zaira:~$ ls -lrt | grep calculator.py
-rw-r--r-- 1 zaira zaira 263 Jul 19 13:22 calculator.py
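When only the timestamp is needed, for example in a script, parsing ls output can be avoided. The sketch below assumes GNU stat (the -c format option is not available in every stat implementation) and uses a temporary file as a stand-in for calculator.py.

```shell
# Create a throwaway file to inspect (a stand-in for calculator.py).
f=$(mktemp)

# %y prints the modification time in readable form; %Y prints seconds since the Unix epoch.
mtime_human=$(stat -c '%y' "$f")
mtime_epoch=$(stat -c '%Y' "$f")

echo "modified: $mtime_human (epoch $mtime_epoch)"
rm "$f"
```

The epoch form is convenient for comparisons, since two timestamps can be compared as plain integers.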
How to Check if a File Was Modified by Checking File Size
If we know the previous size of the file, we can compare it with the current file size to see if it was changed.
We can view the file size using a long listing (ls -l). The 5th column shows the size of the file in bytes.
zaira@Zaira:~$ ls -lrt | grep calculator.py
-rw-r--r-- 1 zaira zaira 263 Jul 19 13:22 calculator.py
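A size comparison has a blind spot: an edit that keeps the length the same goes unnoticed. The following sketch, which uses a temporary file and made-up three-byte contents, shows the size staying the same while a checksum of the contents changes:

```shell
# Throwaway file with three bytes of made-up content.
f=$(mktemp)
printf 'abc' > "$f"
size_before=$(wc -c < "$f")
crc_before=$(cksum < "$f" | awk '{print $1}')

# Overwrite with different content of exactly the same length.
printf 'abd' > "$f"
size_after=$(wc -c < "$f")
crc_after=$(cksum < "$f" | awk '{print $1}')

echo "size:  $size_before -> $size_after"
echo "cksum: $crc_before -> $crc_after"
rm "$f"
```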
The methods mentioned above usually get the task done, but there is a more reliable way to check file integrity: computing a hash of the file's contents. The result is called a ‘checksum’, and a corresponding command in Linux is cksum.
What is Checksum in Linux?
Sometimes data gets corrupted during transmission or storage. To ensure that the data remains consistent, we can use a checksum.
A checksum is the result of running an algorithm, such as a cyclic redundancy check or a cryptographic hash function, over the blocks of data in the file.
In networking, you can use a checksum to compare the hash values at the sender and receiver ends. If the hash values are the same, your copy of the file is very likely identical to the original and error free.
Some commonly used cryptographic hash functions include MD5 and SHA-1.
Next we will see how we can calculate the hash in Linux.
How to Find the Checksum in Linux using cksum
cksum is a command found in *nix-like operating systems that generates a checksum value for a file or stream of data.
According to the man page of cksum, the command prints the CRC (cyclic redundancy check) checksum and byte count of each FILE.
To learn more about the CRC algorithm, refer to this page.
Syntax of cksum
The cksum command takes one or more filenames as arguments and generates a hash value for each. The basic syntax is as follows:
cksum [OPTION]... [FILE]...
How to Use cksum
Let’s suppose we have a file named calculator.py. We can calculate its checksum like this:
zaira@Zaira:~$ cksum calculator.py
1991291549 262 calculator.py
In the output, we get three columns:
- The first column is the hash value.
- The second column is the size of the given file in bytes.
- The third column is the file name.
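In a script, the columns can be picked apart with awk. This is a minimal sketch using a temporary file with made-up contents rather than calculator.py:

```shell
# Throwaway file containing six bytes: "hello" plus a newline.
f=$(mktemp)
printf 'hello\n' > "$f"

# Column 1 is the CRC value, column 2 is the byte count.
crc=$(cksum "$f" | awk '{print $1}')
bytes=$(cksum "$f" | awk '{print $2}')

echo "CRC: $crc, size: $bytes bytes"
rm "$f"
```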
Even a slight modification changes the hash value. Let’s see how that looks with an example.
Let’s modify our original file calculator.py by appending an empty line at the end:
zaira@Zaira:~$ echo >> calculator.py
Let’s calculate the checksum again and see if the hash value has changed:
zaira@Zaira:~$ cksum calculator.py
331872555 263 calculator.py
The first column is the hash value, and it has changed since we modified the file.
Now we know that the file has changed as the checksum hash values are no longer the same.
We can use the same method to compare files with the same name, size, and modification time across different machines to ensure that both files are the same.
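A comparison like this can be wrapped in a small helper. The sketch below is illustrative only: same_checksum is a hypothetical function name, and the three files stand in for copies that would normally live on different machines.

```shell
# Hypothetical helper: succeeds when both files have the same CRC and byte count.
same_checksum() {
    # Reading from standard input makes cksum print no file name,
    # so differently named files can be compared directly.
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

workdir=$(mktemp -d)
printf 'print(1 + 1)\n' > "$workdir/local.py"
printf 'print(1 + 1)\n' > "$workdir/remote.py"
printf 'print(1 + 2)\n' > "$workdir/edited.py"

if same_checksum "$workdir/local.py" "$workdir/remote.py"; then
    copy_result="same"
else
    copy_result="different"
fi

if same_checksum "$workdir/local.py" "$workdir/edited.py"; then
    edit_result="same"
else
    edit_result="different"
fi

echo "local vs remote: $copy_result"
echo "local vs edited: $edit_result"
rm -r "$workdir"
```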
Conclusion
There are cases when you need to compare files across systems, especially when they are transferred from one location to another. We can use a combination of the three methods to verify that our file is intact:
- Viewing the file modification time.
- Verifying the file size.
- Generating and comparing the hash value using cksum.
I hope you found this tutorial helpful. Thank you for reading till the end.
How to Verify Checksum on Linux [Beginner Guide]
A checksum is a small-sized datum derived from a block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage.
In practice, a checksum is displayed as a long string of letters and numbers. You’ll generally find them when downloading files from the web, e.g. Linux distribution images, software packages, etc.
The most common use of checksums is for checking if a downloaded file is corrupted.
For instance, the Ubuntu MATE download page includes an SHA-256 checksum for every image it makes available. So after you’ve downloaded an image, you can generate an SHA-256 checksum for it and verify that the checksum value matches the one listed on the site.
If it doesn’t, that means your downloaded image’s integrity is compromised (maybe it was corrupted during the download process). We will use an Ubuntu MATE “ubuntu-mate-16.10-desktop-amd64.iso” image file for this guide.
How is a Checksum generated?
Each checksum is generated by a checksum algorithm. Without going into the technical details, let’s just say it takes a file as input and outputs the checksum value of that file. There are various algorithms for generating checksums. The most popular checksum algorithms are:
- MD5
- SHA-1
- SHA-256
Let’s see how to verify a checksum on Linux.
Installing GtkHash on Ubuntu
To install GtkHash on your Ubuntu system, simply run the following command:
sudo apt install gtkhash
That’s it. Then select the checksum algorithms to use:
- Go to Edit > Preferences in the menu.
- Select the ones you’d like to use.
- Hit the Close button.
By default, MD5, SHA-1 and SHA-256 are selected.
Using GtkHash
Using it is quite straightforward.
- Select the file you want to check.
- Get the Checksum value from the website and put it in the Check box.
- Click the Hash button.
- This will generate the checksum values with the algorithms you selected.
- If any one of them matches with the Check box, it will show a small tick sign beside it.
Here’s an example showing GtkHash generating a checksum for the Ubuntu MATE iso image (ubuntu-mate-16.10-desktop-amd64.iso):
Verify checksums via Linux command line
Every Linux distribution comes with tools for various checksum algorithms. You can generate and verify checksums with them. The command-line checksum tools are the following:
- MD5 checksum tool is called md5sum
- SHA-1 checksum tool is called sha1sum
- SHA-256 checksum tool is called sha256sum
There are some more available, e.g. sha224sum, sha384sum, etc. All of them use similar command formats. Let’s see an example using sha256sum. We’ll use the same “ubuntu-mate-16.10-desktop-amd64.iso” image file that we used before.
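To illustrate how similar the tools are, this short sketch runs the three tools listed above on the same made-up input and reports the length of each digest in hexadecimal characters:

```shell
# Run each tool on the same input and record how long its digest is.
lengths=""
for tool in md5sum sha1sum sha256sum; do
    digest=$(printf '%s' 'abc' | "$tool" | awk '{print $1}')
    echo "$tool: ${#digest} hex characters"
    lengths="$lengths ${#digest}"
done
```

The lengths follow from the digest sizes: 128 bits for MD5, 160 for SHA-1 and 256 for SHA-256, at 4 bits per hexadecimal character.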
Generating and Verifying SHA256 Checksum with sha256sum
First go to the directory where the .iso image is stored:
Now, to generate the SHA-256 checksum, enter the following command:
sha256sum ubuntu-mate-16.10-desktop-amd64.iso
You’ll see the SHA-256 checksum in your terminal window! Easy, isn’t it?
If the generated checksum matches the one provided on the Ubuntu MATE download page, that will mean no data was changed while you downloaded the file – in other words, your downloaded file is not corrupted.
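Comparing two 64-character strings by eye is error-prone, so the comparison can be left to the shell. The sketch below is only an illustration: a small temporary file stands in for the downloaded ISO, and the "published" value is the SHA-256 of that stand-in file's contents, not a real Ubuntu MATE checksum. It also assumes GNU sha256sum for the --status option.

```shell
# Stand-in for the downloaded ISO (a hypothetical file for illustration).
file=$(mktemp)
printf '%s' 'hello world' > "$file"

# Stand-in for the checksum published on the download page:
# this is the SHA-256 of the text "hello world".
expected="b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"

# Option 1: compute the checksum and compare the strings.
actual=$(sha256sum "$file" | awk '{print $1}')
if [ "$actual" = "$expected" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "string comparison: $result"

# Option 2: let sha256sum do the check (line format: hash, two spaces, file name).
if printf '%s  %s\n' "$expected" "$file" | sha256sum --status -c -; then
    check_mode="OK"
else
    check_mode="FAILED"
fi
echo "sha256sum -c: $check_mode"
rm "$file"
```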
The other tools mentioned work similarly.
How accurately does this work?
If you’re wondering how accurately these checksums detect corrupted files: if you delete or change even one character in any one of the text files inside the ISO image, the checksum algorithm will generate a totally different value for that changed image. That value will not match the checksum provided on the download page.
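This sensitivity is easy to see on a tiny scale. The sketch below hashes two made-up three-character strings that differ in a single character:

```shell
# Two inputs that differ only in their last character.
hash_original=$(printf '%s' 'abc' | sha256sum | awk '{print $1}')
hash_changed=$(printf '%s' 'abd' | sha256sum | awk '{print $1}')

echo "abc: $hash_original"
echo "abd: $hash_changed"

if [ "$hash_original" = "$hash_changed" ]; then
    comparison="same"
else
    comparison="different"
fi
echo "checksums are $comparison"
```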
Do you checksum?
One of the suggested steps while installing Linux is to verify the checksum of your Linux ISO. Do you always follow this step or do you do it only when something goes wrong with the installation?