- How to split and join files in Linux
- What is split?
- What is cat?
- How to split and join files in Linux using split and cat
- How do I split a file into parts in Linux
- How to split files by size in Linux:
- How to split files by content in Linux using csplit:
- How to combine or join files back:
- Conclusion:
- About the author
- David Adams
- How to split a large file into parts
- How to split a file into parts
- How to merge files into one
- How to split a text file by lines
- Conclusion
How to split and join files in Linux
Splitting and joining files in Linux is a fairly simple task that lets us break a file into several smaller files. This is useful on many occasions for files that take up a lot of storage space, whether to transport them on external storage devices or to follow security policies such as keeping fragmented, distributed copies of our data. For this simple process we will use two important commands: split and cat.
What is split?
It is a command for Unix systems that divides a file into several smaller ones. It creates a series of files whose names combine a prefix with a sequential suffix, and the size of the resulting files can be set with a parameter.
To explore the scope and features of this command, we can run man split to see its detailed documentation.
What is cat?
For its part, the Linux cat command concatenates and displays files easily and efficiently. With it we can view one or more text files, and we can also concatenate the pieces of a split file.
As with split, we can view the detailed documentation of cat with the command man cat.
How to split and join files in Linux using split and cat
Once you know the basics of the split and cat commands, splitting and joining files in Linux is fairly easy. As a general example, to divide a 500 MB file called test.7z into several 100 MB files, we simply run the following command:
$ split -b 100m test.7z dividido
This command produces five 100 MB files from the original, named divididoaa, divididoab, and so on. If we add the -d parameter to the previous instruction, the suffixes of the resulting files are numeric instead: dividido00, dividido01, and so on.
$ split -d -b 100m test.7z dividido
Now, to rejoin the files that we have divided, we just have to execute the following command from the directory where the files are stored:
$ cat dividido* > testUnido.7z
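The round trip described above can be tried safely on a small stand-in file. The following is a minimal sketch, assuming GNU coreutils; the scratch directory and the 500 KiB random file standing in for test.7z are assumptions made for illustration, and the sizes are scaled down from the 500 MB example.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 500 KiB stand-in for test.7z, filled with random bytes.
head -c 512000 /dev/urandom > test.7z

# Five 100 KiB pieces with numeric suffixes: dividido00 ... dividido04.
split -d -b 100K test.7z dividido

# The shell expands dividido* in sorted order, so the pieces
# are concatenated in the order they were created.
cat dividido* > testUnido.7z

# cmp exits with status 0 only when both files are byte-for-byte identical.
cmp test.7z testUnido.7z && echo "files match"

# rm -rf "$workdir"   # clean up when finished
```

Comparing the rejoined file with the original is a cheap way to confirm that no piece was lost or joined out of order.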
With these few simple steps we can split and join files in Linux. I hope you liked it, and see you in a future article.
How do I split a file into parts in Linux
This tutorial explains how to split files into parts in Linux by size or by content, and how to control the names of the resulting pieces. After reading this article, you’ll know how to split files using both the split and csplit commands and how to combine or join the pieces back.
How to split files by size in Linux:
For the first example of this tutorial, I will use a 5GB Windows ISO image named WIN10X64.ISO. To learn the size of the file you want to split, you can use the du -h command, for example du -h WIN10X64.ISO.
In this case the file size is 5GB. To split it into 5 files of 1GB each, you can use the split command followed by the -b flag and the size you want for each piece, for example split -b 1G WIN10X64.ISO. The G suffix sets the unit to gigabytes; it can be replaced by M for megabytes or K for kilobytes, and a number without a suffix is read as bytes.
The ISO is split into 5 files named xaa, xab, xac, xad, and xae.
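The size-based split can be reproduced at a smaller scale. This is a sketch only, assuming GNU coreutils; sample.iso is a hypothetical 5 MiB stand-in for the 5GB ISO, so the pieces are 1 MiB instead of 1GB.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 5 MiB stand-in for WIN10X64.ISO.
head -c 5242880 /dev/urandom > sample.iso

# Check the size of the file before splitting it.
du -h sample.iso

# Split into 1 MiB pieces; with no prefix given, split uses "x".
split -b 1M sample.iso

# Lists the five pieces: xaa xab xac xad xae
ls -l x*

# rm -rf "$workdir"   # clean up when finished
```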
By default, the split command names the generated files as in the previous example, where xaa is the first part, xab the second part, xac the third, and so on. You can change this by passing a name of your own after the input file, for example Windows. (with a trailing dot), which leaves the default two-letter suffix as an extension.
All the files are then named Windows.*, and the extension assigned by the split command shows the order of the pieces.
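A custom prefix can be sketched as follows, again assuming GNU coreutils and a hypothetical 5 MiB sample.iso in place of the real ISO. The prefix Windows. is passed as the last argument.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 5 MiB stand-in for the ISO image.
head -c 5242880 /dev/urandom > sample.iso

# The last argument is the prefix; split appends aa, ab, ac, ... to it.
split -b 1M sample.iso Windows.

# Lists Windows.aa through Windows.ae.
ls Windows.*

# rm -rf "$workdir"   # clean up when finished
```

Ending the prefix with a dot is a design choice that makes the suffix look like a file extension, which keeps the pieces grouped and ordered in directory listings.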
The split command can also report its progress: with the --verbose option, it prints a message for each file it creates.
The verbose output lists each piece as it is written. Sizes can be given in MB units in the same way, for example when splitting an 85MB file.
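Verbose output combined with MB units might look like the sketch below. It assumes GNU coreutils (the --verbose option is a GNU extension); the 5 MiB sample.bin file, the 2 MiB piece size, and the part_ prefix are hypothetical values chosen to keep the example small.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 5 MiB input file filled with zero bytes.
head -c 5242880 /dev/zero > sample.bin

# --verbose prints one message per output file as it is created.
# 2M means 2 MiB per piece, so the result is 2 MiB + 2 MiB + 1 MiB.
split --verbose -b 2M sample.bin part_

ls -l part_*

# rm -rf "$workdir"   # clean up when finished
```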
The split command includes additional interesting features which aren’t explained in this tutorial. You can get additional information on the split command at https://man7.org/linux/man-pages/man1/split.1.html.
How to split files by content in Linux using csplit:
In some cases, users may want to split files based on their content. For such situations, the previously explained split command isn’t useful. The alternative to achieve this is the csplit command.
In this tutorial section, you’ll learn how to split a file every time a specific regular expression is found. We will use a book, and we will divide it into chapters.
The book has 4 chapters (the text was edited so that the chapter divisions are easy to see). Let’s say you want each chapter in a different file. For this, the regular expression we’ll use is “Chapter”.
I know there are 4 chapters in this book, so I specify how many times csplit should repeat the pattern, which prevents errors. A later example shows how to split without knowing the number of matches in advance. In this case, csplit cuts at the first match, and the pattern must be repeated 3 more times to reach all 4 chapters.
Run csplit followed by the file you want to split, the regular expression between slashes, and the number of repetitions between braces, for example /Chapter/ {3}.
The output is the byte count of each piece.
Five files are created, because the empty space before Chapter 1 also becomes a piece.
The files are named xx00, xx01, and so on, a scheme similar to the one used by the previously explained split command. Let’s see how they were divided.
The first file, xx00, holds only the empty space that precedes the first match of the “Chapter” regular expression, where the first cut is made.
The second piece shows only the first chapter correctly.
The third piece shows chapter 2.
The fourth piece shows chapter 3.
And the last piece shows chapter 4.
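The chapter split can be reproduced with a miniature book. This is a sketch under stated assumptions: GNU csplit, and a hypothetical book.txt made of one blank line followed by four very short chapters.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical miniature book: a blank line, then four chapters.
printf '\nChapter 1\nfirst text\nChapter 2\nsecond text\nChapter 3\nthird text\nChapter 4\nfourth text\n' > book.txt

# Cut at the first match of /Chapter/, then repeat the pattern 3 more times.
# csplit prints the byte count of every piece it writes.
csplit book.txt /Chapter/ '{3}'

# Show the first line of each piece: xx00 holds only the blank line,
# xx01 to xx04 each start with their chapter heading.
for piece in xx0*; do
    printf '%s: ' "$piece"
    head -n 1 "$piece"
done

# rm -rf "$workdir"   # clean up when finished
```

The braces are quoted so that the shell passes them to csplit unchanged.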
As explained previously, the number of repetitions was specified to prevent a wrong result. By default, if we don’t specify it, csplit cuts the file only once.
Running the previous command without the number in braces illustrates this.
Only one split is made and two files are produced, because no repeat count was given.
Also, if you specify too many splits, for example 6 when the regular expression matches only 4 times, you’ll get an error and no split will occur.
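Both situations can be checked with a small test file. The sketch below assumes GNU csplit and a hypothetical book.txt with four chapters; the -s option only silences the byte counts.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical miniature book: a blank line, then four chapters.
printf '\nChapter 1\na\nChapter 2\nb\nChapter 3\nc\nChapter 4\nd\n' > book.txt

# No repeat count: a single cut at the first match, so two pieces.
csplit -s book.txt /Chapter/
pieces=$(ls xx* | wc -l)
echo "pieces without a repeat count: $pieces"
rm xx*

# Repeat count larger than the number of matches: csplit reports an
# error, exits with a non-zero status, and removes the pieces it wrote.
if ! csplit -s book.txt /Chapter/ '{6}' 2>/dev/null; then
    echo "csplit failed"
fi
ls xx* 2>/dev/null || echo "no pieces left"

# rm -rf "$workdir"   # clean up when finished
```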
So what should you do when the content is too long and you don’t know how many times the regular expression occurs? In such a situation, use the wildcard: an asterisk between the braces, {*}.
The wildcard produces as many pieces as there are matches in the document, without you having to count them.
With the wildcard, the file is split properly.
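The wildcard form can be sketched like this, assuming GNU csplit (the {*} repeat count may not be available in other implementations) and the same hypothetical four-chapter book.txt.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical miniature book: a blank line, then four chapters.
printf '\nChapter 1\na\nChapter 2\nb\nChapter 3\nc\nChapter 4\nd\n' > book.txt

# {*} repeats the pattern as many times as it matches,
# so the number of chapters does not need to be known in advance.
csplit -s book.txt /Chapter/ '{*}'

# Lists xx00 (the blank line) plus one piece per chapter.
ls xx*

# rm -rf "$workdir"   # clean up when finished
```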
The csplit command includes additional interesting features which aren’t explained in this tutorial. You can get additional information on the csplit command at https://man7.org/linux/man-pages/man1/csplit.1.html.
How to combine or join files back:
Now you know how to split files based on size or content. The next step is to combine or join the files back, which is an easy task with the cat command.
If we read all the pieces using cat and a wildcard, the shell expands the wildcard in alphabetical order of the file names, so cat receives the pieces in sequence.
Because the pieces arrive in the right order, joining or merging them only requires redirecting this output to a file, for example cat x* > combinedfile, where combinedfile is the name of the combined file.
The result is a single merged file containing all the pieces.
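To confirm that a merge really restores the original, the checksums of both files can be compared. This is a minimal sketch; original.bin is a hypothetical 300000-byte file, and cksum is used because it is part of POSIX.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input file with random contents.
head -c 300000 /dev/urandom > original.bin

# Three pieces of 100000 bytes each, with the default x prefix.
split -b 100000 original.bin

# The shell sorts the expansion of x*, which is what keeps the order right.
echo x*

# Join the pieces; combinedfile does not match x*, so it is not read back in.
cat x* > combinedfile

# Identical checksums and sizes indicate a correct merge.
cksum < original.bin
cksum < combinedfile

# rm -rf "$workdir"   # clean up when finished
```

The output name is chosen so that it does not match the wildcard; otherwise cat would try to read the file it is writing.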
Conclusion:
As you can see, splitting files into parts in Linux is pretty easy; you only need to know which tool is right for your task. It is worthwhile for any Linux user to learn these commands and their advantages, for example when sharing files over an unstable connection or through channels that limit file size. Both tools have many additional features that weren’t explained in this tutorial, which you can read about in their man pages.
I hope this tutorial explaining how to split a file into parts in Linux was useful. Keep following this site for more Linux tips and tutorials.
About the author
David Adams
David Adams is a System Admin and writer that is focused on open source technologies, security software, and computer systems.
How to split a large file into parts
Sometimes you may need to split a large file into several smaller parts. For example, the file may be too large to write to an external drive or USB flash drive because the file system does not support files of that size.
To split a large file into several parts, you can use the split command.
After running split, we get several smaller files. Joining them reproduces the original large file. Files are joined with the cat command.
Let’s look at how to split a file into several parts and how to join the parts afterwards to get the original file back.
How to split a file into parts
Use the split command to split a file into several smaller ones:
split --bytes=1024M file.mkv file.part.
- file.mkv: the name of the original large file to be split into parts.
- file.part.: the prefix for the names of the files that the original is split into. In our case the original file is split into file.part.aa, file.part.ab, file.part.ac, and so on.
- --bytes=1024M: sets the size of the files that the original is split into. Here we split the large file into files of 1024 megabytes each. The following suffixes can be used to specify the size:
- K or k: kilobytes
- M or m: megabytes
- G or g: gigabytes
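The effect of a size suffix can be checked on a tiny file. The sketch below assumes GNU coreutils, where K stands for 1024 bytes; file.bin is a hypothetical 2500-byte file.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 2500-byte input file.
head -c 2500 /dev/urandom > file.bin

# 1K = 1024 bytes, so 2500 bytes become pieces of 1024, 1024, and 452 bytes.
split --bytes=1K file.bin file.part.

# Print the size of every piece in bytes.
wc -c file.part.*

# rm -rf "$workdir"   # clean up when finished
```

Only the last piece is smaller than the requested size, because it holds whatever remains of the input.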
How to merge files into one
After splitting a file into parts, we can join them to get the original file back. To do this, use the command:
cat file.part.* > file.mkv
- file.part.*: the file name pattern matching the pieces we are joining.
- file.mkv: the name (path) of the file into which the pieces are joined.
How to split a text file by lines
If you need to split a text file into several files by number of lines, you can use the split command with the -l option, which sets the number of lines in each of the resulting files.
split -l 1000 textfile.txt textfile.part.
We have split the original text file into files of 1000 lines each. The files are joined in the same way as described in the previous section.
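A line-based split can be verified with a generated text file. This is a minimal sketch; textfile.txt is a hypothetical 2500-line file created with seq, so the pieces hold 1000, 1000, and 500 lines.

```shell
# Scratch directory for the experiment.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical text file with 2500 numbered lines.
seq 1 2500 > textfile.txt

# At most 1000 lines per piece.
split -l 1000 textfile.txt textfile.part.

# Line counts of the pieces: 1000, 1000, and 500.
wc -l textfile.part.*

# Joining the pieces in glob order reproduces the original file.
cat textfile.part.* | cmp - textfile.txt && echo "lines restored in order"

# rm -rf "$workdir"   # clean up when finished
```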
Conclusion
We have looked at the simplest ways to split a file into several parts from the command line.
The split command is used to split files, and the cat command to join them.
For more detailed information about the split command, run man split in the terminal.
The method described above can be used on both Linux and macOS.