Split PDF document from command line in Linux?
You (should) know that Pdftk is nothing more than a very old version of iText (a Java-PDF library) compiled with GCJ and extended with some command line functionality.
The keywords in the above statement are «VERY OLD».
$ java -classpath /path/to/Multivalent20091027.jar tool.pdf.Split -page 1 input.pdf
Exception in thread "main" java.lang.NoClassDefFoundError: tool/pdf/Split
Caused by: java.lang.ClassNotFoundException: tool.pdf.Split
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: tool.pdf.Split. Program will exit.
Turns out this is a bit of a tricky piece of software: even though it is hosted on SourceForge, its site says that
“Practical Thought generously provides these tools for free use on the command line”
and also: “The browser is open source. The document tools are a free bonus and not open source.”
All releases of Multivalent linked from the official SourceForge site are missing the tools package.
(edit: there seems to be an old Multivalent version with the tools included, see the SO link; but as it looks somewhat like abandonware, I’d rather not use it)
Finally, I’d like to avoid tools that are essentially front ends for LaTeX like pdfjam.
Are there any options for such a PDF splitting command line tool under Linux?
How to merge or split PDF files on Linux
In this short article, you will learn how to merge or split PDF files using command-line and GUI-based tools. It is suitable for both beginners and experienced Linux users, so let’s get started.
1 – Using PDFTK
PDFTK is a command line tool used to manipulate PDF files. It enables users to carry out several operations on PDF files like splitting, merging, encrypting, decrypting and many more.
For a complete guide on how to install and use PDFTK to merge or split PDF documents on Linux, follow this guide.
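As a rough sketch of what merging and splitting with PDFTK looks like, the snippet below wraps its cat and burst operations in two shell functions. The function names (pdftk_merge, pdftk_burst) and the file names in the usage comments are made up for this example, and it assumes pdftk is installed and on the PATH.

```shell
# Merge several PDF files into one: pdftk_merge OUTPUT INPUT...
pdftk_merge() {
    output=$1
    shift
    pdftk "$@" cat output "$output"
}

# Split a PDF into one file per page, named page_01.pdf, page_02.pdf, ...
pdftk_burst() {
    pdftk "$1" burst output page_%02d.pdf
}

# Example usage (hypothetical file names):
#   pdftk_merge merged.pdf first.pdf second.pdf
#   pdftk_burst merged.pdf
```

Putting the output name first in pdftk_merge is only a convenience so that any number of input files can follow it.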
2 – Using QPDF tool
QPDF is a lightweight program used to carry out content-preserving structural transformations on PDF files. It can copy objects from one PDF document into another and manipulate the list of pages in a PDF file. This enables QPDF, which has few dependencies on other utilities, to split and merge PDF documents.
Developers of PDF-generating applications will find QPDF's capabilities very useful. It can also be used to create PDF documents from scratch. However, QPDF is not a PDF viewer, nor does it convert PDF files to other formats, since it ignores the semantics of the content streams of PDF files.
a – QPDF installation
To install the lightweight QPDF tool on Ubuntu / Debian, install the qpdf package with APT:
sudo apt-get install qpdf
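Before or after installing, it can help to confirm that qpdf is actually on the PATH. The helper below is a minimal sketch; the name check_qpdf is made up for this example.

```shell
# Report whether qpdf is available and, if it is, print its version.
check_qpdf() {
    if command -v qpdf >/dev/null 2>&1; then
        echo "qpdf found"
        qpdf --version
    else
        echo "qpdf not found"
        return 1
    fi
}

# Example usage:
#   check_qpdf || echo "install the qpdf package first"
```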
b – QPDF syntax
When invoking qpdf, the basic syntax is as follows:
qpdf [ options ] input_filename [ output_filename ]
This converts the PDF file input_filename into the PDF file output_filename. The output document is functionally identical to the input file, though its internal structure may have been reorganized. The options described below control the transformations applied to the file. The --empty parameter may be given in place of input_filename; it makes QPDF start from an empty document, which is useful when all the pages are to be taken from other files.
In the command below, qpdf is called with the --empty switch:
Two new pdf files are created separately by each command.
If @filename appears at any position on the command line, QPDF reads the file filename line by line and treats each line as a separate command-line argument. The special form @- reads the arguments from standard input instead. This makes it possible to call qpdf with an arbitrarily long list of arguments.
If the output_filename argument is just “-”, QPDF writes the output to standard output. To overwrite the input file with the output document, use the option --replace-input and omit the output file name.
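The following sketch shows how an argument file for the @filename mechanism could be generated, and lists the other two invocation forms as comments. The function name write_merge_args and the file name qpdf_args.txt are hypothetical; the qpdf calls are left commented out so that the snippet runs even without qpdf or the input files.

```shell
# Write qpdf arguments, one per line, to a file for use with @filename.
# Usage: write_merge_args ARGS_FILE OUTPUT INPUT...
write_merge_args() {
    args_file=$1
    output=$2
    shift 2
    printf '%s\n' --empty --pages "$@" -- "$output" > "$args_file"
}

write_merge_args qpdf_args.txt output_merged.pdf input_file1.pdf input_file2.pdf

# With qpdf installed and the input files present:
#   qpdf @qpdf_args.txt
# Write the result to standard output instead of a file:
#   qpdf input_file.pdf - > copy.pdf
# Overwrite the input file in place (no output name is given):
#   qpdf --linearize --replace-input input_file.pdf
```

Each line of qpdf_args.txt holds exactly one argument, which is the layout the @filename form expects.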
c – Merging files using QPDF
QPDF can merge and split PDF files by selecting pages from one or more input files. The input file given on the command line is considered the primary input file and is used as the starting point. Its pages are then replaced according to the page specification given on the command line:
--pages input-file [ --password=password ] [ page-range ] [ ... ] --
It is possible to specify multiple input files. Each one may be followed by a password, if the file is password-protected, and by a range of pages. The “--” indicates the end of the page selection arguments.
To merge PDF files into a single file, run the following command:
qpdf --empty --pages input_file1.pdf input_file2.pdf -- output_merged.pdf
Here the files input_file1.pdf and input_file2.pdf are merged into the PDF document output_merged.pdf. Note that the output file name comes last, after the “--” that closes the page selection.
To merge all the PDF files in the current directory into a single output file, run the command below:
qpdf --empty --pages *.pdf -- output_file.pdf
Make sure output_file.pdf does not already exist in that directory, since the *.pdf pattern would otherwise pick it up as an input.
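For repeated use, the merge invocation can be wrapped in a small function. This is only a sketch: the name merge_pdfs and its argument order (output first, then inputs) are choices made for this example, and it assumes qpdf is installed.

```shell
# Merge two or more PDF files with qpdf.
# Usage: merge_pdfs OUTPUT INPUT1 INPUT2 [INPUT...]
merge_pdfs() {
    if [ "$#" -lt 3 ]; then
        echo "usage: merge_pdfs OUTPUT INPUT1 INPUT2 [INPUT...]" >&2
        return 1
    fi
    output=$1
    shift
    # Start from an empty document, take all pages of every input,
    # and close the page selection with "--" before the output name.
    qpdf --empty --pages "$@" -- "$output"
}

# Example usage:
#   merge_pdfs output_merged.pdf input_file1.pdf input_file2.pdf
```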
d – Collation option in QPDF
When the option --collate is specified, the meaning of --pages changes: the specified input files, as modified by their page ranges, are collated instead of concatenated. For instance, if odd_file.pdf contains the odd pages of a document and even_file.pdf contains the even pages, the command:
qpdf --collate --empty --pages odd_file.pdf even_file.pdf -- all.pdf
will collate the pages. The output takes page 1 from odd_file.pdf, then page 1 from even_file.pdf, then page 2 from odd_file.pdf, then page 2 from even_file.pdf, and so forth until all pages from both files have been included. Any number of files or page ranges can be specified. If a file has fewer pages than the others, it is skipped once all its pages have been included.
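To make the collation order concrete, the sketch below simulates it in plain shell, without calling qpdf at all. The function name collate_order is made up for this example; it only prints the order described above for two files with the given page counts.

```shell
# Print the order in which pages would be taken when collating
# odd_file.pdf (first argument: page count) with even_file.pdf
# (second argument: page count).
collate_order() {
    odd_pages=$1
    even_pages=$2
    i=1
    while [ "$i" -le "$odd_pages" ] || [ "$i" -le "$even_pages" ]; do
        # A file that has run out of pages is simply skipped.
        [ "$i" -le "$odd_pages" ] && echo "odd_file.pdf page $i"
        [ "$i" -le "$even_pages" ] && echo "even_file.pdf page $i"
        i=$((i + 1))
    done
}

collate_order 3 2
```

With page counts 3 and 2 the listing alternates between the two files and ends with the third page of odd_file.pdf on its own.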
e – Specific pages selection
To pick pages 1-7 from an input file named input_file.pdf while preserving all the metadata associated with that file, run the command below:
qpdf input_file.pdf --pages input_file.pdf 1-7 -- outfile.pdf
If you want pages 1 through 5 from infile.pdf but want the rest of the metadata to be dropped, run instead:
qpdf --empty --pages infile.pdf 1-5 -- outfile.pdf
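Page selection is also how a file can be split with QPDF. The sketch below builds on the --empty form; the function names extract_pages and split_at and the _part1/_part2 output names are assumptions made for this example, and qpdf is assumed to be installed.

```shell
# Copy a page range from INPUT into OUTPUT.
# Usage: extract_pages INPUT RANGE OUTPUT
extract_pages() {
    qpdf --empty --pages "$1" "$2" -- "$3"
}

# Split INPUT into two files at page N: pages 1..N and N+1..last.
# In a qpdf page range, "z" stands for the last page.
split_at() {
    input=$1
    n=$2
    base=${input%.pdf}
    extract_pages "$input" "1-$n" "${base}_part1.pdf" &&
    extract_pages "$input" "$((n + 1))-z" "${base}_part2.pdf"
}

# Example usage:
#   extract_pages infile.pdf 1-5 outfile.pdf
#   split_at infile.pdf 5
```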
3 – Using PDFUNITE
PDFUNITE is a utility that is part of the package poppler-utils, which means that you will get PDFUNITE when you install the package poppler-utils. After the installation is completed, you can immediately start merging your PDF files.
a – Syntax of PDFUNITE
PDFUNITE has a pretty simple syntax:
pdfunite [options] Inputfile1.pdf Inputfile2.pdf ... MergedFile.pdf
Here Inputfile1.pdf, Inputfile2.pdf, ... are the source files, and the name of the merged file, MergedFile.pdf, goes at the end of the command line.
b – Merging files using PDFUNITE
To merge PDF files into a single PDF document, use the following command:
pdfunite InputFile1.pdf InputFile2.pdf InputFile3.pdf merged_File.pdf
File names without a path are looked up in the directory where PDFUNITE is executed. If your PDF files are in different folders, provide a relative or absolute path for each of them.
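Because the merged file must come last on the pdfunite command line, a small wrapper can make the argument order harder to get wrong. This is a sketch only; the name unite_pdfs and the paths in the usage comment are hypothetical, and pdfunite is assumed to be installed.

```shell
# Merge two or more PDF files with pdfunite.
# Usage: unite_pdfs OUTPUT INPUT1 INPUT2 [INPUT...]
unite_pdfs() {
    if [ "$#" -lt 3 ]; then
        echo "usage: unite_pdfs OUTPUT INPUT1 INPUT2 [INPUT...]" >&2
        return 1
    fi
    output=$1
    shift
    # pdfunite expects the inputs first and the merged file last.
    pdfunite "$@" "$output"
}

# Example usage, mixing files from different folders:
#   unite_pdfs merged_File.pdf InputFile1.pdf /path/to/other/InputFile2.pdf
```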
4 – Using PDFSEPARATE
Like PDFUNITE, PDFSEPARATE is part of the poppler-utils package.
a – Syntax of PDFSEPARATE
The utility PDFSEPARATE has the following syntax:
pdfseparate [options] InputFile.pdf OutputFile_Pattern
PDFSEPARATE reads the input file InputFile.pdf and breaks it up into one or more PDF files named after OutputFile_Pattern, each of which contains one page.
The OutputFile_Pattern should contain the placeholder %d, which is replaced by the page number of each extracted page. The input file should not be password protected.
b – Options of PDFSEPARATE command tool
There are mainly two options in the PDFSEPARATE utility :
-f number : Indicates the first page to be extracted. If omitted, the extraction will start with the first page or page 1.
-l number : Indicates the last page to be extracted. Extraction ends with the last page if omitted.
c – Splitting a file using PDFSEPARATE
The command:
pdfseparate InputFile.pdf InputFile-%d.pdf
tells PDFSEPARATE to extract every page of InputFile.pdf into its own file. For example, if InputFile.pdf has 4 pages, there will be 4 files:
InputFile-1.pdf, InputFile-2.pdf, InputFile-3.pdf and InputFile-4.pdf
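The -f and -l options described earlier can be combined to extract only part of a document. The wrapper below is a sketch: the name separate_range is made up, and deriving the output pattern from the input name is a choice made for this example. It assumes pdfseparate is installed.

```shell
# Extract pages FIRST..LAST of INPUT into one file per page.
# Usage: separate_range FIRST LAST INPUT
separate_range() {
    first=$1
    last=$2
    input=$3
    # Derive the output pattern from the input name,
    # e.g. InputFile.pdf gives InputFile-%d.pdf.
    pattern="${input%.pdf}-%d.pdf"
    pdfseparate -f "$first" -l "$last" "$input" "$pattern"
}

# Example usage: pages 2 and 3 end up in InputFile-2.pdf and InputFile-3.pdf
#   separate_range 2 3 InputFile.pdf
```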
5 – Using PDFSAM
Merging and splitting PDF files is not limited to command-line tools; GUI-based utilities can do the job as well. One of these applications is PDFsam. It can also perform many other operations, such as rotating and extracting pages or splitting by bookmarks. PDFsam is a Java-based tool which is available in most Linux distros. Its GUI is intuitive, simple and self-explanatory. On Ubuntu / Debian, you can run the APT command below to install PDFsam:
sudo apt-get install pdfsam
Once the installation has finished, launch PDFsam, either from the application menu or from a terminal (the launcher command is typically pdfsam).
A splash screen indicates that the application is starting up, after which the PDFsam graphical interface appears.
Click the ‘Merge’ button to open the merge window.
Click the ‘Add’ button to select the input PDF files you want to merge. Next, scroll down to the ‘Destination file’ section and click the ‘Browse’ button.
You will be able to select a location and a file name for the merged PDF file. Finally, click the ‘Run’ button and you are done!
To split a document, click the ‘Split’ button in the main interface.
The principle is the same as in the previous section, but in this case you need to choose the file to be broken up using the file browser. Next, select the ‘Split settings’ and then the output directory where the split files will be generated. Finally, click the ‘Run’ button.
6 – Conclusion
You have seen how to merge and split PDF documents using command-line tools like PDFTK, QPDF, PDFUNITE and PDFSEPARATE. Those who do not feel at ease with the command line can choose PDFSAM, which is a GUI-based utility.
Amin Nahdy
Amin Nahdy is an aspiring software engineer and a computer geek by nature, as well as an avid Ubuntu and open source user. He is interested in information technology, especially the Linux-based ecosystem, as well as Windows and macOS. He loves to share and disseminate knowledge to others in a transparent and responsible way.