- How To Split or Extract Particular Pages From A PDF File
- Split Or Extract particular pages from A PDF file using PDFtk
- Install PDFtk on Linux
- Usage
- Extract particular pages from PDF file using default PDF reader application
- Help us to help you:
- How to extract pages from a PDF in Linux
- How to extract PDF pages in Linux via GUI:
- Step 1:
- Step 2:
- Step 3:
- Step 4:
- Step 5:
- Step 6:
- How to extract PDF pages in Linux via terminal:
- Conclusion:
- About the author
- Sam U
- Split PDF document from command line in Linux?
How To Split or Extract Particular Pages From A PDF File
Let us say you have a PDF file with 100 pages and you want to split or extract particular pages from that file. How can you do that? It’s absolutely easy. You don’t need any premium PDF editing applications. In this tutorial, I will show you a simple way to split or extract particular pages from a PDF file on Linux. Even though there are many methods to accomplish this task, I find the following methods the easiest way to extract a page range or a part of a PDF file in Linux. Just follow the simple steps described below to get this job done in a couple of minutes.
Split Or Extract particular pages from A PDF file using PDFtk
I will explain both the command-line and the GUI way. If you use a system that has only CLI mode, follow the steps in this section.
PDFtk is a free tool that can be used to split or merge PDF files. You can use it in both CLI and GUI mode. It is available in free and paid editions.
Install PDFtk on Linux
PDFtk is available in the official repositories of some Linux distributions.
On Arch Linux, PDFtk is available in the [community] repository. To install PDFtk on Arch Linux and its variants, run:
$ sudo pacman -S pdftk
On Debian, Ubuntu and their derivatives, install it with apt. Note: You need to enable the [universe] repository in Ubuntu to install pdftk.
$ sudo add-apt-repository universe
$ sudo apt install pdftk
On openSUSE, run:
$ sudo zypper install pdftk
PDFtk is also available as a snap package. Make sure your system has snap installed and run the following command to install PDFtk.
$ sudo snap install pdftk
Usage
Once you have installed PDFtk, open your Terminal and extract a range of pages from a PDF file as shown below.
$ pdftk source.pdf cat 5-10 output output_p5-10.pdf
Here, source.pdf is my original PDF file. We extract pages from 5 to 10. Finally we save the output in output_p5-10.pdf file. Very simple, isn’t it? Of course, it is.
If you want to extract specific pages from the source file, for example 5, 6, and 10, just run:
$ pdftk source.pdf cat 5 6 10 output output.pdf
The above command will extract pages 5, 6 and 10 from the source.pdf file and save them as output.pdf.
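PDFtk's cat operation also accepts a mix of page ranges and single pages in one call. As a minimal sketch, here is a small wrapper around the commands above. The function name extract_pages, the DRY_RUN variable and the argument order are my own assumptions for illustration, not part of PDFtk.

```shell
#!/bin/sh
# Hypothetical helper: extract_pages SOURCE OUTPUT PAGES...
# PAGES are pdftk page arguments such as 5-10 or 12.
# With DRY_RUN=1 the pdftk command is only printed, not executed.
extract_pages() {
    if [ "$#" -lt 3 ]; then
        echo "usage: extract_pages SOURCE OUTPUT PAGES..." >&2
        return 1
    fi
    src=$1
    out=$2
    shift 2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be run.
        echo "pdftk $src cat $* output $out"
    else
        pdftk "$src" cat "$@" output "$out"
    fi
}
```

For example, with DRY_RUN set to 1, calling extract_pages source.pdf part.pdf 5-10 12 prints "pdftk source.pdf cat 5-10 12 output part.pdf". Printing the command first is handy when you want to check the page list before writing any file.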
Extract particular pages from PDF file using default PDF reader application
This is another absolutely easy and handy trick to extract pages from a PDF file using the default PDF viewer application. Most desktop Linux distributions come with a PDF reader application pre-installed. We can use it to extract a particular set of pages from a PDF document.
Here is how I did it on my Arch Linux desktop.
Open the source PDF file using any PDF reader. For the purpose of this tutorial, I use Atril Document Viewer.
Go to File -> Print.
Select Print to file as the printer, enter the output filename, select PDF as the format, and enter the page range (here, 30-40). Finally, click Print.
The selected pages will be extracted from the PDF file. That’s it.
As you can see, both methods are simple, straight-forward and easy to follow.
Thanks for stopping by!
Have a Good day!!
How to extract pages from a PDF in Linux
If you are a keen book reader, you know how difficult it is to carry more than a couple of books. That’s no longer the case, thanks to ebooks, which save a lot of space in your home and in your bag as well. Carrying hundreds of books with you is no longer a dream.
Ebooks come in different formats, but the most common one is PDF. Most ebook PDFs have hundreds of pages, and just as with real books, navigating these pages is quite easy with the help of a PDF reader.
Suppose you are reading a PDF file and want to extract some specific pages from it and save them as a separate file; how would you do that? Well, it is a cinch! No need to get premium applications and tools to accomplish it.
This guide focuses on extracting a specific part from any PDF file and saving it under a different name in Linux. Though there are multiple ways to do this, I will be focusing on the less cluttered approaches. So, let’s begin:
There are two main approaches: via the GUI and via the terminal.
You can follow any method according to your convenience.
How to extract PDF pages in Linux via GUI:
This method is more like a trick for extracting pages from a PDF file. Most Linux distributions come with a PDF reader. So, let’s learn a step-by-step process of extracting pages using the default PDF reader of Ubuntu:
Step 1:
Simply open your PDF file in the PDF reader, then click on the menu button.
Step 2:
A menu will appear. Click on the “Print” button, and a window with print options will open. You can also press “Ctrl+P” to open this window quickly.
Step 3:
To extract pages to a separate file, click on the “File” option. A window will open, where you give the file name and select a location to save it.
I am selecting “Documents” as the destination location.
Step 4:
There are three output formats: PDF, SVG, and PostScript. Select PDF.
Step 5:
In the “Range” section, check the “Pages” option and set the range of page numbers you want to extract. I am extracting the first five pages, so I type “1-5”.
You can also extract individual pages from the PDF file by typing the page numbers separated by commas. Here, I am extracting pages 10 and 11 along with the range for the first five pages, so I type “1-5,10,11”.
Note that the page numbers I am typing are according to the PDF reader, not the book. Ensure that you enter the page numbers that the PDF reader indicates.
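When the book's own page numbers start after some front matter, the range to type is the book's range shifted by a fixed offset. As a rough sketch of that arithmetic, assuming you know how many PDF pages come before the book's page 1 (the function name book_to_pdf_range and its arguments are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: book_to_pdf_range OFFSET FIRST LAST
# OFFSET is the number of PDF pages that come before the book's page 1.
# FIRST and LAST are page numbers as printed in the book.
book_to_pdf_range() {
    offset=$1
    first=$2
    last=$3
    # Shift both ends of the range by the same offset.
    echo "$((first + offset))-$((last + offset))"
}
```

For example, if the book has 12 pages of front matter, book_to_pdf_range 12 20 25 prints "32-37", which is the range to enter in the print dialog for the book's pages 20 to 25.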
Step 6:
Once all the settings are done, click on the “Print” button. The file will be saved in the specified location.
How to extract PDF pages in Linux via terminal:
Many Linux users prefer to work in the terminal, but can you extract PDF pages from the terminal? Absolutely! All you need is to install a tool called PDFtk. To get PDFtk on Debian and Ubuntu, use the command given below:
$ sudo apt install pdftk
PDFtk can also be installed through snap:
$ sudo snap install pdftk
Now, use the following syntax to extract pages from a PDF file with PDFtk:
$ pdftk [sample.pdf] cat [page_numbers] output [output_file_name.pdf]
- [sample.pdf] – Replace it with the file name from where you want to extract pages.
- [page_numbers] – Replace it with the range of page numbers, for example, “3-8”.
- [output_file_name.pdf] – Type the name of the output file of extracted pages.
Let’s understand it with an example:
$ pdftk adv_bash_scripting.pdf cat 3-8 output extracted_adv_bash_scripting.pdf
In the above command, I am extracting 6 pages (3 to 8) from the file “adv_bash_scripting.pdf” and saving the extracted pages as “extracted_adv_bash_scripting.pdf”. The extracted file will be saved in the same directory.
If you need to extract specific pages, type the page numbers and separate them with a space:
$ pdftk adv_bash_scripting.pdf cat 5 9 11 output extracted_adv_bash_scripting_2.pdf
In the above command, I am extracting pages 5, 9, and 11 and saving them as “extracted_adv_bash_scripting_2.pdf”.
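The same range syntax can be used to cut a whole PDF into equal parts. As a minimal sketch, the helper below only computes the page ranges; the function name chunk_ranges and its arguments are my own assumptions for illustration, not part of PDFtk.

```shell
#!/bin/sh
# Hypothetical helper: chunk_ranges TOTAL_PAGES CHUNK_SIZE
# Prints one pdftk page range per line, e.g. 1-10, 11-20, ...
chunk_ranges() {
    total=$1
    size=$2
    # Refuse a chunk size below 1, which would never advance.
    if [ "$size" -lt 1 ]; then
        echo "chunk size must be at least 1" >&2
        return 1
    fi
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + size - 1))
        # The last chunk may be shorter than CHUNK_SIZE.
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        echo "$start-$end"
        start=$((end + 1))
    done
}
```

For a 25-page file split into parts of 10 pages, chunk_ranges 25 10 prints 1-10, 11-20 and 21-25, one per line. Each range can then be passed to pdftk in a loop, for example:
$ for r in $(chunk_ranges 25 10); do pdftk adv_bash_scripting.pdf cat "$r" output "part_$r.pdf"; done
The total page count has to be supplied by you; PDFtk's dump_data operation reports it on a NumberOfPages line.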
Conclusion:
You may occasionally need to extract some specific portion of a PDF file for several purposes. There are many ways to do it. Some are complex, and some are obsolete. This write-up is about how to extract pages from a PDF file in Linux through two simple methods.
The first method is a trick to extract a certain part of a PDF through Ubuntu’s default PDF reader. The second method is via terminal since many geeks prefer it. I used a tool called PDFtk to extract pages from a pdf file through the use of commands. Both methods are simple; you can choose any according to your convenience.
About the author
Sam U
I am a professional graphics designer with over 6 years of experience. Currently doing research in virtual reality, augmented reality and mixed reality.
I hardly watch movies but love to read tech related books and articles.
Split PDF document from command line in Linux?
You (should) know that Pdftk is nothing more than a very old version of iText (a Java-PDF library) compiled with GCJ and extended with some command line functionality.
The keywords in the above statement are «VERY OLD».
$ java -classpath /path/to/Multivalent20091027.jar tool.pdf.Split -page 1 input.pdf
Exception in thread "main" java.lang.NoClassDefFoundError: tool/pdf/Split
Caused by: java.lang.ClassNotFoundException: tool.pdf.Split
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: tool.pdf.Split. Program will exit.
Turns out, this is a bit of a tricky piece of software: even though it’s on SourceForge, and its site says that “Practical Thought generously provides these tools for free use on the command line”, it also says: “The browser is open source. The document tools are a free bonus and not open source.”
All releases of Multivalent linked from the official sourceforge site are missing the tools package.
(edit: there seems to be an old Multivalent version with the tools included, see the SO link; but as it looks somewhat like abandonware, I’d rather not use it)
Finally, I’d like to avoid tools that are essentially front ends for LaTeX like pdfjam.
Are there any options for such a PDF splitting command line tool under Linux?