Remove only 1st page from a LOT of pdf files
That’s all I have to do: remove only 1st page from a LOT of pdf files. Please tell me that magic exists.
4 Answers 4
You can do this with a free program called pdftk, available here.
You can use the following commands to take every PDF in the current directory and copy them to the ‘trimmed’ directory with the first page removed:
mkdir trimmed for i in *pdf ; do pdftk "$i" cat 2-end output "trimmed/$i" ; done
This looks like a job for PdfToolKit. This a command line utility to manipulate pdfs
First, install PDFToolkil, either from the Software Centre or using the command line:
sudo apt-get install pdftk
Now the command to remove the first page from a normal (non-protected pdf) would be:
pdftk original.pdf cat 2-end output outputname.pdf
If the pdf is protected you will need to give the passwords to pdftk.
To convert a large number of pdf’s you will need to write a small script that takes care of running pdftk for each one.
You can use pdf-stapler for this task.
for i in *.pdf; do pdf-stapler del "$i" 1 t.pdf && mv t.pdf "$i"; done
I wrote this command line
tree -fai . | grep -P ".pdf$" | xargs -L1 -I <> pdftk <> cat 2-end output <>.truncated.pdf
Does the job, but of course if the file has more than one page, I tested it, it also works with as many levels of folders you have. Just make sure that you run it a the root of the folder structure. Every folder will have for every pdf file an aditional pdf ending with .truncated.pdf
You need pdftk and tree for this and on Ubuntu Linux you can install it with apt:
sudo apt install pdftk tree
How to remove pages from a PDF from your Linux PC
In my opinion, saving the documents in PDF is the best option. A PDF does not take up much space and it is also capable of saving a file with the highest quality, a quality that is maintained even when we enlarge the images. But sometimes a PDF has not been created by us or we have created it, for example, from a web page. What do we do with the information we have left over? Remove pages from a PDF Is the best option.
Removing pages from a PDF is very easy. The problem is that many of us have the preconceived idea that a PDF file cannot be edited; by extension, if it cannot be edited, pages cannot be removed. But this is an idea that we have internalized because this was the case a long time ago. Nowadays, PDFs can be edited and, so far this article is about, deleting pages of information that we do not want to keep.
How to remove pages from a PDF with LibreOffice Draw
LibreOffice Draw It is a very powerful program that surprises both for the possibilities it offers and for its ease of use. Removing pages from a PDF with LibreOffice Draw is so simple that it’s hard to figure out how to do it. I explain it to you below:
- To remove pages from a PDF with LibreOffice Draw we must first open the document with this program. We can do it either from the «File / Open» menu or by right clicking on the document and choosing LibreOffice Draw as the application to open it.
- Once inside you will see something like the following:
- The next step is to mark the page that we want to delete, more specifically its thumbnail in the left panel.
- If we keep the wrong idea that a PDF cannot be deleted, we will never think how easy it is to do it: the «miracle» will happen with just pressing the «Delete» key. You will see that the page disappears.
- Finally, we go to «File / Export to PDF» to save the new document without extra pages. Make no mistake choosing «Save» because LibreOffice it has its own formats and by default it saves it in its own way. You have to «Export».
What did you not imagine that it would be so easy?
With pdftk
As in Linux we have so many options and one of them likes some more than others, we can also do it from the terminal. This requires the pdftk (PDF Tookit) tool. It actually costs more than doing it with LibreOffice Draw, but we will also provide you with information to do something that is easier with pdftk: separate a PDF by pages. To remove pages from a PDF with this tool we will have to do the following:
- We install the tool with the command sudo snap install pdftk o sudo apt install pdf-java.
- I have a PDF that I have created from Firefox called mozilla.pdf. It has 5 pages and I am going to take away the third one. To do this I will open a terminal and write:
pdftk mozilla.pdf cat 1-2 4-end output documento.pdf
- mozilla.pdf is the document I want to edit.
- cat is the order.
- 1-2 y 4-end they are the pages that it will keep or, what is the same, it will eliminate the third because it will keep 1, 2 and 4 until the end (end).
- output tells you that the next will be the new document.
- documento.pdf is the document you will create without page 3.
- Remember that in this and other commands, when the files are being mentioned, it is understood that the full path goes before, such as /home/pablinux/Desktop/mozilla.pdf.
- If after pressing Enter it does not show anything, it is assumed that it is because everything has gone well. We will only see errors if we have forgotten part of the command or if something has failed.
And separate a PDF by single pages?
As we mentioned earlier, pdftk it also allows us separate the whole PDF by pages one by one. Right now I can’t think of any reason why this would be useful, but I’m commenting on it as information in case it works for anyone. The command would be the following, taking into account that «mozilla.pdf» is the document that I want to divide by pages:
As with the previous command, if everything went well it will not display a message after pressing Enter. The only important thing here is to know what it does with the file once it is split: save it in our personal folder (with names pg_0001, pg_0002, pg_003, etc, where «pg» matches the page number) and creates a file with metadata called doc_data.txt in the same path. Among the information stored in this .txt we have the number of pages that were in the original, if we had used a marker, the date of creation and even the program with which it was created and the version of it.
Personally and as I always say, I usually choose the options that allow me to perform all my tasks from one user interface or GUI. But sometimes, using a command line, especially if we are quick to write or create a .dekstop / script, may be a better option, and as an example it is worth separating the pages of a PDF with pdftk. What do you prefer: do it with LibreOffice or a similar program or with tools that are used from the Terminal such as pdftk?
The content of the article adheres to our principles of editorial ethics. To report an error click here.
Full path to article: Linux Addicts » General » How to remove pages from a PDF from your Linux PC
Remove the last page of a pdf file using PDFtk?
You can reference page numbers in reverse order by prefixing them with the letter r. For example, page r1 is the last page of the document, r2 is the next-to-last page of the document, and rend is the first page of the document. You can use this prefix in ranges, too, for example r3-r1 is the last three pages of a PDF.
If you want to remove more than one page, you can change the range, for example 1-r3 does all but the last two pages.
You need to find out the page count, then use this with the pdftk cat function, since (AFAICT) pdftk does not allow one to specify an «offset from last».
A tool like ‘pdfinfo’ from Poppler (http://poppler.freedesktop.org/) can provide this.
Wrapping this in a bit of bash scripting can easily automate this process:
page_count=`pdfinfo "$INFILE" | grep 'Pages:' | awk ''` page_count=$(( $page_count - 1 )) pdftk A="$INFILE" cat A1-$page_count output "$OUTFILE"
Obviously adding parameters, error checking, and what-not also could be placed in said script:
#! /bin/sh ### Path to the PDF Toolkit executable 'pdftk' pdftk='/usr/bin/pdftk' pdfinfo='/usr/bin/pdfinfo' #################################################################### script=`basename "$0"` ### Script help if [ "$1" = "" ] || [ "$1" = "-h" ] || [ "$1" = "--help" ] || [ "$1" = "-?" ] || [ "$1" = "/?" ]; then echo "$script: []" echo " Removes the last page from the PDF, overwriting the source" echo " if no output filename is given" exit 1 fi ### Check we have pdftk available if [ ! -x "$pdftk" ] || [ ! -x "$pdfinfo" ]; then echo "$script: The PDF Toolkit and/or Poppler doesn't seem to be installed" echo " (was looking for the [$pdftk] and [$pdfinfo] executables)" exit 2 fi ### Check our input is OK INFILE="$1" if [ ! -r "$INFILE" ]; then echo "$script: Failed to read [$INFILE]" exit 2 fi OUTFILE="$2" if [ "$OUTFILE" = "" ]; then echo "$script: Will overwrite [$INFILE] if processing is ok" fi timestamp=`date +"%Y%m%d-%H%M%S"` tmpfile="/tmp/$script.$timestamp" page_count=`$pdfinfo "$INFILE" | grep 'Pages:' | awk ''` page_count=$(( $page_count - 1 )) ### Do the deed! $pdftk A="$INFILE" cat A1-$page_count output "$tmpfile" ### Was it good for you? if [ $? -eq 0 ]; then echo "$script: PDF Toolkit says all is good" if [ "$OUTFILE" = "" ]; then echo "$script: Overwriting [$INFILE]" cp -f "$tmpfile" "$INFILE" else echo "$script: Creating [$OUTFILE]" cp -f "$tmpfile" "$OUTFILE" fi fi ### Clean Up if [ -f "$tmpfile" ]; then rm -f "$tmpfile" fi