Convert HTML to PDF on Linux

How to convert an HTML file to PDF (with colors)

How can I "export" this local file on my Ubuntu 12.04 to a PDF? The look and colors should stay the same. I tried, for example, Ctrl + P -> Print to PDF, but it didn’t preserve the colors. I also tried htmldoc with the --color option, but it’s the same problem. It would be great to do this via the command line.

@Harley That option is not available on all browsers and all platforms. I’m on Ubuntu 18 and that option doesn’t show up in Chrome or Firefox.

7 Answers

Open your HTML file in LibreOffice Writer and then, under File in the menu, choose Export as PDF. That’s it.

WebKit HTML to PDF: wkhtmltopdf

The software can be installed using:

sudo apt-get install wkhtmltopdf

Here is a nixCraft tutorial (updated in 2017).

The latest version is headless (it does not require an X server).
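For the local file in the question, the conversion itself might look like the sketch below. It assumes wkhtmltopdf is already installed; the file names colors-demo.html and colors-demo.pdf are placeholders, not taken from the question. Printing backgrounds is the default behaviour, so --background is spelled out only to make the intent explicit.

```shell
# Create a tiny test page with a coloured background (hypothetical file name).
cat > colors-demo.html <<'EOF'
<!DOCTYPE html>
<html>
  <body style="background: #ffd54f; color: #1a237e;">
    <h1>Colour test</h1>
  </body>
</html>
EOF

# Convert it only if wkhtmltopdf is on the PATH.
if command -v wkhtmltopdf >/dev/null 2>&1; then
  wkhtmltopdf --background colors-demo.html colors-demo.pdf
else
  echo "wkhtmltopdf is not installed" >&2
fi
```

If the colours still go missing, the page’s print stylesheet may be what changes them, not the converter.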

Another possibility: phantomjs is a headless web browser, also based on WebKit. It can export a page as PDF, among other things.

WeasyPrint seems promising. I tried wkhtmltopdf and although it renders things in an acceptable way, it doesn’t render everything properly and it creates pdfs that take many seconds to open!

weasyprint mypage.html out.pdf 

As an extra it might be helpful to alter the CSS if you want to get the browser view and PDF to look identical.

/* For converting to PDF */
body {
  width: 210mm; /* A4 dimension */
}
@page
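One way to apply such print-only CSS without touching the page is to pass it to WeasyPrint as an extra stylesheet with -s. The sketch below is an illustration under assumptions: the file name print.css and the @page values (A4, zero margin) are placeholders, not the original answer’s rule.

```shell
# Write a small print stylesheet (print.css and the @page values are assumptions).
cat > print.css <<'EOF'
/* For converting to PDF */
@page { size: A4; margin: 0; }
body { width: 210mm; } /* A4 width */
EOF

# Apply it on top of the page's own styles, if weasyprint is available.
if command -v weasyprint >/dev/null 2>&1; then
  weasyprint -s print.css mypage.html out.pdf
else
  echo "weasyprint is not installed" >&2
fi
```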

I have just tried to install weasypeasey but it didn’t work. Maybe you could help? 🙂 First I did this: apt-get install python-dev python-pip python-lxml libcairo2 libpango1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info and then ran pip install weasypeasy, but I get this error: Could not find any downloads that satisfy the requirement weasypeasy

weasyprint is good but about 15 times slower than wkhtmltopdf as I recall, so it was not suitable for us to generate reports on demand for our clients. wkhtmltopdf can be persuaded to do a good job even for complex reports, with some considerable effort!

Source

Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux

Wkhtmltopdf is a simple and effective open source command-line utility that converts any given HTML (web page) to a PDF document or an image (jpg, png, etc).

Wkhtmltopdf is written in C++ and distributed under the GNU Lesser General Public License (LGPL). It uses the WebKit rendering engine to convert HTML pages to PDF documents without losing the quality of the pages. It is a useful and reliable solution for creating and storing snapshots of web pages.


Wkhtmltopdf Features

  1. Open source and cross-platform.
  2. Converts any HTML web page to a PDF file using the WebKit engine.
  3. Options to add headers and footers.
  4. Table of Contents (TOC) generation option.
  5. Provides batch mode conversions.
  6. Support for PHP or Python via bindings to libwkhtmltox.
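As a rough illustration of the footer and batch features, the sketch below converts every .html file in a directory and adds a page-number footer. The helper names pdf_name_for and convert_all are hypothetical; --footer-center with the [page] and [topage] placeholders is a wkhtmltopdf option, though builds without the patched Qt may not honour it.

```shell
# Map an input HTML path to its PDF output path (report.html -> report.pdf).
pdf_name_for() {
  printf '%s\n' "${1%.html}.pdf"
}

# Convert every .html file in a directory (default: current directory),
# adding a centred "page / total pages" footer to each PDF.
convert_all() {
  dir=${1:-.}
  for src in "$dir"/*.html; do
    [ -e "$src" ] || continue   # no matches: skip the unexpanded pattern
    wkhtmltopdf --footer-center '[page] / [topage]' "$src" "$(pdf_name_for "$src")"
  done
}

# Usage: convert_all ./reports
```

A plain loop starts one wkhtmltopdf process per file, which is simple but not the fastest option for large batches.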

In this article we will show you how to install the Wkhtmltopdf program on Linux systems using the precompiled tarball files.

Install Evince (PDF Viewer)

Let’s install evince (a PDF reader) for viewing PDF files on Linux systems.

$ sudo yum install evince      [RHEL/CentOS and Fedora]
$ sudo dnf install evince      [On Fedora 22+ versions]
$ sudo apt-get install evince  [On Debian/Ubuntu systems]

Download Wkhtmltopdf Source File

Download the wkhtmltopdf archive for your Linux architecture using the wget command, or download the latest version (the current stable series is 0.12.4) from the wkhtmltopdf download page.

On 64-bit Linux OS
$ wget https://github.com/wkhtmltopdf/wkhtmltopdf/releases/download/0.12.4/wkhtmltox-0.12.4_linux-generic-amd64.tar.xz
On 32-bit Linux OS
$ wget https://github.com/wkhtmltopdf/wkhtmltopdf/releases/download/0.12.4/wkhtmltox-0.12.4_linux-generic-i386.tar.xz
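If you are not sure which of the two archives applies, the choice can be derived from uname -m. The sketch below only prints the matching wget command and does not download anything; tarball_for_arch is a hypothetical helper name, and the file names are the two 0.12.4 archives listed above.

```shell
# Pick the tarball name for a machine type as printed by `uname -m`.
tarball_for_arch() {
  case "$1" in
    x86_64|amd64)        echo "wkhtmltox-0.12.4_linux-generic-amd64.tar.xz" ;;
    i386|i486|i586|i686) echo "wkhtmltox-0.12.4_linux-generic-i386.tar.xz" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

base_url="https://github.com/wkhtmltopdf/wkhtmltopdf/releases/download/0.12.4"

# Print (do not run) the download command for this machine.
if tarball=$(tarball_for_arch "$(uname -m)"); then
  echo "wget $base_url/$tarball"
fi
```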

Install Wkhtmltopdf in Linux

Extract the files to the current working directory using the following tar command. Note that the archives are xz-compressed, so the gzip flag (-z) must not be used.

------ On 64-bit Linux OS ------
$ sudo tar -xvf wkhtmltox-0.12.4_linux-generic-amd64.tar.xz
------ On 32-bit Linux OS ------
$ sudo tar -xvf wkhtmltox-0.12.4_linux-generic-i386.tar.xz

Install wkhtmltopdf under the /usr/bin directory so that the program can be run from any path.

$ sudo cp wkhtmltox/bin/wkhtmltopdf /usr/bin/

How to Use Wkhtmltopdf?

Here we will see how to convert remote HTML pages to PDF files, verify their information, and view the created files using the evince program from the GNOME desktop.

Convert Website HTML Page to PDF File

To convert any website HTML web page to PDF, run the following example command. It will convert the given webpage to 10-Sudo-Configurations.pdf in current working directory.

$ wkhtmltopdf https://www.tecmint.com/sudoers-configurations-for-setting-sudo-in-linux/ 10-Sudo-Configurations.pdf
Sample Output :
Loading pages (1/6)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done

View Generated PDF File

To verify that the file is created, use the following command.

$ file 10-Sudo-Configurations.pdf
Sample Output :
10-Sudo-Configurations.pdf: PDF document, version 1.4
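In a script, the same check can be done without the file utility by looking at the first bytes of the file, since PDF files start with the marker %PDF. A minimal sketch, where is_pdf is a hypothetical helper name:

```shell
# Succeed if the file starts with the PDF magic marker "%PDF".
is_pdf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "%PDF" ]
}

if is_pdf 10-Sudo-Configurations.pdf; then
  echo "looks like a PDF"
else
  echo "not a PDF (or file missing)"
fi
```

This only inspects the header, so it confirms the file type, not that the document is complete or renders correctly.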

View Information of Generated PDF File

To view the information of generated file, issue the following command.

$ pdfinfo 10-Sudo-Configurations.pdf
Sample Output :
Title:          10 Useful Sudoers Configurations for Setting 'sudo' in Linux
Creator:        wkhtmltopdf 0.12.4
Producer:       Qt 4.8.7
CreationDate:   Sat Jan 28 13:02:58 2017
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          13
Encrypted:      no
Page size:      595 x 842 pts (A4)
Page rot:       0
File size:      697827 bytes
Optimized:      no
PDF version:    1.4

View Created PDF File

Take a look at the newly created PDF file using evince program from the desktop.

$ evince 10-Sudo-Configurations.pdf
It looks pretty nice on my Linux Mint 17 box.

[Screenshot: View Website Page in PDF]

Create TOC (Table Of Content) of a Page to PDF

To add a table of contents to the PDF file, use the toc option.

$ wkhtmltopdf toc https://www.tecmint.com/sudoers-configurations-for-setting-sudo-in-linux/ 10-Sudo-Configurations.pdf
Sample Output :
Loading pages (1/6)
Counting pages (2/6)
Loading TOC (3/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done

To check the TOC for the created file, again use evince program.

$ evince 10-Sudo-Configurations.pdf
It looks even better than the previous one.

[Screenshot: Create Website Page to Table of Contents in PDF]

Wkhtmltopdf Options and Usage

For more Wkhtmltopdf usage and options, use the following help command. It displays a list of all available options.

$ wkhtmltopdf --help

Source

Is there a command-line tool for converting html files to pdf? [duplicate]

I would like to install a command-line tool within a Docker image in order to quickly convert *.html files into *.pdf files. I am surprised there is not a Unix tool to do something like this.

@muru It’s arguably a duplicate, though (A) I’m looking for a command-line tool to put in a Docker image and (B) the answers below are quite useful and more helpful than the posting above from 2015. I’ve edited the question to clarify this somewhat, and I’m happy to edit again.

Yes, this question is focused on command line tools while the other isn’t and also, the other requires a more complex solution since it’s about converting multiple, linked html documents. I don’t think it’s a dupe.

6 Answers

pandoc is a great command-line tool for file format conversion.

The disadvantage is that for PDF output you’ll need LaTeX. The usage is:

pandoc test.html -t latex -o test.pdf 

If you don’t have LaTeX installed, then I recommend htmldoc.

By default, pandoc will use LaTeX to create the PDF, which requires that a LaTeX engine be installed.

Alternatively, pandoc can use ConTeXt, pdfroff, or any of the following HTML/CSS-to-PDF engines to create a PDF: wkhtmltopdf, weasyprint or prince. To do this, specify an output file with a .pdf extension, as before, but add the --pdf-engine option or -t context, -t html, or -t ms to the command line (-t html defaults to --pdf-engine=wkhtmltopdf).
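A sketch of how that could be scripted: pick whichever HTML-to-PDF engine happens to be installed and pass it to pandoc via --pdf-engine. The helper name pick_pdf_engine and the file names test.html and test.pdf are illustrative assumptions; the option requires a pandoc version that supports --pdf-engine.

```shell
# Print the first of the given commands that exists on the PATH.
pick_pdf_engine() {
  for engine in "$@"; do
    if command -v "$engine" >/dev/null 2>&1; then
      echo "$engine"
      return 0
    fi
  done
  return 1
}

# Convert without LaTeX, using whichever engine was found.
if command -v pandoc >/dev/null 2>&1 &&
   engine=$(pick_pdf_engine wkhtmltopdf weasyprint prince); then
  pandoc test.html -o test.pdf --pdf-engine="$engine"
else
  echo "pandoc or an HTML-to-PDF engine is missing" >&2
fi
```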

+1. pandoc can also use wkhtmltopdf to directly convert from HTML to PDF, without needing LaTeX. See man pandoc and search for wkhtmltopdf or --pdf-engine.

@cas This is really useful. Could you answer the question with that command? I would like to keep this answer

@EB2127 Stack Exchange answers can easily contain more than one solution to a problem; collaborative editing can/should make any answer better.

@cas Unfortunately wkhtmltopdf complains about QXcbConnection: Could not connect to display localhost:12.0 and dumps core. I suspect if I figure out the display issue, then it will work but not sure why it cares about the display.

What advantage is there to using pandoc with the WeasyPrint engine vs just using WeasyPrint without the dependency on pandoc?

You can also try wkhtmltopdf; usage and installation are pretty straightforward.

weasyprint is an option. A possible drawback is that you’ll need python on your machine.

Sure, but there are custom linux systems, on embedded devices for example, that might not have python.

Tried it, but it ignores # in the URL, e.g. "status.aws.amazon.com/#AP_Block" converts the wrong tab to PDF.

I’ve been successfully using the 1.8 branch of HTMLDOC for years. I put it in a commercial system that has generated hundreds of thousands of reports since 2003.

It’s not super-versatile, but it is very efficient and reliable. It’s limited to a basic set of postscript fonts.

It does not support CSS, but instead uses a special HTML comment directive set to control PDF specific aspects.

The source code is not too difficult to read and edit if you need to add custom facilities, if you’re comfortable with C. It is compiled with GCC or Visual Studio, depending on your target platform.


Note that the HTML does not need to be in a file. You can generate it dynamically from a URL, PHP, ASPX, etc. You can also hook it up to your web server to generate a PDF file dynamically.

In my use case it generates a PDF file from an asp page which then gets attached to an email, instead of sending the HTML to the printer and the letter stuffing machine; it’s a kind of print spooler.
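A minimal command-line sketch of that workflow, assuming htmldoc is installed: generate a small HTML report on the fly, then convert it. The file names report.html and report.pdf and the report content are made up for illustration.

```shell
# Generate a small report on the fly (report.html is a hypothetical name).
{
  echo '<html><head><title>Daily report</title></head><body>'
  echo '<h1>Daily report</h1>'
  echo "<p>Generated on $(date +%Y-%m-%d)</p>"
  echo '</body></html>'
} > report.html

if command -v htmldoc >/dev/null 2>&1; then
  # --webpage treats the input as one unstructured page; -f names the output file.
  htmldoc --webpage --color -f report.pdf report.html
else
  echo "htmldoc is not installed" >&2
fi
```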

Source

How to convert HTML pages to PDF format on Linux

While HTML is an excellent medium for distributing and consuming information on the web, it is not an ideal format for printing and archiving. For those purposes PDF is a better format, as PDF documents have a well-defined page layout and embed all their images in the file. If you would like to convert HTML pages to PDF format on Linux, follow the guide below.

On Linux, you can use a command-line utility called wkhtmltopdf to convert any HTML web page or live URL to a PDF file. wkhtmltopdf uses the WebKit browser rendering engine to download the HTML page and perform the HTML-to-PDF conversion.

Install wkhtmltopdf on Linux

On Ubuntu, Debian, Linux Mint:

On Debian-based systems, you can install wkhtmltopdf from the base repositories:

$ sudo apt-get install wkhtmltopdf

Please be aware that wkhtmltopdf installed via apt-get has reduced functionality and other limitations. First of all, it cannot run without an X11 server. Also, it cannot add hyperlinks or a table of contents to the converted PDF file.
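One common workaround on a machine without a display is to wrap the command in xvfb-run (from the xvfb package), which starts a temporary virtual X server for the duration of the command. The sketch below assumes xvfb-run is installed; run_wkhtmltopdf is a hypothetical helper name.

```shell
# Run wkhtmltopdf, wrapping it in a virtual X server when no display is set.
run_wkhtmltopdf() {
  if ! command -v wkhtmltopdf >/dev/null 2>&1; then
    echo "wkhtmltopdf is not installed" >&2
    return 127
  fi
  if [ -z "${DISPLAY:-}" ] && command -v xvfb-run >/dev/null 2>&1; then
    # No display available: -a picks a free server number automatically.
    xvfb-run -a wkhtmltopdf "$@"
  else
    wkhtmltopdf "$@"
  fi
}

# Usage: run_wkhtmltopdf http://www.cnn.com cnn.pdf
```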

On Any Linux:

If you would like to use wkhtmltopdf without an X11 server while enjoying its full features, you can use a static binary of wkhtmltopdf distributed from its official website, where prebuilt binaries are available for Debian, Ubuntu, CentOS, openSUSE, Arch Linux and others. This version is built with Qt and X11 integrated.

Convert an HTML Page to PDF with wkhtmltopdf

To convert HTML to PDF using wkhtmltopdf, use the following command.

$ wkhtmltopdf http://www.cnn.com cnn.pdf

Note that if you want to capture web pages hosted on an https site, you need to install openssl first, and then run wkhtmltopdf.

$ sudo apt-get install openssl

If wkhtmltopdf does not work for some reason, an alternative way to convert HTML web pages to PDF files is to use the Google Chrome browser. If you don’t have Google Chrome installed, install it first.

In Google Chrome, go to the URL of the web page you would like to convert to PDF. Then choose Print from the Chrome menu and change Destination to Save as PDF. Once you click the print button, the web page is saved as a local PDF file with the name you choose.
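Recent Chrome and Chromium builds can do the same from the command line in headless mode, which avoids the print dialog. This is a sketch, not part of the original walkthrough: chrome_to_pdf is a hypothetical helper, and the list of binary names is a guess at what common distributions install.

```shell
# Print a URL to PDF with the first Chrome/Chromium binary found on the PATH.
chrome_to_pdf() {
  url=$1
  out=$2
  for browser in google-chrome google-chrome-stable chromium chromium-browser; do
    if command -v "$browser" >/dev/null 2>&1; then
      "$browser" --headless --disable-gpu --print-to-pdf="$out" "$url"
      return $?
    fi
  done
  echo "no Chrome or Chromium binary found on PATH" >&2
  return 1
}

# Usage: chrome_to_pdf http://www.cnn.com cnn.pdf
```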


Please note that this article is published by Xmodulo.com under a Creative Commons Attribution-ShareAlike 3.0 Unported License. If you would like to use the whole or any part of this article, you need to cite this web page at Xmodulo.com as the original source.

Source
