How can I automate HTML-to-PDF conversions?

Sorry to unearth this old post, but it came up first in my search for the best HTML-to-PDF conversion tool. On Linux, wkhtmltopdf is very good (it honors CSS, among other things) and is open source (LGPL).


Update 2019-05

The whole process has thankfully been packaged into a Docker image by TheCodingMachine: https://github.com/thecodingmachine/gotenberg

This makes maintaining and using Chrome-based PDF generation in production environments smooth and hassle-free.


Chrome has shipped a headless mode since version 59. Because the other solutions struggle with newer (or not so new anymore) CSS features such as flexbox, this was the only one that produced a proper PDF in my case.

To create a PDF from a local HTML file, use the following command:

chrome --headless --disable-gpu --print-to-pdf file:///path/to/myfile.html

Without a path argument, the PDF is written to output.pdf in the current directory. On macOS, substitute chrome with /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome.

The only downside I have noticed so far is that you currently cannot pass the HTML via stdin, but creating a temporary file is not much of an issue.
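The temporary-file workaround can be scripted. The following is only a minimal Python sketch, assuming a Chrome binary is reachable as chrome on the PATH; the function names chrome_pdf_command and html_string_to_pdf are made up for illustration, and --print-to-pdf is given an explicit output path.

```python
import subprocess
import tempfile
from pathlib import Path


def chrome_pdf_command(html_path, pdf_path, chrome="chrome"):
    """Build the headless Chrome command line for one conversion."""
    return [
        chrome,  # assumption: binary name; pass the full path on macOS
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={Path(pdf_path).resolve()}",
        Path(html_path).resolve().as_uri(),  # file:///... URL, spaces escaped
    ]


def html_string_to_pdf(html, pdf_path, chrome="chrome"):
    """Write the HTML to a temporary file, then let Chrome print it to pdf_path."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "input.html"
        source.write_text(html, encoding="utf-8")
        # check=True raises CalledProcessError if Chrome exits with an error
        subprocess.run(chrome_pdf_command(source, pdf_path, chrome), check=True)
```

Calling html_string_to_pdf("<h1>Hello</h1>", "hello.pdf") would then stand in for piping the markup through stdin.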

For more information see https://developers.google.com/web/updates/2017/04/headless-chrome#create_a_pdf_dom

Update: As it turns out, the Chrome team will most likely provide some kind of Node module for this task, which would eventually deprecate the headless mode (https://bugs.chromium.org/p/chromium/issues/detail?id=719921).

The best bet is the Node-based approach using the puppeteer module, as documented at https://developers.google.com/web/updates/2017/04/headless-chrome#node. Print the page with page.pdf(), which wraps the DevTools Protocol command Page.printToPDF and allows some additional configuration, too.

Of course, you can also connect to the debug console websocket from environments other than Node (e.g. a PHP script).
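To give an idea of what such a non-Node client has to exchange, here is a rough Python sketch of the two ends of a Page.printToPDF call. It assumes Chrome was started with --remote-debugging-port and that you already hold a websocket connection to the page's webSocketDebuggerUrl; the transport itself is deliberately left out, and the helper names are invented for this example.

```python
import base64
import json


def print_to_pdf_request(message_id=1, **params):
    """Serialize a Page.printToPDF call as a DevTools Protocol JSON message.

    params are passed through unchanged, e.g. printBackground=True.
    """
    return json.dumps(
        {"id": message_id, "method": "Page.printToPDF", "params": params}
    )


def pdf_bytes_from_response(raw_response):
    """Extract the PDF from the reply; it arrives base64-encoded in result.data."""
    reply = json.loads(raw_response)
    if "error" in reply:
        raise RuntimeError(reply["error"].get("message", "Page.printToPDF failed"))
    return base64.b64decode(reply["result"]["data"])
```

You would send the string from print_to_pdf_request over the websocket, wait for the reply carrying the same id, and write the bytes returned by pdf_bytes_from_response to a file.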


NOTE: This answer is from 2008 and is probably now incorrect; please check the other answers

PrinceXML is the best one I've seen (it parses regular HTML as well as XML/XHTML). How is it the best? Well, it passes the Acid2 test, which I thought was pretty darn impressive.

It is, however, quite expensive.


WeasyPrint produces nice PDFs with selectable text and hyperlinks.

weasyprint input.html output.pdf
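Since the question is about automation, the same command can be looped over a folder. This is just a sketch, assuming the weasyprint executable is on the PATH; the directory layout and the function names pdf_targets and convert_all are hypothetical.

```python
import subprocess
from pathlib import Path


def pdf_targets(source_dir, output_dir):
    """Pair every .html file in source_dir with a .pdf path in output_dir."""
    out = Path(output_dir)
    return [
        (src, out / (src.stem + ".pdf"))
        for src in sorted(Path(source_dir).glob("*.html"))
    ]


def convert_all(source_dir, output_dir):
    """Run `weasyprint input.html output.pdf` once per HTML file."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in pdf_targets(source_dir, output_dir):
        # check=True stops the batch at the first failing conversion
        subprocess.run(["weasyprint", str(src), str(dst)], check=True)
```

For example, convert_all("pages", "pdf") would turn pages/report.html into pdf/report.pdf.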

If you use wkhtmltopdf instead, try the following options:

wkhtmltopdf --margin-bottom 20mm --margin-top 20mm --minimum-font-size 16 ...
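If you call wkhtmltopdf from a script, those options can be kept in one place. A small sketch, again assuming wkhtmltopdf is on the PATH; the input and output file names are placeholders you supply, and wkhtmltopdf_command and convert are illustrative names only.

```python
import subprocess


def wkhtmltopdf_command(source, target, margin_top="20mm",
                        margin_bottom="20mm", minimum_font_size=16):
    """Build a wkhtmltopdf call using the margin and font-size options above."""
    return [
        "wkhtmltopdf",
        "--margin-top", margin_top,
        "--margin-bottom", margin_bottom,
        "--minimum-font-size", str(minimum_font_size),
        source,  # placeholder: HTML file or URL to convert
        target,  # placeholder: PDF file to write
    ]


def convert(source, target, **options):
    """Run the conversion; raises CalledProcessError if wkhtmltopdf fails."""
    subprocess.run(wkhtmltopdf_command(source, target, **options), check=True)
```

Keeping the defaults in the function signature means a single call such as convert("in.html", "out.pdf", minimum_font_size=12) overrides just one setting.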

Tags: Linux, PDF, Perl