Wkhtmltopdf html to pdf

How to Automate HTML-to-PDF Conversions

In this article, we’ll first introduce the characteristics of HTML documents and PDF documents as the basis. We’ll then propose the necessity and the feasibility of conversions from the HTML document format to the PDF document format. Lastly, we’ll study various command-line tools to realize HTML-to-PDF conversions.

2. HTML vs. PDF Document Format

HTML (HyperText Markup Language) is the code that is used to structure a web page and its content.

PDF, as we know, stands for “portable document format”. A file in PDF format is useful when we need to save files that cannot be modified but still need to be easily shared and printed. Therefore, PDF format allows pages – that is, a fixed layout of text and graphics – to be shared with total fidelity to the author’s intent. The need for a shareable electronic document drove the fundamental design of PDF.

3. Tools for HTML-to-PDF Conversions

So, how do we go from HTML to PDF? Without Adobe Acrobat or another PDF creation program, converting HTML to PDF can be hard. Let’s discuss command-line tools that make HTML-to-PDF conversion possible.

3.1. wkhtmltopdf

wkhtmltopdf is a simple and effective open-source command-line shell utility that enables users to convert any given HTML (web page) to a PDF document.

Let’s look at the syntax for running wkhtmltopdf with some of its more widely used options:

$ wkhtmltopdf --margin-bottom 20mm --margin-top 20mm --minimum-font-size 16 <input> <output>
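When the conversion is part of a script, the same options can be assembled programmatically. The following is a minimal Python sketch, assuming wkhtmltopdf is installed and on the PATH. The helper names build_wkhtmltopdf_cmd and convert, and the file names page.html and page.pdf, are hypothetical and only serve the illustration.

```python
import subprocess


def build_wkhtmltopdf_cmd(source, output, margin_top="20mm",
                          margin_bottom="20mm", min_font_size=16):
    """Return the argv list for one wkhtmltopdf run (nothing is executed)."""
    return [
        "wkhtmltopdf",
        "--margin-top", margin_top,
        "--margin-bottom", margin_bottom,
        "--minimum-font-size", str(min_font_size),  # integer, no unit
        source,   # URL or path to an HTML file
        output,   # path of the PDF to write
    ]


def convert(source, output, **kwargs):
    """Run wkhtmltopdf; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_wkhtmltopdf_cmd(source, output, **kwargs), check=True)


if __name__ == "__main__":
    # Only prints the command; call convert() to actually run it.
    print(" ".join(build_wkhtmltopdf_cmd("page.html", "page.pdf")))
```

Passing the arguments as a list avoids shell quoting problems when a URL or file name contains spaces or special characters.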

The default page size of the rendered document is A4, but the --page-size option can change this to almost anything else, such as A3. Here, the input and output arguments are read from a file through --read-args-from-stdin:

$ echo "https://doc.qt.io/archives/qt-4.8/qapplication.html qapplication.pdf" >> cmds
$ wkhtmltopdf --page-size A3 --read-args-from-stdin < cmds

A table of contents can be added to the document by adding a toc object to the command line:

$ wkhtmltopdf toc https://doc.qt.io/archives/qt-4.8/qstring.html qstring.pdf
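The order of the pieces on the command line matters: global options come first, then objects such as toc and the pages, and finally the output file. A small Python sketch of that ordering follows. The function name build_document_cmd is hypothetical, and the sketch only builds the argument list without running it.

```python
def build_document_cmd(pages, output, page_size="A4", with_toc=False):
    """Build a wkhtmltopdf argv list: global options, objects, output file."""
    cmd = ["wkhtmltopdf", "--page-size", page_size]  # global options first
    if with_toc:
        cmd.append("toc")          # table-of-contents object
    cmd.extend(pages)              # one or more page objects (URLs or files)
    cmd.append(output)             # the output file always comes last
    return cmd


if __name__ == "__main__":
    print(build_document_cmd(
        ["https://doc.qt.io/archives/qt-4.8/qstring.html"],
        "qstring.pdf",
        page_size="A3",
        with_toc=True,
    ))
```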

On Linux, wkhtmltopdf uses the WebKit rendering engine and Qt, which means it can benefit from updates.

3.2. weasyprint

WeasyPrint produces PDFs with selectable text and hyperlinks. The command syntax to obtain a PDF from the HTML file is:

$ weasyprint [options] <input> <output>

The input is a filename or URL to an HTML document, or “-” to read HTML from stdin. The output is a filename, or “-” to write to stdout.

Options can be mixed anywhere before, between, or after the input and output. We can force the input character encoding using -e utf-8 or --encoding utf-8:

$ weasyprint -e utf-8 docs.html docs.pdf

We can also add the filename or URL of a user cascading stylesheet (see Stylesheet Origins) to the document as -s print.css or --stylesheet print.css:

$ weasyprint -s print.css docs.html docs.pdf

To set small page margins, we can pass an inline stylesheet through process substitution:

$ weasyprint docs.html docs.pdf -s <(echo '@page { margin: 0.5cm; }')
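Process substitution is a shell feature that is not available everywhere, so a script may prefer to write the margin rule to a temporary stylesheet and pass its path with -s. The sketch below assumes the weasyprint command is on the PATH. The helper names page_margin_css, build_weasyprint_cmd, and convert_with_margin are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def page_margin_css(margin="0.5cm"):
    """Return a CSS @page rule that sets the same margin on every side."""
    return "@page { margin: %s; }" % margin


def build_weasyprint_cmd(source, output, stylesheet=None, encoding=None):
    """Build the weasyprint argv list using the -e and -s options."""
    cmd = ["weasyprint"]
    if encoding:
        cmd += ["-e", encoding]
    if stylesheet:
        cmd += ["-s", str(stylesheet)]
    cmd += [source, output]
    return cmd


def convert_with_margin(source, output, margin="0.5cm"):
    """Write the margin rule to a temporary file and run weasyprint with it."""
    with tempfile.TemporaryDirectory() as tmp:
        css_path = Path(tmp) / "margin.css"
        css_path.write_text(page_margin_css(margin), encoding="utf-8")
        cmd = build_weasyprint_cmd(source, output, stylesheet=css_path)
        subprocess.run(cmd, check=True)
```

Calling convert_with_margin("docs.html", "docs.pdf") would then produce the same kind of output as the shell command, without relying on a particular shell.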

We can install weasyprint using a package manager such as apt-get:

$ sudo apt-get -y install weasyprint

3.3. ebook-convert

The ebook-convert command-line utility converts many HTML documents into a single PDF.

Regular usage of this utility would be:

$ ebook-convert index.html book.pdf

We can also force the input character encoding with the --input-encoding option, which specifies the character encoding of the input document.

Another useful option is --max-levels. It sets the maximum level of recursion when following links in HTML files. The value must be non-negative, with 5 as the default; 0 means that no links in the root HTML file are followed.
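As an illustration of how these two options combine, here is a short Python sketch that builds an ebook-convert command and rejects a negative recursion depth before anything runs. The function name build_ebook_convert_cmd is hypothetical, and the sketch assumes ebook-convert is installed if the command is later executed.

```python
def build_ebook_convert_cmd(source, output, input_encoding=None, max_levels=None):
    """Build an ebook-convert argv list; options follow the two file names."""
    if max_levels is not None and max_levels < 0:
        raise ValueError("max_levels must be non-negative")
    cmd = ["ebook-convert", source, output]
    if input_encoding:
        cmd += ["--input-encoding", input_encoding]
    if max_levels is not None:
        # 0 means: do not follow any links found in the root HTML file
        cmd += ["--max-levels", str(max_levels)]
    return cmd


if __name__ == "__main__":
    print(build_ebook_convert_cmd("index.html", "book.pdf",
                                  input_encoding="utf-8", max_levels=0))
```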

3.4. unoconv

We can use unoconv in standalone mode, which means that in absence of a LibreOffice listener, it will start its own:

$ unoconv -f pdf some-document.html

Also, we can start unoconv as a listener (by default on localhost:2002) to let other unoconv instances connect to it:

$ unoconv --listener &
$ unoconv -f pdf some-document.html
$ kill -15 %-

This also works on a remote host:

$ unoconv --listener --server 1.2.3.4 --port 4567

And then, we can connect another system to convert documents:

$ unoconv --server 1.2.3.4 --port 4567 -f pdf mypage.html
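The listener workflow (start a listener, convert, then stop the listener) can be wrapped in a script so the listener is always cleaned up. The following Python sketch assumes unoconv is on the PATH. The names build_unoconv_cmd and convert_with_listener are hypothetical, and the fixed startup delay is a crude assumption rather than a reliable readiness check.

```python
import subprocess
import time


def build_unoconv_cmd(document, fmt="pdf", server=None, port=None):
    """Build a unoconv argv list, optionally targeting a remote listener."""
    cmd = ["unoconv"]
    if server:
        cmd += ["--server", server]
    if port:
        cmd += ["--port", str(port)]
    cmd += ["-f", fmt, document]
    return cmd


def convert_with_listener(documents, startup_delay=3.0):
    """Start a local listener, convert each document, then stop the listener."""
    listener = subprocess.Popen(["unoconv", "--listener"])
    try:
        time.sleep(startup_delay)  # assumption: enough time for startup
        for document in documents:
            subprocess.run(build_unoconv_cmd(document), check=True)
    finally:
        listener.terminate()       # sends SIGTERM on POSIX, like kill -15
        listener.wait()
```

Reusing one listener for several documents avoids starting LibreOffice again for every file.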

We can install it on most Linux flavors via the distribution’s package manager.

3.5. act Converter

act is a tool that provides a simplified interface for performing common actions. Using this, we can convert an HTML file to a PDF format:

$ act convert index.html -o index.pdf -w 2000px -h 3000px

This will create a new PDF file from the HTML file.

4. Conclusion

In this article, we discussed the underlying characteristics of the HTML and PDF document formats. We also discussed the feasibility of HTML-to-PDF conversions. Finally, we saw how command-line tools that convert files from HTML to PDF ease a process that is otherwise difficult without a PDF creation program.


wkhtmltopdf / wkhtmltopdf Public archive

Convert HTML to PDF using Webkit (QtWebKit)


wkhtmltopdf and wkhtmltoimage

wkhtmltopdf and wkhtmltoimage are command-line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service.

See https://wkhtmltopdf.org for updated documentation.

wkhtmltopdf has its own dedicated repository for building and packaging.


How to Convert HTML to PDF Using wkhtmltopdf and Python

In this tutorial, you’ll learn how to convert HTML into PDF using wkhtmltopdf, an open source command-line tool that converts HTML to PDF using the Qt WebKit rendering engine. The current version of wkhtmltopdf is 0.12.6, which was released in 2020. It’s available for macOS, Linux, and Windows.

Common use cases for converting HTML to PDF include generating invoices or receipts for sales, printing shipping labels, converting resumes to PDF, and much more.

This tutorial uses Python-PDFKit (the pdfkit package), a simple Python wrapper that converts HTML to PDF using the wkhtmltopdf utility.

Installing wkhtmltopdf

Before you can use wkhtmltopdf, you need to install it on your operating system.


On macOS

Install wkhtmltopdf using Homebrew:

brew install --cask wkhtmltopdf

On Debian/Ubuntu

Install wkhtmltopdf using APT:

sudo apt-get install wkhtmltopdf

On Windows

Download the latest version of wkhtmltopdf from the wkhtmltopdf website.

After you’ve downloaded and run the installer, add the directory that contains the wkhtmltopdf binary to your PATH environment variable.
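To check from Python whether the binary is reachable, a script can look it up on the PATH before attempting a conversion. The sketch below uses shutil.which from the standard library. The helper names find_wkhtmltopdf and make_pdfkit_config are hypothetical. The second helper relies on pdfkit.configuration, which accepts an explicit path to the binary, and imports pdfkit lazily so the lookup works even before the wrapper is installed.

```python
import shutil


def find_wkhtmltopdf(explicit_path=None):
    """Return the given path, or the wkhtmltopdf found on PATH, or None."""
    return explicit_path or shutil.which("wkhtmltopdf")


def make_pdfkit_config(explicit_path=None):
    """Create a pdfkit configuration that points at the located binary."""
    import pdfkit  # imported lazily; requires the pdfkit package

    path = find_wkhtmltopdf(explicit_path)
    if path is None:
        raise FileNotFoundError("wkhtmltopdf was not found on PATH")
    return pdfkit.configuration(wkhtmltopdf=path)


if __name__ == "__main__":
    print(find_wkhtmltopdf() or "wkhtmltopdf is not on PATH")
```

The returned configuration object can be passed to the pdfkit conversion functions through their configuration argument when the binary lives outside the PATH.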

Installing Python-PDFKit

Install Python-PDFKit using Pip:

pip install pdfkit
# or, for Python 3:
pip3 install pdfkit

Python-PDFKit provides several APIs to create a PDF document:

  • From a URL using from_url
  • From a string using from_string
  • From a file using from_file
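Which of the three functions to call depends on what kind of input is at hand. As a rough sketch, a script could pick one with a simple heuristic. The helper names choose_pdfkit_method and convert are hypothetical, and the heuristic (URL prefix, then existing file, otherwise raw HTML) is an assumption made for illustration, not part of pdfkit.

```python
import os


def choose_pdfkit_method(source):
    """Return the name of the pdfkit function that fits the given source."""
    if source.startswith(("http://", "https://")):
        return "from_url"
    if os.path.isfile(source):
        return "from_file"
    return "from_string"  # fall back to treating the input as raw HTML


def convert(source, output):
    """Dispatch to pdfkit.from_url, from_file, or from_string."""
    import pdfkit  # imported lazily; requires pdfkit and wkhtmltopdf

    getattr(pdfkit, choose_pdfkit_method(source))(source, output)
```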

Creating a PDF from a URL

The from_url method takes two arguments: the URL, and the output path. The following code snippet shows how to convert the Google home page to PDF using pdfkit:

import pdfkit

pdfkit.from_url('https://google.com', 'example.pdf')

Place the code snippet in a file named url.py and run it:

python url.py
# If you're using Python 3, run the following command:
python3 url.py

The output PDF will be saved in the current directory as example.pdf.

Creating a PDF from a String

The from_string method takes two arguments: the HTML string, and the output path. The following code snippet shows how to do this:

import pdfkit

pdfkit.from_string('<p>Hello World!</p>', 'out.pdf')
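In practice, the HTML string is usually generated from data, for example when producing invoices. The sketch below fills a small template with string.Template and escapes the values before handing the result to from_string. The template, the field names, and the helpers render_invoice_html and write_invoice_pdf are all hypothetical.

```python
import html
from string import Template

# Hypothetical minimal invoice layout; a real template would be richer.
INVOICE_TEMPLATE = Template(
    "<html><body>"
    "<h1>Invoice $number</h1>"
    "<p>Customer: $customer</p>"
    "<p>Total: $total</p>"
    "</body></html>"
)


def render_invoice_html(number, customer, total):
    """Fill the template, escaping values so they cannot break the markup."""
    return INVOICE_TEMPLATE.substitute(
        number=html.escape(str(number)),
        customer=html.escape(str(customer)),
        total=html.escape(str(total)),
    )


def write_invoice_pdf(number, customer, total, output):
    """Render the invoice HTML and convert it with pdfkit.from_string."""
    import pdfkit  # imported lazily; requires pdfkit and wkhtmltopdf

    pdfkit.from_string(render_invoice_html(number, customer, total), output)
```

Keeping the rendering step separate from the conversion makes the HTML easy to inspect or test without running wkhtmltopdf.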


Creating a PDF from a File

The from_file method takes two arguments: the path to the HTML file, and the output path. The following code snippet shows how to do this:

import pdfkit

pdfkit.from_file('index.html', 'index.pdf')

This example uses an invoice template saved as index.html.

It’s also possible to pass some additional parameters — like the page size, orientation, and margins. Add the options parameter to do this:

options = {
    'page-size': 'Letter',
    'orientation': 'Landscape',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'custom-header': [
        ('Accept-Encoding', 'gzip')
    ],
    'no-outline': None
}

pdfkit.from_file('index.html', 'index.pdf', options=options)
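Each key in the options dictionary corresponds to a wkhtmltopdf command-line option. To make that mapping concrete, here is a simplified Python sketch of how such a dictionary could be turned into arguments: a None value becomes a bare flag, and a list of tuples becomes a repeated option. The function options_to_args is hypothetical and is not pdfkit’s actual implementation.

```python
def options_to_args(options):
    """Translate an options dict into a flat list of command-line arguments."""
    args = []
    for key, value in options.items():
        flag = "--" + key
        if value is None:
            args.append(flag)                      # switch without a value
        elif isinstance(value, (list, tuple)):
            for item in value:                     # repeatable option
                args.append(flag)
                if isinstance(item, (list, tuple)):
                    args.extend(str(part) for part in item)
                else:
                    args.append(str(item))
        else:
            args.extend([flag, str(value)])        # ordinary key/value option
    return args


if __name__ == "__main__":
    print(options_to_args({
        "page-size": "Letter",
        "custom-header": [("Accept-Encoding", "gzip")],
        "no-outline": None,
    }))
```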

Conclusion

In this tutorial, you saw how to generate PDFs from HTML using wkhtmltopdf. If you’re looking to add more robust PDF capabilities, PSPDFKit offers a commercial JavaScript PDF library that can easily be integrated into your web application. It comes with 30+ features that let you view, annotate, edit, and sign documents directly in your browser. Out of the box, it has a polished and flexible UI that you can extend or simplify based on your unique use case.

