Extension:Pdf Export

Note: This extension was recently updated (2012-06-29). The new version may not support previous versions of MediaWiki. To find the older version of the extension go here.
Pdf Export

Release status: beta

Implementation: Special page, Data extraction
Description: Converts current page to PDF and sends to browser
Author(s): Thomas Hempel (Thempel), Christian Neubauer (Cneubauer), Andreas Hagmann (Ah), Craig Oakes (w1BBoR)
Last version: 2.5 (2012-06-29)
Database changes: no
License: No license specified
Download: MediaWiki 1.6.7 - 1.16: see here
MediaWiki 1.15+:
Example: Syncleus Wiki Example

Overview

This extension lets you view wiki pages as PDF. It has two modes:

  1. For any given page, it acts like the SpecialCite.php extension and provides a link in the toolbox to view that page as PDF.
  2. If you invoke the Pdf Export special page directly, it lets you select a group of wiki pages and output them as a single PDF document. In that view you can also choose orientation (landscape vs portrait) and paper size.

The extension originally worked with the open source htmldoc package. As of version 2.5, it supports several backends: HTMLDoc, DomPdf, MWlib, MPdf, and PrinceXML. It works by rendering the current page without the navigation elements and passing the resulting HTML to the selected backend for conversion to PDF.

The current version works with recent versions of MediaWiki. The older version of the extension may work with versions as far back as 1.6.7.

Installation

Install one of the backends. For example, on Debian-based systems such as Ubuntu or Mepis, HTMLDoc can be installed with: apt-get install htmldoc. Windows binaries for HTMLDoc can be found here (v1.8.27).

Add the following to your MediaWiki installation's LocalSettings.php.

Note: You only need to configure one of the five backend PDF tools.
require_once("$IP/extensions/PdfExport/PdfExport.php");
 
## Define only one of the following backends:

# PrinceXML
$wgPdfExportPrincePath = '/usr/local/bin/prince'; // Path to the PrinceXML binary
$wgPdfExportPrincePhpInterface = $IP . '/extensions/PdfExport/prince.php'; // Path to the prince.php file from the prince website.
 
# MWLib
$wgPdfExportMwLibPath = '/usr/local/bin/mw-render'; // Path to the mw-render binary
 
# MPdf
$wgPdfExportMPdf = $IP . '/extensions/PdfExport/mpdf/mpdf.php'; // Path to the main mpdf.php file
 
# DomPDF
$wgPdfExportDomPdfConfigFile = $IP . '/extensions/PdfExport/dompdf/dompdf_config.inc.php'; // Path to the DomPdf config file
 
# HTMLDoc
$wgPdfExportHtmlDocPath = '/usr/local/bin/htmldoc'; // Path to the htmldoc binary
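
As an illustration of choosing a single backend, here is a minimal sketch of a LocalSettings.php fragment that enables only HTMLDoc. The binary path is an assumption: packages installed with apt-get typically place binaries in /usr/bin rather than /usr/local/bin, so confirm the location on your own server before copying the value.

```php
// LocalSettings.php fragment (sketch): enable Pdf Export with one backend only.
require_once("$IP/extensions/PdfExport/PdfExport.php");

// Assumed path for a package-manager install of HTMLDoc.
// Check the real location on your server, for example with: which htmldoc
$wgPdfExportHtmlDocPath = '/usr/bin/htmldoc';
```

The other backend variables are simply left undefined in this sketch.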


You may also define a background image, which will be printed on every page of the resulting PDF, by setting the following variable:

$wgPdfExportBackground = "path/to/the/background-image/image.jpg";


You can also set a variable to control whether the PDF opens in the browser window or is downloaded as an attachment. To make the PDF download as an attachment, set:

$wgPdfExportAttach = true;
Providing a filename other than the name of the page being printed requires this variable to be set to true.
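
As a sketch, the two optional settings described above could be combined in LocalSettings.php as follows. The image path is a hypothetical placeholder; this page does not say whether the extension expects an absolute path, a relative path, or a URL, so adapt the value to your installation.

```php
// LocalSettings.php fragment (sketch): optional Pdf Export settings.

// Hypothetical placeholder path; replace it with the location of your own image.
$wgPdfExportBackground = "$IP/images/pdf-background.jpg";

// Send the PDF as a download instead of opening it in the browser window.
// Also required if you want a filename other than the page name.
$wgPdfExportAttach = true;
```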

Customization

The paper size for the PDF to be created is set with the "MediaWiki:Pdf_size_default" system message. Available sizes are "letter" and "A4".

Version history

version history

Todo

  • Fix the special page. (Done)
  • Add a "nopdf" class that can be added to elements to prevent them from showing up in the PDF output.
  • Test compatibility with older and newer versions of MediaWiki.
    • Did basic testing with mPDF on 1.15.5, 1.16.5, 1.17.5, 1.18.4, and 1.19.1. Each successfully generated a PDF document from the default main page.
  • Make sure all the special page options work with all backends (e.g. password protection, font family selection, etc.)
  • Add the ability to insert a header and footer into the PDF
  • Add a global variable to enable or disable the "advanced" options on the special page (like password protection).
  • Add the ability to specify page breaks in the PDF output.

See also

  • Extension:Collection - lets you build collections from a number of pages. Collections can be edited, persisted, and retrieved as PDF
  • Extension:Pdf Book - composes categories of articles into a book in PDF format, also uses HTMLDOC
  • Extension:Pdf Export Dompdf - a modified version of Pdf Export that uses dompdf, so it will run on shared hosting servers too. The DomPdf support in this extension is based on the work done in that extension.

Benefits/Drawbacks of the Backends

HTMLDoc

HTMLDoc is very simple to install and use, especially on Linux-based systems. However, it supports only very basic CSS statements, so some layout and style options won't show up in the final PDF. For example, HTMLDoc doesn't support colored links or floated images.

DomPDF

DomPdf is a PHP-based solution, so it is fairly easy to install as well: download the zip file and unzip it in the extension directory on your server. DomPdf can use several different backends for generating PDFs: a PHP-only solution or one of a couple of third-party libraries. See the DomPdf documentation for more information.

DomPdf supports most CSS rules and can handle things like colors, margins, and floating elements. It is not perfect, though. For example, it has trouble wrapping text around floated images. There is also a serious flaw in DomPdf: a page with a table whose rows span an entire PDF page causes the tool to go into an infinite loop. This sometimes happens when a table is nested within another table and the inner table is very tall.

MWLib

MWLib was developed specifically for MediaWiki and is used on Wikipedia to generate PDFs. It translates wikitext directly to PDF and handles most style and layout options very well. One major issue is that it doesn't support colored links: all links show up in black and white.

Also, if you are running Semantic MediaWiki (SMW), your inline queries will not be run by MWLib, resulting in no output where you would be expecting results in tables or lists.

MPdf

MPdf seems to be the only backend that supports UTF-8 character sets and TrueType fonts, so it should be considered if your wiki is in a language other than English. It supports most styles and layouts, although in testing it had problems with floated tables (in infoboxes, for example). It also doesn't handle thumbnailed images perfectly.

PrinceXML

PrinceXML is a commercial tool. It handles most (all?) of CSS 2.1 and some of CSS 3, and it can handle fairly complex style and layout. Like DomPdf, it seems to have trouble displaying floated images with text wrapped around them. Prince requires well-formed XHTML, so tidy must be installed.