Release status: experimental
Implementation: Special page, Data extraction
Description: Converts current page to PDF and sends to browser
Author(s): Thomas Hempel (Thempel), Christian Neubauer (Cneubauer), Andreas Hagmann (Ah), Craig Oakes (w1BBoR)
Latest version: 2.5 (2012-06-29)
License: No license specified
Download: MediaWiki 1.6.7 - 1.16: see here
Example: Syncleus Wiki Example
Overview
This extension lets you view wiki pages as PDF. It has two modes:
- For any given page, it acts like the SpecialCite.php extension and provides a link in the toolbox to view that page as PDF.
- If you invoke the Pdf Export special page directly, it lets you select a group of wiki pages and output them as a single PDF document. In that view you can also choose orientation (landscape vs portrait) and paper size.
The extension originally worked only with the open source HTMLDoc package. As of version 2.5, it supports several backends: HTMLDoc, DomPdf, MWlib, MPdf, and PrinceXML. It works by rendering the current page without the navigation elements and passing that HTML to the backend for conversion to PDF.
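The hand-off to a backend can be pictured with a small sketch. It is not taken from the extension's source: `buildHtmlDocCommand` is a hypothetical helper, the file paths are placeholders, and it assumes the HTMLDoc command-line tool with its `-t`, `--webpage`, `--size`, `--landscape`/`--portrait`, and `-f` options. The rendered page HTML is assumed to have been written to a temporary file beforehand.

```php
<?php
// Hypothetical helper for illustration only; the extension's real code may differ.
// Builds an HTMLDoc command line that converts a saved HTML file to a PDF file.
function buildHtmlDocCommand(
    string $htmldocPath,
    string $htmlFile,
    string $pdfFile,
    string $size = 'letter',
    bool $landscape = false
): string {
    $parts = [
        escapeshellarg($htmldocPath),
        '-t', 'pdf',                                // output format
        '--quiet',                                  // suppress progress messages
        '--webpage',                                // treat input as an unstructured web page
        '--size', escapeshellarg($size),            // e.g. "letter" or "a4"
        $landscape ? '--landscape' : '--portrait',  // page orientation
        '-f', escapeshellarg($pdfFile),             // output file
        escapeshellarg($htmlFile),                  // input: the rendered page HTML
    ];
    return implode(' ', $parts);
}

// Placeholder paths; adjust to your system.
echo buildHtmlDocCommand(
    '/usr/local/bin/htmldoc',
    '/tmp/page.html',
    '/tmp/page.pdf',
    'a4',
    true
), "\n";
```

The resulting string could then be run with a function such as `exec()`. Quoting every path with `escapeshellarg()` keeps page-derived file names from being interpreted by the shell.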
The current version works with recent versions of MediaWiki. The older version of the extension may work with versions as far back as 1.6.7.
Installation
Install one of the backends. On Debian-based systems such as Ubuntu or Mepis, for example, run apt-get install htmldoc. Windows binaries for HTMLDoc can be found here (v1.8.27).
Add the following to your MediaWiki installation's LocalSettings.php:
require_once("$IP/extensions/PdfExport/PdfExport.php");

## Define only one of the following backends:

# PrinceXML
$wgPdfExportPrincePath = '/usr/local/bin/prince'; // Path to the PrinceXML binary
$wgPdfExportPrincePhpInterface = $IP . '/extensions/PdfExport/prince.php'; // Path to the prince.php file from the prince website.

# MWLib
$wgPdfExportMwLibPath = '/usr/local/bin/mw-render'; // Path to the mw-render binary

# MPdf
$wgPdfExportMPdf = $IP . '/extensions/PdfExport/mpdf/mpdf.php'; // Path to the main mPDF.php file

# DomPDF
$wgPdfExportDomPdfConfigFile = $IP . '/extensions/PdfExport/dompdf/dompdf_config.inc.php'; // Path to the DomPdf config file

# HTMLDoc
$wgPdfExportHtmlDocPath = '/usr/local/bin/htmldoc';
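Because the listing above shows the settings for every backend while its comment asks for only one, a minimal configuration might look like the following fragment. It is a sketch that assumes HTMLDoc is the chosen backend and that leaving the other backend variables unset is sufficient; the binary path is the example path from the listing and may differ on your system.

```php
// LocalSettings.php (fragment)
require_once("$IP/extensions/PdfExport/PdfExport.php");

# Only the HTMLDoc backend is configured here. The variables for PrinceXML,
# MWLib, MPdf, and DomPDF are deliberately left out (assumption: an unset
# variable means that backend is not used).
$wgPdfExportHtmlDocPath = '/usr/local/bin/htmldoc'; // adjust to your install path
```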
You may also define a background image to be printed on every page of the resulting PDF by setting the following variable (note that this only works with the HTMLDoc backend):
$wgPdfExportBackground = "path/to/the/background-image/image.jpg";
You can also control whether the PDF opens in the browser window or is downloaded as an attachment. To make the PDF download as an attachment, set:
$wgPdfExportAttach = true;
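In HTTP terms, the difference between opening a file in the browser and downloading it is normally expressed through the Content-Disposition response header. The sketch below illustrates that general mechanism only; `pdfResponseHeaders` is a hypothetical helper, the file name is a placeholder, and nothing here is taken from the extension's own code.

```php
<?php
// Hypothetical helper for illustration; not the extension's implementation.
// Returns the response header lines for a PDF that is either shown inline
// or offered as a download. Assumes a simple file name without quote characters.
function pdfResponseHeaders(string $filename, bool $attach): array {
    $disposition = $attach ? 'attachment' : 'inline';
    return [
        'Content-Type: application/pdf',
        sprintf('Content-Disposition: %s; filename="%s"', $disposition, $filename),
    ];
}

// Print the header lines; a web script would pass each one to header() instead.
foreach (pdfResponseHeaders('Main_Page.pdf', true) as $line) {
    echo $line, "\n";
}
```

With `$attach` set to false the disposition becomes `inline`, which asks the browser to display the PDF in its own viewer where one is available.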
Customization
The paper size for the PDF to be created is set with the "MediaWiki:Pdf_size_default" system message. Available sizes are "letter" and "A4".
Version history
Todo
- Fix the special page. (Done)
- Add a "nopdf" class that can be added to elements to prevent them from showing up in the PDF output.
- Test compatibility with older and newer versions of MediaWiki.
  - Basic testing was done with mPDF on 1.15.5, 1.16.5, 1.17.5, 1.18.4, and 1.19.1. Each successfully generated a PDF document from the default main page.
- Make sure all the special page options work with all backends (e.g. password protection, font family selection, etc.)
- Add the ability to insert a header and footer into the PDF
- Add a global variable to enable or disable the "advanced" options on the special page (like password protection).
- Add the ability to specify page breaks in the PDF output.
See also
- Extension:Collection - lets you build collections from a number of pages. Collections can be edited, persisted, and retrieved as PDF.
- Extension:Pdf Book - composes categories of articles into a book in PDF format; also uses HTMLDoc.
- Extension:Pdf Export Dompdf - a modified version of Pdf Export that uses DomPdf and also runs on shared hosting servers. The DomPdf support in this extension is based on the work done in that extension.
Benefits/Drawbacks of the Backends
HTMLDoc
HTMLDoc is very simple to install and use, especially on Linux-based systems. It supports only very basic CSS, though, so some layout and style options won't show up in the final PDF. For example, HTMLDoc doesn't support colored links or floated images.
DomPDF
DomPdf is a PHP-based solution, so it is also easy to install: download the zip file and unzip it in the extension directory on your server. DomPdf can use several different backends for generating PDFs: a PHP-only solution or one of a couple of third-party libraries. See the DomPdf documentation for more information. DomPdf supports most CSS rules and can handle things like colors, margins, and floating elements. It is not perfect, though: it has trouble wrapping text around floated images, for example. There is also a serious flaw in DomPdf: a page containing a table with rows that span an entire PDF page causes the tool to go into an infinite loop. This sometimes happens when a table is nested within another table and the inner table is very tall.
MWLib
MWLib was developed specifically for MediaWiki and is used on Wikipedia to generate PDFs. It translates wikitext directly to PDF and handles most style and layout options very well. One major issue is that it doesn't support colored links: all links show up in black and white.
Also, if you are running Semantic MediaWiki (SMW), your inline queries will not be run by MWLib, resulting in no output where you would be expecting results in tables or lists.
MPdf
MPdf seems to be the only backend that supports UTF-8 character sets and TrueType fonts, so it should be considered if your wiki is not in English. It supports most styles and layouts, although in testing it had problems with floated tables (in infoboxes, for example). It also doesn't handle thumbnailed images perfectly.
PrinceXML
PrinceXML is a commercial tool. It handles most (all?) of CSS 2.1 and some of CSS 3.0, and it can handle fairly complex style and layout. Like DomPdf, it seems to have trouble displaying floated images with text wrapped around them. Prince requires well-formed XHTML, so tidy must be installed.