Extension:PdfHandler

PdfHandler

Release status: stable

Implementation: Media
Description: Allows PDF files to be handled like multipage DjVu files
Author(s): Martin Seidel (xarax) <jodeldi at gmx dot de>
MediaWiki: 1.17+
PHP: 5.3+
Database changes: No
License: GPL
Examples: usability.wikimedia.org, j-crew.de
Parameters:
  • $wgPdfProcessor
  • $wgPdfPostProcessor
  • $wgPdfInfo
  • $wgPdftoText
  • $wgPdfOutputExtension
  • $wgPdfHandlerDpi
  • $wgPdfCreateThumbnailsInJobQueue
Hooks used: UploadVerifyFile

The PdfHandler extension shows uploaded PDF files in a multipage preview layout. With the Proofread Page extension enabled, PDFs can be displayed side by side with text for transcribing books and other documents, as is commonly done with DjVu files (particularly on Wikisource).

See the "Bugs and enhancements" section below for how to report problems.

Pre-requisites

This extension requires the following packages to be installed first:

Package | Description | Link
gs-gpl (provides gs) | Renders the page images | http://www.ghostscript.com
imagemagick | Dynamic resizing and thumbnailing of images | http://www.imagemagick.org/script/install-source.php (installation instructions)
xpdf-utils (provides pdfinfo) | Extracts metadata from PDF files | http://www.foolabs.com/xpdf/download.html

The poppler-utils package may be substituted for xpdf-utils on Ubuntu and Debian systems.

To check whether the required programs are already installed, run "which gs convert pdfinfo" in a shell.

Installation

Note: The required software must be installed first.

  • Download the extension and extract the file(s) into a directory called PdfHandler in your extensions/ folder.
  • Add the following code at the bottom of your LocalSettings.php:
require_once "$IP/extensions/PdfHandler/PdfHandler.php";
  • Configure as required (see also the examples provided)
  • Done! Navigate to "Special:Version" on your wiki to verify that the extension is successfully installed.

Configuration

Depending on the operating system of the server, you may need to set some of the following variables in the "LocalSettings.php" file:

$wgPdfProcessor (default = "gs")
Path to the Ghostscript executable.
$wgPdfPostProcessor (default = "convert")
Path to the ImageMagick convert executable.
$wgPdfInfo (default = "pdfinfo")
Path to the pdfinfo executable.
$wgPdfOutputExtension (default = "jpg")
File extension of the rendered page images.
$wgPdfHandlerDpi (default = "150")
Rendering resolution in dpi (dots per inch).
The extension extracts a bitmap image for each page of the PDF at this resolution. For example, an A4 page is 210 mm wide, which corresponds to 595 points (at 72 points per inch). At 150 dpi this yields an image 1240 pixels wide. If the parameter is set to 300 dpi instead, the width will be 2480 pixels.
$wgPdfCreateThumbnailsInJobQueue (default = "false")
Defers the creation of page thumbnails to the job queue, so that they are generated during normal wiki browsing rather than while a file page is being viewed. Be advised that setting this to "true" may significantly increase the CPU load of your web server on a high-traffic website. The job queue was designed to perform quick tasks on page views, and although creating a thumbnail is reasonably quick, converting PDF pages to another format still costs CPU time. To limit the load, either set $wgJobRunRate to a small value (e.g. 0.05), or disable the job queue on the high-traffic wiki and set up a separate server that only runs jobs (as Wikimedia did).
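
As a rough illustration of the resolution and job queue settings described above, a LocalSettings.php fragment might look like the following. The values are assumptions chosen for the example (300 dpi, and the job run rate of 0.05 mentioned above), not recommendations.

```php
// Render each PDF page at 300 dpi instead of the default 150.
// An A4 page is 210 mm wide, so its image is about
// 210 / 25.4 * 300 = 2480 pixels wide.
$wgPdfHandlerDpi = 300;

// Create page thumbnails via the job queue rather than on file page views.
$wgPdfCreateThumbnailsInJobQueue = true;

// Example value: run a job on roughly 1 in 20 requests to limit CPU load.
$wgJobRunRate = 0.05;
```

A higher resolution produces sharper page images at the cost of more rendering time and disk space, so it is worth testing with a typical document before changing the default.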

The following variables are not specific to this extension:

  • Enable PDF uploads, if you haven't already: $wgFileExtensions[] = 'pdf';
  • $wgMaxShellMemory - memory limit for gs, convert and pdfinfo. The default value might be too low.
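
For example, the two general settings above could be set in LocalSettings.php as sketched below. The memory value is an assumption for illustration only ($wgMaxShellMemory is given in kilobytes); adjust it to what your server can spare.

```php
// Allow PDF files to be uploaded.
$wgFileExtensions[] = 'pdf';

// Raise the memory limit for shell commands (gs, convert, pdfinfo).
// The value is in kilobytes; 307200 KB (300 MB) is an assumed example value.
$wgMaxShellMemory = 307200;
```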

Ubuntu

Note: This is identical to the default settings for this extension.

$wgPdfProcessor = 'gs';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = 'convert'; // if not defined via ImageMagick
$wgPdfInfo = 'pdfinfo';
$wgPdftoText = 'pdftotext';

Debian

$wgPdfProcessor = '/usr/bin/gs'; 
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = '/usr/bin/convert';  // if not defined via ImageMagick
$wgPdfInfo = '/usr/bin/pdfinfo'; 
$wgPdftoText = '/usr/bin/pdftotext';

Windows

$wgPdfProcessor = 'C:\Programme\gs\gs8.60\bin\gswin32.exe';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = 'C:\Programme\ImageMagick-6.6.2-Q16\convert.exe'; // if not defined via ImageMagick
$wgPdfInfo = 'C:\Programme\xpdf-3.02pl1-win32\pdfinfo.exe';
$wgPdftoText = 'C:\Programme\xpdf-3.02pl1-win32\pdftotext.exe';

Usage

  • The extension mainly works without user interaction. When you upload a new PDF file, its metadata is stored in the database, and the file can then be shown in a multipage preview layout, as the DjVu handler does. Without this extension, PDFs will not display properly when uploaded.
  • Additionally, this extension allows Extension:ProofreadPage to handle PDFs in a side-by-side view for transcribing and proofreading, as is done on Wikisource.
  • Another option (introduced in r25575) is to display a PDF file as an image, showing a single page at a time: [[File:myPdfFile.pdf|page=1|600px]]. The page and size parameters are optional; the default is page 1. Instead of a size parameter, you can use the thumb parameter, with or without a caption: [[File:myPdfFile.pdf|page=1|thumb|My PDF]].
  • Because PdfHandler extends ImageHandler, you can use all the arguments that you would use for an image, for example thumb, right/left, caption, border and link. To present a 2-page PDF, for example, use: [[File:myPdfFile.pdf|page=1]] [[File:myPdfFile.pdf|page=2]]

Bugs and enhancements

Bugs can be reported at Bugzilla, which also lists all PdfHandler bug reports.

See also