Extension:Collection/PDF Writer

From mediawiki.org


mwlib.rl is a Python library for writing PDF documents from MediaWiki articles that have been parsed by the mwlib library.

See the press release "Wikis Go Printable" for more information on this project.

No installation required

The PDF Writer can run standalone on a server and provide PDF generation for multiple MediaWiki instances. A server for public testing and low-traffic wikis runs at http://tools.pediapress.com.

All you need is the Collection extension, which is configured to use this server by default.

Example

Solar system, example article from the English language Wikipedia, rendered as PDF using the PediaPress technology.

Technical

The PDF Writer uses the Python ReportLab library to generate PDFs from a DOM that the mwlib parser derives from MediaWiki markup. The Collection extension can be used to select and manage the articles that make up the resulting PDF.

Source

mwlib.rl is copyrighted by PediaPress and is distributed under a BSD license (see the included README.txt for details).

Install

Using easy_install

Make sure you have the required environment. On Debian systems:

apt-get install g++ perl python python-dev python-setuptools python-imaging python-lxml libevent-dev

Then download and install mwlib and mwlib.rl with easy_install (rehash is a csh/zsh builtin; in bash, use hash -r instead):

easy_install mwlib && rehash && easy_install mwlib.rl

RPM

On RPM-based distributions that have yum, run yum search mwlib, then yum install mwlib.

Note: mwlib has several dependencies, which makes it harder to compile from scratch.

Alternate Installation Instructions (works on Ubuntu)

The following commands can be used to install mwlib on Ubuntu (see http://mwlib.readthedocs.org/en/latest/installation.html).

Run the following as root:

apt-get install -y gcc g++ make python python-dev python-virtualenv libjpeg-dev libz-dev libfreetype6-dev liblcms-dev libxml2-dev libxslt-dev ocaml-nox git-core python-imaging python-lxml texlive-latex-recommended ploticus dvipng imagemagick pdftk

For Ubuntu 16.04.1 (Xenial Xerus) the above command becomes:

apt-get install -y gcc g++ make python python-dev python-virtualenv libjpeg-dev zlib1g-dev libfreetype6-dev libxml2-dev libxslt1-dev ocaml-nox git-core python-imaging python-lxml texlive-latex-recommended ploticus dvipng imagemagick pdftk liblcms2-dev

After that, switch to a regular user account and run:

virtualenv --distribute --no-site-packages ~/pp
export PATH=~/pp/bin:$PATH
hash -r
export PIP_INDEX_URL=http://pypi.pediapress.com/simple/
pip install pyfribidi mwlib mwlib.rl

Install texvc:

git clone https://github.com/pediapress/texvc
cd texvc; make; make install PREFIX=~/pp

Custom render server

To run a custom render server (for example, if you have a local MediaWiki instance), install mwlib as described above and then follow the instructions at [1].

According to one contributor, the following commands must be executed for the server to run:

mw-qserve
nserve.py
nslave.py --cachedir ~/cache/
postman.py

Alternatively, you can run all of these commands in one line (tested on Ubuntu):

nserve & mw-qserve & nslave --cachedir ~/cache/ & postman &

Once the above commands are running on your render server and your LocalSettings.php is correct, you should be able to print to PDF using the Collection extension. As another contributor suggested, this may be voodoo, but if errors occur, restarting the Linux server and re-entering these commands sometimes helps.

You can put them in a shell script to make startup easier. With these default commands, the render server listens on 127.0.0.1:8899. Configuring LocalSettings.php is less straightforward, however, if your MediaWiki instance runs on localhost.
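A start script for the four processes might look like the following. This is only a minimal sketch: the CACHEDIR, LOGDIR and DRY_RUN variables and the start_service helper are hypothetical names introduced here for illustration, and the command names are taken from the one-line variant above.

```shell
#!/bin/sh
# Sketch: start the four render-server processes in the background.
# Hypothetical names (not from this page): CACHEDIR, LOGDIR, DRY_RUN,
# and the start_service helper.

CACHEDIR="${CACHEDIR:-$HOME/cache}"
LOGDIR="${LOGDIR:-$HOME/mwlib-logs}"
DRY_RUN="${DRY_RUN:-0}"

# start_service NAME [ARGS...]
# Runs NAME with ARGS in the background and logs to $LOGDIR/NAME.log.
start_service() {
    name="$1"
    if [ "$DRY_RUN" = "1" ]; then
        # Only show what would be run.
        echo "would start: $*"
        return 0
    fi
    if ! command -v "$name" >/dev/null 2>&1; then
        # Warn and carry on so the remaining services are still attempted.
        echo "skipping $name: command not found on PATH" >&2
        return 0
    fi
    nohup "$@" >"$LOGDIR/$name.log" 2>&1 &
    echo "started $name (pid $!)"
}

mkdir -p "$CACHEDIR" "$LOGDIR"

start_service nserve
start_service mw-qserve
start_service nslave --cachedir "$CACHEDIR"
start_service postman
```

Running the script with DRY_RUN=1 prints the commands without starting anything, which can help when checking that the virtualenv's bin directory is on PATH.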

For PDF generation to work, you have to set the following variables in LocalSettings.php:

// Your MediaWiki server (near the beginning of the file)
$wgServer="http://LAN_IP | PUBLIC_IP | HOST_NAME";
/*
Note:
As far as I know (from the mailing list), localhost, 127.0.xx and 192.168.xx won't work, for security reasons.
I tested successfully with $wgServer="http://10.0.0.110";
*/

// This goes after the line that includes the Collection extension, usually at the bottom of the file (localhost is allowed here)
$wgCollectionMWServeURL = 'http://127.0.0.1:8899';

Mailing List

We have set up a Google group for discussion of mwlib.rl. You can subscribe by sending an email to mwlib-subscribe@googlegroups.com.

Help Needed

Please help us translate some strings used in the generated PDF. The process of internationalisation is done at translatewiki.net. We appreciate your help there.

Programs

mwlib installs the following programs:

mw-render
generates documents in formats like PDF or ODF from MediaWiki articles
mw-zip
generates ZIP files from MediaWiki articles that contain all the information needed to produce an output document such as a PDF file
mw-serve
starts a render server that allows the Collection extension to render documents from article collections

Configuration

If your MediaWiki installation has the MediaWiki API enabled, you only need to specify the base URL of the wiki as the configuration. For example, using the English Wikipedia, the command

$ mw-render --config http://en.wikipedia.org/w/ --username='xxxx' --password='yyyy' --output test.pdf --writer rl Physics

will produce a PDF document containing the article Physics.
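The same invocation can be wrapped in a small loop to render several articles into separate files. The following is a sketch under stated assumptions: WIKI_BASE, OUTDIR, DRY_RUN and render_article are hypothetical names introduced for illustration, only the mw-render options shown above (--config, --output, --writer) are used, and the --username and --password options are left out (add them if your wiki requires a login).

```shell
#!/bin/sh
# Sketch: render several articles to separate PDFs with mw-render.
# Hypothetical names (not from this page): WIKI_BASE, OUTDIR, DRY_RUN,
# and the render_article helper.

WIKI_BASE="${WIKI_BASE:-http://en.wikipedia.org/w/}"
OUTDIR="${OUTDIR:-.}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

# render_article TITLE
# Writes $OUTDIR/<TITLE with spaces replaced by underscores>.pdf
render_article() {
    article="$1"
    outfile="$OUTDIR/$(printf '%s' "$article" | tr ' ' '_').pdf"
    if [ "$DRY_RUN" = "1" ]; then
        echo "mw-render --config $WIKI_BASE --output $outfile --writer rl $article"
    else
        mw-render --config "$WIKI_BASE" --output "$outfile" --writer rl "$article"
    fi
}

for title in "Physics" "Solar System"; do
    render_article "$title"
done
```

With the default DRY_RUN=1 the script only prints the commands it would run; set DRY_RUN=0 to actually call mw-render.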

Customization

The resulting PDFs can be customized; for more information, see README.rst.

See also