# Extension talk:PdfBook

Extension:Pdf Export is very similar to this extension, and so the discussion at Extension talk:Pdf Export may have solutions to problems you're having with this extension.

## PdfBook images fix for htmldoc

At about line 101 in extensions/PdfBook/PdfBook.hooks.php just before

 

 // Write the HTML to a tmp file 

insert the following:

 

 $src_str = 'src="' . $wgServer . '/images/';
 // Use this instead if the wiki is not at the webroot:
 // $src_str = 'src="' . $wgServer . '/' . $wgScriptPath . '/images/';
 $html = str_replace( $src_str, 'src="', $html );

The <img src=...> URLs should be relative to the images folder for images to display correctly in the generated PDF.
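To illustrate what the replacement does, here is a small Python sketch of the same string substitution (the extension itself is PHP; the server URL here is a hypothetical example):

```python
def make_image_srcs_relative(html, server, script_path=None):
    """Strip the server prefix so <img src> paths are relative to the images folder."""
    if script_path:
        # Wiki not at the webroot: src="https://host/w/images/..." -> src="..."
        src_str = 'src="' + server + "/" + script_path + "/images/"
    else:
        # Wiki at the webroot: src="https://host/images/..." -> src="..."
        src_str = 'src="' + server + "/images/"
    return html.replace(src_str, 'src="')

html = '<img src="https://wiki.example.org/images/a/b/Pic.png">'
print(make_image_srcs_relative(html, "https://wiki.example.org"))
# <img src="a/b/Pic.png">
```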

The following is a sample of the CLI command that gets generated and executed in the PdfBook extension:

 

 htmldoc -t pdf --charset iso-8859-1 \
   --left 1cm --top 1cm --bottom 1cm --right 1cm --header ... --footer .1. \
   --toclevels 3 --headfootsize 8 --quiet --jpeg --color --bodyfont Arial \
   --fontsize 8 --fontspacing 1 --linkstyle plain --linkcolor 217A28 \
   --no-title --format pdf14 --numbered --firstpage toc \
   -f output1.pdf images/pdf-book577686f7d5ca5

This has been tested in MW v1.19.24 on Debian Jessie and htmldoc v1.18.27.

## PdfBook seems not to be working

You can export a single article as a one-page PDF by setting format=single in the query-string. Example:

When I do this I get the message:

Whereas Main Page has a lot of text. ;-) What am I overlooking? I installed the latest PdfBook (according to Special:Version: 1.1.0, 2014-04-01) in MW 1.23.13. But the 1.1.0 comes from a PdfBook dated Jan. 9th, 2016, and the corresponding file 'version' says:

 [root@node PdfBook]# cat version
 PdfBook: 17d1dfd8475ac21b81a60c3f82afe58fde9d47bb
 2016-01-09T23:07:34
 17d1dfd

htmldoc has been installed.

## Missing Images in Https with authentication

--Johnp125 13:52, 11 June 2008 (UTC)

Our SSL doesn't seem to be functional at the moment, but can you check the URL its trying to load the images from? maybe check if exporting as raw html instead of pdf also has problem images? --Nad 07:11, 12 June 2008 (UTC)

--Johnp125 15:25, 12 June 2008 (UTC)

I can get the HTML file to show the pictures, but it wants to authenticate again with the authentication server, or IE says to allow blocked content, which I click on before trying to sign on again. The server is not allowing me to sign on again, but the pictures still show up.

--Johnp125 14:57, 29 July 2008 (UTC)

I think the problem may be a security issue. Is there a way to generate the data without requesting authentication from the web server? I can get the html version to show the pictures just not the pdf version. If I go to the back door and access the site via http then the pictures show up via pdf.

Sdball 17:36, 6 November 2008 (UTC)

I had the same problem, so I tweaked the extension to:

• use files in /tmp so image references can work
• search the generated html for images
• determine the actual path to the image from their url, e.g. https://server.com/wiki/images/a/b/image.jpg -> /www/wiki/images/a/b/image.jpg
• use that path to copy the image file to /tmp
• modify the generated html to point to the image file, not the absolute url, e.g. src=https://server.com/wiki/images/a/b/image.jpg -> src=image.jpg
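A minimal Python sketch of the steps above (the real tweak is PHP; the document root and paths here are assumed for illustration):

```python
import os
import shutil
from urllib.parse import urlparse

def localise_image(url, doc_root, tmp_dir):
    """Map an image URL to its path on disk, copy the file into tmp_dir,
    and return the bare filename to use as the new src attribute."""
    path = urlparse(url).path                 # e.g. /wiki/images/a/b/image.jpg
    local = doc_root + path                   # e.g. /www/wiki/images/a/b/image.jpg
    name = os.path.basename(path)             # image.jpg
    shutil.copy(local, os.path.join(tmp_dir, name))
    return name

# The generated HTML would then be rewritten with something like:
# html = html.replace('src="%s"' % url, 'src="%s"' % name)
```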

Feel free to contact me if you'd like the code.

74.143.96.50 20:36, 12 February 2010 (UTC)

Same problem here, different solution. htmldoc uses a unique User-Agent string when hitting the web server. With Apache you can do something like this in your apache config:

SetEnvIf User-Agent ^HTMLDOC let_me_in
.. basic auth stuff ..
require valid-user
Order allow,deny
Allow from env=let_me_in
Satisfy Any


Note that this is a significant security hole, since anyone hitting your server with that User-Agent string can now get in. You may want to combine (or possibly replace) it with a filter based on the IP address that htmldoc hits your server from. If the requests always come from 127.0.0.1, for instance, you can Allow from 127.0.0.1 to let them pass. You could also change the htmldoc binary to use your own special User-Agent string. MediaWiki shouldn't care that the user is anonymous unless you've forced off anonymous access somehow beyond the authentication.

• is it possible to specify the page number to start with? This makes sense when you are going to use the exported PDF as an appendix to another doc that already has n pages.
• is it possible to add a link in the toolbox menu section which is only viewable on category pages?

 # Get i18n file
 require_once( 'PdfBook.i18n.php' );

 $wgHooks['MonoBookTemplateToolboxEnd'][] = 'fnPDFBookLink';
 function fnPDFBookLink( &$monobook ) {
     global $wgMessageCache, $wgPdfBookMessages;
     foreach( $wgPdfBookMessages as $lang => $messages ) {
         $wgMessageCache->addMessages( $messages, $lang );
     }
     $thispage = $monobook->data['thispage']; // e.g. "Category:Wiki"
     $nsnumber = $monobook->data['nsnumber']; // NS 14 is category

     if ( $nsnumber == 14 ) {
         echo "\n\t\t\t\t<li><a href=\"./$thispage?action=pdfbook\">";
         $monobook->msg( 'pdf_book_link' );
         echo "</a></li>\n";
     }
     return true;
 }

And add an i18n file named PdfBook.i18n.php with the following contents:

 <?php
 $wgPdfBookMessages = array();

 $wgPdfBookMessages['de'] = array(
     'pdfbook'       => 'Pdf-Druck',
     'pdf_book_link' => 'Kategorie als PDF ausgeben'
 );
 $wgPdfBookMessages['en'] = array(
     'pdfbook'       => 'PdfPrint',
     'pdf_book_link' => 'Print category as PDF'
 );
 ?>

• Does anyone know what the code would be to add the link into the sidebar for the vector skin?
This worked brilliantly for the Monobook skin, but I want to use it with the Vector skin in the toolbar. If you have the code please let me know, thanks guys. Nali99.
Found the solution: basically, use the above code exactly, but replace 'MonoBookTemplateToolboxEnd' with 'SkinTemplateToolboxEnd' and replace '$monobook' with '$vector'. Your code should look like this:

 # Create toolbox link
 $wgHooks['SkinTemplateToolboxEnd'][] = 'fnPDFBookLink';
 function fnPDFBookLink( &$vector ) {
     global $wgMessageCache, $wgPdfBookMessages;
     foreach( $wgPdfBookMessages as $lang => $messages ) {
         $wgMessageCache->addMessages( $messages, $lang );
     }
     $thispage = $vector->data['thispage']; // e.g. "Category:Wiki"
     $nsnumber = $vector->data['nsnumber']; // NS 14 is category

     if ( $nsnumber == 14 ) {
         echo "\n\t\t\t\t<li><a href=\"./$thispage?action=pdfbook\">";
         $vector->msg( 'pdf_book_link' );
         echo "</a></li>\n";
     }
     return true;
 }

By Nali_99

## Mediawiki 1.11.0

Version 0.0.3 didn't work anymore after an upgrade. I made a little fix around line 98 of PdfBook.php and it works again.

 // while ($row = mysql_fetch_row($result)) {
 while ($row = $db->fetchRow($result)) {


Disclaimer: I don't know PHP for real, don't know MediaWiki, and don't know how to program. I just got this by inserting debug statements into PdfBook.php. Looks like mysql_fetch_row is censored somewhere now ;)

PS: To insert debug statements:

• In LocalSettings.php insert:
 $wgDebugLogFile = "/tmp/debug.log"; // file should be writable; can be anywhere

• Anywhere in the code, insert:

 wfDebug( "....." );

- Daniel (edutechwiki.unige.ch)

Thanks a lot for this, it's still not working for me in 1.11 (I've only just done my 1.11 upgrade), but I've made some changes based on your findings which have got it partially there ;-) --Nad 21:36, 21 September 2007 (UTC)

It seems that 1.11 is a bit more memory hungry and my large test books were killing it; after giving PHP 64MB it's working fine now! --Nad 21:41, 21 September 2007 (UTC)

## Empty file downloaded

Greetings Nad, I have been trying to use your PdfBook MediaWiki extension since it may be a great solution to an issue I have. I have installed HTMLDOC under "c:\program files" and can use it on its own to create PDF books. I have also included "PdfBook.php" in my "LocalSettings.php" file. The issue I am having is that when I select the link to export my category as a book and choose to save or open the PDF file, it has 0 bytes. So the file is created with the correct name but with no data. Is there something else I must do to ensure HTMLDOC.exe is actually being called by your extension? Is there a required directory that it needs to be in? Any help would be appreciated! Thanks!

You have to make sure that htmldoc is in your executable PATH so that it can be executed by just typing "htmldoc", without needing to supply the full pathname, no matter what the current directory is. Another thing to check would be to comment out the "@unlink($file)" line and, after saving a PDF, check whether it has left a tmp file in the root of your images directory, which is the data sent to htmldoc. --Nad 00:35, 6 September 2007 (UTC)
I'm experiencing the exact same problem; my files turn up empty. I run the server on a Windows machine using Apache. I've installed HTMLDOC and I'm able to create PDF files using the GUI. If I comment out "@unlink($file)" and then generate the PDF from the tmp file through the GUI I get my PDF, but all files I download are 0 bytes in size... What can be wrong? /Jesper 15:59, 23 October 2007 (UTC)

With some hacking of Pdf_Book.php I'm now able to create PDFs, but only from categories, not from a single page. By commenting out "putenv("HTMLDOC_NOCGI=1");" on line 152 it now generates category PDFs. /Jesper 08:09, 25 October 2007 (UTC)

I can't even get this far. Did you make any changes other than commenting out that one line? Has anybody else gotten this to work on an Apache server running on Windows? -Michelle 19:19, 1 May 2008 (UTC)

Works for me with WAMP and MediaWiki 1.3. I had to copy libeay32.dll and ssleay32.dll from C:\wamp\Apache2\bin to C:\Program Files\HTMLDOC in order to get HTMLDoc working. I also had to restart Apache to make it refresh the PATH environment variable; before the restart it couldn't find HTMLDoc. I also had to copy the 7.1 C runtime DLLs (msvcr71.dll/msvcp71.dll) to the HTMLDOC folder. You can find them here: http://support.microsoft.com/kb/326922 Antdos (talk) 10:11, 19 July 2012 (UTC)

Make sure your webserver user has write access to /var/tmp; on my setup, htmldoc uses this as a tmp directory. You can diagnose this sort of issue by changing the htmldoc command to something like:

 strace htmldoc > $file.log
For macOS 10.12: HTMLDOC is installed to /usr/local/bin. If you are using the built-in Apache server, this directory won't be in the PATH, so /usr/local/bin/htmldoc cannot be found by the PdfBook extension. Follow the steps outlined in https://serverfault.com/a/827046/434690 --Frankhintsch (talk) 10:07, 8 September 2017 (UTC)

to

 $cmd = "htmldoc -t pdf --charset iso-8859-1 $cmd $file > test.pdf";

Then I get a test.pdf in my mediawiki root folder which works perfectly.

You could try changing the htmldoc command to use passthru like Extension:Pdf Export - I had it like that on mine but had problems with the gzip encoding, but it may work better like that for you --Nad 21:55, 6 September 2007 (UTC)

## images in the pdf Book?

Is there any possibility of getting images displayed in the pdf Book as well? It would be a fantastic improvement. Any workarounds? Martin

I'm working on it, I just can't get them to work currently. I'm checking out some of the solutions at Extension talk:Pdf Export too, as that one uses htmldoc as well. --Nad 12:39, 12 September 2007 (UTC)

Nad, thanks for your great work. I made some fixes to your extension and got it to work correctly with images, even with a secure server, without modifying .htaccess. The points are:

• when generating html output only, links to images can stay absolute, as currently.
• when generating pdf output, links to images should be converted to relative links to the temp file (pdf-book-something in $IP/images)
• --browserwidth could be a workaround when you have only large images, but it would make your small images too small when your image sizes vary a lot. My solution is to rescale large images to fit the page: pick up the image width and height from the html output, and if they are too big for the paper size, adjust width="x%", with x depending on the ratio width/maxWidth and height/maxHeight.
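The rescaling rule described above can be sketched like this in Python (the 680-pixel width corresponds to htmldoc's default --browserwidth; the page-height limit here is an assumed example):

```python
def scaled_width_percent(width, height, max_width=680, max_height=900):
    """Return a width="x%" attribute if the image exceeds the page, else None."""
    # The dominant overshoot (width or height) decides the scale factor.
    ratio = max(width / max_width, height / max_height)
    if ratio <= 1:
        return None  # the image already fits
    return 'width="%d%%"' % round(100 / ratio)

print(scaled_width_percent(1360, 450))   # twice too wide -> width="50%"
print(scaled_width_percent(300, 300))    # fits -> None
```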

Hope this helps. Just tell me if you'd want me to send you my codes. Lechau 02:20, 6 June 2008 (UTC)

### A hack

In file PdfBook.hooks.php, around line 101 (I may have inserted other stuff), just before "# Write the HTML to a tmp file" insert this:

 $ori_string  = 'src="';
 $repl_string = 'src="' . $wgServer;
 $html = str_replace( $ori_string, $repl_string, $html );

 # Write the HTML to a tmp file

The problem is that the intermediary output file got stuff like this:

 src="/mediawiki/images/thumb/pict.png

but you want:

 src="http://your.server.org/mediawiki/images/thumb/pict.png

This is not the best solution; a regexp hacker should actually rip away most of the html picture markup and then maybe replace the thumb by the original pic. But the above at least does a minimal job. To see the intermediary file, as someone said, comment out the unlink at the end and then get it from the images folder:

 //@unlink($file);
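As a sketch of the regexp idea mentioned above (swapping thumbnails for the originals), MediaWiki's usual thumb layout /images/thumb/h/hh/File/size-File can be rewritten like this in Python; this is an illustrative assumption, not part of the extension:

```python
import re

def thumbs_to_originals(html):
    """Rewrite MediaWiki thumb URLs to point at the original images.

    Assumes the standard hashed upload layout:
    /images/thumb/a/ab/Pic.png/300px-Pic.png -> /images/a/ab/Pic.png
    """
    return re.sub(
        r'/images/thumb/([0-9a-f])/([0-9a-f]{2})/([^/"]+)/[^/"]+',
        r'/images/\1/\2/\3',
        html,
    )

print(thumbs_to_originals('src="/images/thumb/a/ab/Pic.png/300px-Pic.png"'))
# src="/images/a/ab/Pic.png"
```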


Sorry, I'm not a real programmer and have too much workload to help for real. Just wanted to produce some handouts ;) - Daniel

### only border and image link is displayed (mw 1.16.4, PHP 5.2.17 (cgi-fcgi))

I did not find the often-mentioned ./images folder, only the images folder in the wiki root. Any ideas?

## Same problem as section 2

I'm on Ubuntu Linux with Mediawiki 1.10. Htmldoc is in /usr/bin. I commented out the unlink command, and the temp file is empty (0 length).

I checked to be sure that my Apache user can run htmldoc -- it can. Unsure what I should try next.

By the way, your single-page export plugin works perfectly (even for images). So I know that htmldoc is not at fault here.

I didn't write the single-page one, but the code seems pretty similar. I'll just have to see what differences there are between this one and the single-page one. --Nad 22:28, 14 September 2007 (UTC)

What happens when pdf is not a valid file type for uploading? Does the wiki control this with this extension? If so, do I need to add pdf to the types of files you can upload?

The upload filetype is unrelated to this, since exported PDFs are downloaded, not uploaded. If you want to add pdf to your allowed upload filetypes, use $wgFileExtensions[] = 'pdf'; you may also want to set $wgVerifyMimeType to false if it gives you hassles when you try to upload exotic types of file. --Nad 04:11, 21 September 2007 (UTC)

--Johnp125 02:12, 25 September 2007 (UTC)

Sorry to be such a pain. I have set up a test wiki which is running Fedora Core 4. Please check out my test wiki and see if you can give me some direction. I have debug for the wiki turned on in LocalSettings.php. If you need admin access please email me at johnp125@yahoo.com and I'll hook you up.

The output shows a bug due to 1.11 being more strict about hook return values. Try again now with the latest version, 0.0.4. Also note that even if it works, you will get just an empty document, since the point of this extension is to compose a book from the content of a category; if it is not placed in a category, or the category contains no members, then the result will be empty. To export the content of a single page you should be using Extension:Pdf Export. --Nad 03:33, 25 September 2007 (UTC)
However, I'm working on version 0.5 now which can be used in non-category pages and will compose the book from the article links found in the page, so that books can then be composed from explicit lists or DPL queries. --Nad 03:33, 25 September 2007 (UTC)

--Johnp125 13:28, 25 September 2007 (UTC)

http://wikitest.homelinux.net/wiki2/index.php?title=Category:test&action=pdfbook — this one should be going after the demo page with Category:test and then creating a pdf book from that. Is this not the right way to use the code? I know if I created more pages and put Category:test on them they would get put into the pdf file as well.

You had a typo in the word "category", link is working now ;-) --Nad 22:21, 25 September 2007 (UTC)

--Johnp125 17:30, 26 September 2007 (UTC)

Thanks a bunch. You're the greatest. Glad to have this working now.

Checked out your info about images not showing in MediaWiki 1.10.2–1.11. Nice work.

I just did another update yesterday which has images working now --Nad 21:06, 26 September 2007 (UTC)

--Johnp125 00:16, 27 September 2007 (UTC)

Is this the update that is going to work with DPL queries? I started to play around with that extension. I know it's working but right now it's too big to try and figure out.

--Johnp125 00:23, 27 September 2007 (UTC)

Hey, by the way, could you tell me how to make the PdfBook extension just make a big html file, so I could open it in Word or OpenOffice in html format and let the office program convert it from there? Or is it easier to say than to do?

That feature is very easy to add because it simply requires not sending the file to HTMLDOC, I've added an option in a new version (0.0.7) which allows you to do this by adding format=html to the query-string. --Nad 02:06, 27 September 2007 (UTC)

--Johnp125 22:04, 30 September 2007 (UTC)

Wow, that sounds great; I can't wait to try out the html export. I looked for the 0.0.7 version but only saw 0.0.6 when I went to the download section. Also, could you give me an example of how format=html is used?

Where would it go in this string?

Sorry about that I must have forgotten to update it, it's at 0.0.7 now. To change the URL above to produce html, append &format=html to it. We use a template which has a link for both, see OrganicDesign:Template:Book. --Nad 07:11, 1 October 2007 (UTC)

--Johnp125 01:55, 2 October 2007 (UTC)

The html export looks really good. I did notice that on small html files Microsoft Word gets confused. Maybe put the html header info at the top and bottom of the page to help Word out. OpenOffice did not seem to have a problem with it; however, Word looks for the html tags on small exports. If it's a big export, it gets the idea.

--Johnp125 02:08, 2 October 2007 (UTC)

Just tested it again with a small html download. Word tried to format it when opening. Then I added <html> at the beginning and </html> at the end, reopened the file with Word, and bingo, it worked fine. Maybe something to add in 0.0.8? OpenOffice worked either way.

Keep up the good work. This is the best extension for wiki out there right now.
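The wrapping workaround described above can be sketched as a tiny Python helper (the tag check is an illustrative assumption, not part of the extension):

```python
def wrap_for_word(fragment):
    """Wrap a bare HTML fragment in <html> tags so Word recognises it."""
    if fragment.lstrip().lower().startswith("<html"):
        return fragment  # already wrapped
    return "<html>\n" + fragment + "\n</html>"

print(wrap_for_word("<p>small export</p>"))
```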

If you have larger texts, don't forget to change your server settings. E.g. for a 2000-page document produced with a low-end 2-CPU SPARC box I use this in php.ini:

max_execution_time = 600
max_input_time = 600
memory_limit = 100M


and this in http.conf:

Timeout 600


Otherwise you just get a blank page without any warning or error message. - Daniel K. Schneider 11:00, 20 June 2008 (UTC)

## Hacks to change PDF output (v. 0.6)

• Images: if they don't fit your PDF page, you have to set the pixel width of a virtual browser page (that's a "feature" of htmldoc). By default it is only 680 pixels, and images larger than that will be rendered larger than your PDF page! Lots of my pictures are...
• Titlepage: If you want a standardized titlepage before the TOC, create it in HTML and put it somewhere in your file system. I just put it in the images directory.

Then change PdfBook.php like this for example:

$cmdext = " --browserwidth 1000 --titlefile $wgUploadDirectory/PDFBook.html";
$cmd = "htmldoc -t pdf --charset iso-8859-1 $cmd $cmdext $file";


Basically, I found it a good idea to read the htmldoc manual; on my Unix system it sits in /usr/local/share/doc/htmldoc/htmldoc.pdf (see chapter 8). I made other changes too.

Now of course Nad may at some point add some more options, but changing a line in the php file does it too :) - Daniel (edutechwiki.unige.ch)

 $ori_string  = 'id="toc"';
 $repl_string = 'id="toc" style="visibility: collapse;"';
 $html = str_replace( $ori_string, $repl_string, $html );

After "# If format=html in query-string, return html content directly" the TOC disappears in the HTML file, but I can't get the same thing to work with the PDF. //Jesper 85.89.79.106 07:00, 1 November 2007 (UTC)

Good point; it's not useful to have a TOC when it's a book which already has a TOC - I've updated it to add a __NOTOC__ before parsing each article --Nad 07:58, 1 November 2007 (UTC)

Ah, thanks Nad! That was a fast reply and I really appreciate it! //Jesper 85.89.79.106 08:31, 1 November 2007 (UTC)

## no index pages

--Johnp125 16:59, 8 November 2007 (UTC)

Is there any way to run the query and not create any autogenerated index pages or put the index number in the text?

--Johnp125 18:26, 8 November 2007 (UTC)

OK, just checked out the new html version 0.9. This does what I would like it to do; images work and everything. I was having problems with the images because we have an alias for the wiki (/wiki/index.php): when you run pdfbook in pdf format I think it cannot find /wiki/picture.jpg instead of /picture.jpg. Anyway, the new html version works just fine.

## Header info

--Johnp125 18:31, 8 November 2007 (UTC)

I know this question is off on a limb, but is there any way I could stop certain headline text from being pulled, based on a name like "Image Header"?

## Missing end tag in 0.0.9 source code

Just for the record: it seems that the page at Organic Design which lists the v0.0.9 source code is missing a php end tag at the bottom of the file. Cheers, Lexw 09:23, 13 November 2007 (UTC)

End delimiters are removed to avoid whitespace being sent to the output - unfortunately I can't find the link to the official bug report about it. --Nad 19:59, 13 November 2007 (UTC)

## Additional functionality in PdfBook

Hi Nad, I have added some additional functionality into PdfBook that you might be interested in for a next version.
Seems that you have switched off email (which I can understand), so I couldn't contact you that way. Please contact me by email via 'E-mail this user' if you are interested. Other users: please don't contact me; I might come back to this topic later, but first I want to discuss things with Nad. Regards, Lexw 13:39, 15 November 2007 (UTC)

## Added recursive follow functionality

Hi Nad, I'm using your PdfBook extension and I've added some functionality to recursively follow links to produce a PDF. With the parameter follow=deep or follow=broad the created PDF will contain all pages that are referenced from the current page, and recursively all further referenced pages, in a depth-first or breadth-first manner. Here are the relevant code snippets:

 if( $title->getNamespace() == NS_CATEGORY ) {
     $cat    = $title->getDBkey();
     $db     = &wfGetDB( DB_SLAVE );
     $cl     = $db->tableName( 'categorylinks' );
     $result = $db->query( "SELECT cl_from FROM $cl WHERE cl_to = '$cat' ORDER BY cl_sortkey" );
     if( $result instanceof ResultWrapper ) $result = $result->result;
     while( $row = $db->fetchRow( $result ) ) $articles[] = Title::newFromID( $row[0] );
 }
 else if( isset( $_REQUEST['follow'] ) ) {
     $deep = $_REQUEST['follow'] == 'deep';
     wfDebug( "PdfBook: following links - " . ( $deep ? "depth first\n" : "breadth first\n" ) );
     $articles[] = $title;
     wfDebug( "PdfBook: adding page '" . $title->getText() . "'\n" );
     $this->getLinkedArticles( $articles, $article, $opt, $deep );
 }
 else {
     $text = $article->fetchContent();
     $text = $wgParser->preprocess( $text, $title, $opt );
     if( preg_match_all( '/^\\*\\s*\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m', $text, $links ) )
         foreach( $links[1] as $link ) $articles[] = Title::newFromText( $link );
 }

 function getLinkedArticles( &$articles, $article, $opt, $deep ) {
     global $wgParser;
     $text = $article->fetchContent();
     $text = $wgParser->preprocess( $text, $article->getTitle(), $opt );
     $linktitles = array();
     wfDebug( "PdfBook: ----- processing article '" . $article->getTitle()->getText() . "' ($deep)\n" );
     if( preg_match_all( '/\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m', $text, $links ) ) {
         foreach( $links[1] as $link ) {
             $linktitles[] = Title::newFromText( $link );
             wfDebug( "PdfBook: found link '" . $link . "'\n" );
         }
     }
     wfDebug( "PdfBook: processing " . count( $linktitles ) . " links...\n" );
     if( $deep ) {
         // depth-first: recurse into each new link immediately
         foreach( $linktitles as $linktitle ) {
             $exists = false;
             foreach( $articles as $el ) {
                 if( $el->getText() == $linktitle->getText() ) $exists = true;
             }
             if( !$exists ) {
                 wfDebug( "PdfBook: adding '" . $linktitle->getPrefixedText() . "'\n" );
                 $articles[] = $linktitle;
                 $art = new Article( $linktitle );
                 $this->getLinkedArticles( $articles, $art, $opt, $deep );
                 wfDebug( "----- <\n" );
             }
         }
     } else {
         // breadth-first: collect all new links on this level first, then recurse
         $newlinktitles = array();
         foreach( $linktitles as $linktitle ) {
             $exists = false;
             foreach( $articles as $el ) {
                 if( $el->getText() == $linktitle->getText() ) $exists = true;
             }
             if( !$exists ) {
                 wfDebug( "PdfBook: adding '" . $linktitle->getText() . "'\n" );
                 $articles[] = $linktitle;
                 $newlinktitles[] = $linktitle;
             }
         }
         foreach( $newlinktitles as $linktitle ) {
             wfDebug( "PdfBook: adding subpages of '" . $linktitle->getText() . "'\n" );
             $art = new Article( $linktitle );
             $this->getLinkedArticles( $articles, $art, $opt, $deep );
         }
     }
 }


I can also send you the complete file if you want. Tbleier 2008-01-25
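For readers who prefer a compact view, the follow=deep / follow=broad distinction above boils down to depth-first versus breadth-first traversal of the link graph. Here is a small Python sketch (the real code is the PHP getLinkedArticles method; the graph here is a made-up example):

```python
def collect(start, links_of, deep):
    """Collect pages reachable from start: depth-first if deep, else breadth-first."""
    articles = [start]
    if deep:
        def walk(page):
            for link in links_of(page):
                if link not in articles:
                    articles.append(link)
                    walk(link)        # recurse immediately (follow=deep)
        walk(start)
    else:
        frontier = [start]
        while frontier:
            new = []
            for page in frontier:     # finish one level before the next (follow=broad)
                for link in links_of(page):
                    if link not in articles:
                        articles.append(link)
                        new.append(link)
            frontier = new
    return articles

graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(collect("A", graph.get, True))    # ['A', 'B', 'D', 'C']
print(collect("A", graph.get, False))   # ['A', 'B', 'C', 'D']
```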

In order to have a proper title page on the generated PDF, I've added a few lines of code that read a plain HTML file and replace some placeholders with values like "Category name", etc... and then use that file with htmldoc's otherwise static "--titlefile" option.

Additionally, I've added 2 new variables: $wgPdfBookTitleFile and$wgPdfBookLogoImage so one can easily select a title page and logo image (to display at the bottom of a page).

I'll make a small package and put it on some webserver instead of posting the code here (too messy already). :) The rooker 14:00, 20 February 2008 (UTC)

That is exactly what I have done and wanted to discuss with Nad (see above), but he doesn't seem to react. I've gone a little further and now create the titlefile dynamically from the PdfBook extension, so there is no more external HTML file necessary for generating the title page. A logo file was included in my implementation too (only I added it to the header, not the footer, but that's a matter of configuration which can be overruled in the general wiki LocalSettings.php).
Since this implementation is not part of the "official" PdfBook extension, I will have to find a place to store it, if anyone is interested. Rooker, have you already stored your solution somewhere? Lexw 09:27, 8 April 2008 (UTC)
@Lexw: I've provided a quickly cleaned version including my modifications. See the "README.txt" inside for details: PdfBook-0.0.9-DynamicTitle.tar.bz2 The rooker 10:57, 17 April 2008 (UTC)
Thank you for this; I am using this part of the code. Anyone got any ideas about how I can include headers and footers on every page of the pdf? --194.169.24.100 16:48, 19 June 2009 (UTC)
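The dynamic title-page idea discussed above can be sketched as a simple placeholder substitution (the placeholder names and template string are hypothetical; htmldoc would then be pointed at the rendered file via its --titlefile option):

```python
def render_titlefile(template, values):
    """Replace {{NAME}} placeholders in a title-page HTML template."""
    for key, val in values.items():
        template = template.replace("{{%s}}" % key, val)
    return template

page = render_titlefile(
    "<h1>{{CATEGORY}}</h1><p>Generated {{DATE}}</p>",
    {"CATEGORY": "Manuals", "DATE": "2008-02-20"},
)
print(page)  # <h1>Manuals</h1><p>Generated 2008-02-20</p>
```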

## PHP compilation error

Hello,

I'm trying to install version 0.0.9 on a Red Hat Enterprise Linux ES 4 box running MediaWiki 1.6.8 with PHP 4.3.9. pdf-book.php has been copied into the "extensions" directory, then included via LocalSettings:

require_once( "extensions/pdf-book.php" );


and we have this error :

Parse error: parse error, unexpected T_OBJECT_OPERATOR in /var/wwwwikitn/html/mediawiki-1.6.8/extensions/pdf-book.php on line 66


which is

$msg =$wgUser->getUserPage()->getPrefixedText().' exported as a PDF book';


Any idea? Thanks!

Just a wild guess: PHP5 needed? Lexw 07:49, 17 April 2008 (UTC)

## Problem in pdfbook if only current page should be converted

I had the problem that if I use

no PDF was produced, because the temporary html file was empty.

I had to add the following line to the else block of if ($title->getNamespace() == NS_CATEGORY):

 $articles[] = $title;

Now it works. The new code looks like this:

 if( $title->getNamespace() == NS_CATEGORY ) {
     $cat    = $title->getDBkey();
     $db     = &wfGetDB( DB_SLAVE );
     $cl     = $db->tableName( 'categorylinks' );
     $result = $db->query( "SELECT cl_from FROM $cl WHERE cl_to = '$cat' ORDER BY cl_sortkey" );
     if( $result instanceof ResultWrapper ) $result = $result->result;
     while( $row = $db->fetchRow( $result ) ) $articles[] = Title::newFromID( $row[0] );
 }
 else {
     $text = $article->fetchContent();
     $text = $wgParser->preprocess( $text, $title, $opt );
     if( preg_match_all( '/^\\*\\s*\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m', $text, $links ) )
         foreach( $links[1] as $link ) $articles[] = Title::newFromText( $link );
     $articles[] = $title;
 }


--Guenterg 11:31, 28 March 2008 (UTC)

I had the same problem, your fix works for me too. Thanks a lot, Guenter!
Now I still have to find a way to avoid the PdfBook template itself being included in the PDF document when I place that template on an article rather than on a category. But that's a different matter... Lexw 09:16, 8 April 2008 (UTC)

-- This modification works pretty well, though it produces a funny numbering scheme for the page index. RHEL 5 / PHP 5.1.6 / LAMP / MW 1.12

## Whole Namespace Export

The tweak below will allow export of a whole namespace, e.g. "Talk", through the additional action "nspdfbook", e.g.

 http://localhost/wiki/index.php?title=Talk:Main_Page&action=nspdfbook


Note:

• You may have to raise the "; Resource Limits ;" settings in your php.ini if you use the mod to export all articles.
• You may wish to alter the ORDER BY to sort on page name rather than id.
 public static function onUnknownAction( $action, $article ) {
     global $wgOut, $wgUser, $wgParser, $wgRequest;
     global $wgServer, $wgArticlePath, $wgScriptPath, $wgUploadPath, $wgUploadDirectory, $wgScript;

     if( $action == 'pdfbook' || $action == 'nspdfbook' ) {

         $title = $article->getTitle();
         $opt   = ParserOptions::newFromUser( $wgUser );

         // Log the export
         $msg = wfMsg( 'pdfbook-log', $wgUser->getUserPage()->getPrefixedText() );
         $log = new LogPage( 'pdf', false );
         $log->addEntry( 'book', $article->getTitle(), $msg );

         // Initialise PDF variables
         $format  = $wgRequest->getText( 'format' );
         $notitle = $wgRequest->getText( 'notitle' );
         $layout  = $format == 'single' ? '--webpage' : '--firstpage toc';
         $charset = self::setProperty( 'Charset',     'iso-8859-1' );
         $left    = self::setProperty( 'LeftMargin',  '1cm' );
         $right   = self::setProperty( 'RightMargin', '1cm' );
         $top     = self::setProperty( 'TopMargin',   '1cm' );
         $bottom  = self::setProperty( 'BottomMargin','1cm' );
         $font    = self::setProperty( 'Font',        'Arial' );
         $size    = self::setProperty( 'FontSize',    '8' );
         $ls      = self::setProperty( 'LineSpacing', 1 );
         $linkcol = self::setProperty( 'LinkColour',  '217A28' );
         $levels  = self::setProperty( 'TocLevels',   '2' );
         $exclude = self::setProperty( 'Exclude',     array() );
         $width   = self::setProperty( 'Width',       '' );
         $width   = $width ? "--browserwidth $width" : '';
         if( !is_array( $exclude ) ) $exclude = split( '\\s*,\\s*', $exclude );

         // Select articles from members if a category or links in content if not
         if( $format == 'single' ) $articles = array( $title );
         else {
             $articles = array();
             if( $title->getNamespace() == NS_CATEGORY ) {
                 $db     = wfGetDB( DB_SLAVE );
                 $cat    = $db->addQuotes( $title->getDBkey() );
                 $result = $db->select(
                     'categorylinks',
                     'cl_from',
                     "cl_to = $cat",
                     'PdfBook',
                     array( 'ORDER BY' => 'cl_sortkey' )
                 );
                 if( $result instanceof ResultWrapper ) $result = $result->result;
                 while( $row = $db->fetchRow( $result ) ) $articles[] = Title::newFromID( $row[0] );
             }
             else if( $action == 'nspdfbook' ) {
                 $db     = &wfGetDB( DB_SLAVE );
                 $pl     = $db->tableName( 'page' );
                 $ns     = $title->getNamespace();
                 $result = $db->query( "SELECT page_id FROM $pl WHERE page_namespace = $ns ORDER BY page_id" );
                 if( $result instanceof ResultWrapper ) $result = $result->result;
                 while( $row = $db->fetchRow( $result ) ) $articles[] = Title::newFromID( $row[0] );
                 $book = "PDFBook_Namespace_Export-" . MWNamespace::getCanonicalName( $ns );
             }
             else {
                 $text = $article->fetchContent();
                 $text = $wgParser->preprocess( $text, $title, $opt );
                 if( preg_match_all( "/^\\*\\s*\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m", $text, $links ) )
                     foreach( $links[1] as $link ) $articles[] = Title::newFromText( $link );
             }
         }

         // Format the article(s) as a single HTML document with absolute URLs
         // (guarded so the namespace-export name above is not overwritten)
         if( empty( $book ) ) $book = $title->getText();
         $html = '';


--Andy 13:55, 11 April 2008 (UTC)

I've updated this code to work (maybe) with newer versions. —Emufarmers(T|C) 09:36, 20 February 2013 (UTC)

## SpecialVersion Issue and PHP 5.1.4

After installing the PdfBook extension and displaying the version page I get:

 Notice: Object class PdfBook could not be converted to int in ....\SpecialVersion.php on line 275.

The line in SpecialVersion.php is "sort ($list);". I found a general discussion at http://www.webmasterworld.com/php/3586902.htm that talks about 5.1.4 vs 5.2.4. Any thoughts on how to make PdfBook work in PHP 5.1.4? I am using MediaWiki 1.12.0 and PHP 5.1.4.

## Link in to Hierarchy

Any ideas on how best to link into the Hierarchy extension? I think this would be very useful because the hierarchy is set up perfectly for printing a book. I haven't quite figured out how to set this up though. You would have to use the extension's "hierarchy" table to pull information about where you are in the hierarchy, and which subordinate pages you would have to print. I think it would be nice to print from where you are down: if you are on chapter 1, it only prints chapter 1, but if you are at the title page it prints the whole book. It might also be nice to be able to set up a list of pages and then print that list in order. I am going to do what I can, but I am pretty new to PHP, and any advice is welcome. --Greg 16:04, 1 May 2008 (UTC)

## Exclusions

It would also be nice to have exclusion meta tags where you can specify what parts are included in the book and what parts are not (so if you have a header/footer you don't have to include that in the book) --Greg 16:07, 1 May 2008 (UTC)

I have also run into this problem, wanting to include only one section of a page using the [[PageName#Section]] markup to get just that section as part of the composite print. This would be a great feature. --Abby621 14:04, 4 June 2008 (UTC)

After looking through the code, I discovered you can accomplish exclusions by placing the parts of the article you wish not to include inside <div> tags (example: <div class="noprint"> exclude this section </div>).

## latest version gives syntax error

 Parse error: syntax error, unexpected '}' in Pdf_Book.php on line 49

Is this expected?
FWIW, I'm using Ubuntu Hardy Heron with PHP 5.2.4-2ubuntu5.1 with Suhosin-Patch 0.9.6.2 Swaroopch 21:20, 21 June 2008 (UTC)

Sorry about that, fixed --Nad 21:59, 21 June 2008 (UTC)

## "??????" instead of russian letters

We have all "?" signs instead of russian letters. Encoding in browser is UTF-8.

• Change the default charset from iso-8859-1 to cp-1251: $charset = $this->setProperty('Charset', 'cp-1251');
• Replace the php function utf8_decode by another function that can convert utf8 to cp1251; for example, at line 89 of PdfBook.hooks.php: $html .= iconv("utf-8", "windows-1251", "$h1$text\n"); //utf8_decode( "$h1$text\n" );
• If no text is displayed in the pdf, replace the fonts used by htmldoc (/usr/share/htmldoc/fonts) with fonts that have cyrillic support.

--Rius 16:15, 19 June 2009 (UTC)

## How would you modify the script to include the last date and time edited for each article?

I'm not a PHP wiz and am wondering what would be involved to output the last edit date/time for each article? Preferably, I would like to see this info directly under the article title. Any help would be excellent. Great extension! --Paul
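Not a full patch, but the core of it would be formatting each article's last-touched timestamp and appending it under the article's <h1> inside the article loop. MediaWiki stores timestamps in the 14-digit TS_MW format; a runnable sketch of the formatting step follows. The `$title->getTouched()` call mentioned in the comment is an assumption to verify against your MediaWiki version.

```php
<?php
// MediaWiki keeps timestamps as YYYYMMDDHHMMSS (TS_MW). In the article
// loop of PdfBook you would obtain the raw value from something like
// $ts = $title->getTouched() (assumed API; check your MW version) and
// append the formatted date under the article heading before the body.
function formatMwTimestamp($ts) {
    $dt = DateTime::createFromFormat('YmdHis', $ts, new DateTimeZone('UTC'));
    return $dt->format('j F Y, H:i');
}

// e.g. $html .= "<h1>$ttext</h1><p><i>Last edited: " . formatMwTimestamp($ts) . "</i></p>$text\n";
echo formatMwTimestamp('20080611135211'); // 11 June 2008, 13:52
```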

## No images

• I still can't get images in. The image is in the PDF file, and links back to my wiki image, but the picture simply doesn't appear. Help?
• Also, title page is empty. How to fill it?

Here is my Template: Template:Pdf_book

[[Image:15x18-fileicon-pdf.png]] [{{fullurl:{{FULLPAGENAMEE}}|action=pdfbook}} Create a PDF Book]


## Updated bibtex_fields.php

Here is an updated bibtex_fields.php with complete Bibtex Entries and Fields.

bibtex_fields.php

<?php
//taken from http://en.wikipedia.org/wiki/BibTeX
//this file in only used in the creation of a new reference as a template.

$bibtex_fields["article"][]="author"; //mandatory
$bibtex_fields["article"][]="title";
$bibtex_fields["article"][]="journal";
$bibtex_fields["article"][]="year";
$bibtex_fields["article"][]="volume";
$bibtex_fields["article"][]="number";
$bibtex_fields["article"][]="pages";
$bibtex_fields["article"][]="month";
$bibtex_fields["article"][]="note";
$bibtex_fields["article"][]="key";
$bibtex_fields["article"][]="url";
$bibtex_fields["article"][]="keywords";
$bibtex_fields["article"][]="abstract";
$bibtex_fields["book"][]="author"; //mandatory
$bibtex_fields["book"][]="editor";
$bibtex_fields["book"][]="title"; //mandatory
$bibtex_fields["book"][]="publisher";
$bibtex_fields["book"][]="year"; //mandatory
$bibtex_fields["book"][]="volume";
$bibtex_fields["book"][]="number";
$bibtex_fields["book"][]="series";
$bibtex_fields["book"][]="address";
$bibtex_fields["book"][]="edition";
$bibtex_fields["book"][]="month";
$bibtex_fields["book"][]="note";
$bibtex_fields["book"][]="key";
$bibtex_fields["book"][]="url";
$bibtex_fields["book"][]="keywords";
$bibtex_fields["book"][]="abstract";
$bibtex_fields["conference"][]="author";
$bibtex_fields["conference"][]="title";
$bibtex_fields["conference"][]="booktitle";
$bibtex_fields["conference"][]="year";
$bibtex_fields["conference"][]="editor";
$bibtex_fields["conference"][]="pages";
$bibtex_fields["conference"][]="organization";
$bibtex_fields["conference"][]="publisher";
$bibtex_fields["conference"][]="address";
$bibtex_fields["conference"][]="month";
$bibtex_fields["conference"][]="note";
$bibtex_fields["conference"][]="key";
$bibtex_fields["conference"][]="url";
$bibtex_fields["conference"][]="keywords";
$bibtex_fields["conference"][]="abstract";
$bibtex_fields["inbook"][]="author";
$bibtex_fields["inbook"][]="editor";
$bibtex_fields["inbook"][]="title";
$bibtex_fields["inbook"][]="chapter";
$bibtex_fields["inbook"][]="pages";
$bibtex_fields["inbook"][]="publisher";
$bibtex_fields["inbook"][]="year";
$bibtex_fields["inbook"][]="volume";
$bibtex_fields["inbook"][]="number";
$bibtex_fields["inbook"][]="series";
$bibtex_fields["inbook"][]="type";
$bibtex_fields["inbook"][]="address";
$bibtex_fields["inbook"][]="edition";
$bibtex_fields["inbook"][]="month";
$bibtex_fields["inbook"][]="note";
$bibtex_fields["inbook"][]="key";
$bibtex_fields["inbook"][]="url";
$bibtex_fields["inbook"][]="keywords";
$bibtex_fields["inbook"][]="abstract";
$bibtex_fields["incollection"][]="author";
$bibtex_fields["incollection"][]="title";
$bibtex_fields["incollection"][]="booktitle";
$bibtex_fields["incollection"][]="publisher";
$bibtex_fields["incollection"][]="year";
$bibtex_fields["incollection"][]="editor";
$bibtex_fields["incollection"][]="volume";
$bibtex_fields["incollection"][]="number";
$bibtex_fields["incollection"][]="series";
$bibtex_fields["incollection"][]="type";
$bibtex_fields["incollection"][]="chapter";
$bibtex_fields["incollection"][]="pages";
$bibtex_fields["incollection"][]="address";
$bibtex_fields["incollection"][]="edition";
$bibtex_fields["incollection"][]="month";
$bibtex_fields["incollection"][]="note";
$bibtex_fields["incollection"][]="key";
$bibtex_fields["incollection"][]="url";
$bibtex_fields["incollection"][]="keywords";
$bibtex_fields["incollection"][]="abstract";
$bibtex_fields["inproceedings"][]="author";
$bibtex_fields["inproceedings"][]="title";
$bibtex_fields["inproceedings"][]="booktitle";
$bibtex_fields["inproceedings"][]="year";
$bibtex_fields["inproceedings"][]="editor";
$bibtex_fields["inproceedings"][]="volume";
$bibtex_fields["inproceedings"][]="number";
$bibtex_fields["inproceedings"][]="series";
$bibtex_fields["inproceedings"][]="pages";
$bibtex_fields["inproceedings"][]="address";
$bibtex_fields["inproceedings"][]="month";
$bibtex_fields["inproceedings"][]="organization";
$bibtex_fields["inproceedings"][]="publisher";
$bibtex_fields["inproceedings"][]="note";
$bibtex_fields["inproceedings"][]="key";
$bibtex_fields["inproceedings"][]="url";
$bibtex_fields["inproceedings"][]="keywords";
$bibtex_fields["inproceedings"][]="abstract";
$bibtex_fields["manual"][]="title";
$bibtex_fields["manual"][]="author";
$bibtex_fields["manual"][]="organization";
$bibtex_fields["manual"][]="address";
$bibtex_fields["manual"][]="edition";
$bibtex_fields["manual"][]="month";
$bibtex_fields["manual"][]="year";
$bibtex_fields["manual"][]="note";
$bibtex_fields["manual"][]="key";
$bibtex_fields["manual"][]="url";
$bibtex_fields["manual"][]="keywords";
$bibtex_fields["manual"][]="abstract";
$bibtex_fields["mastersthesis"][]="author";
$bibtex_fields["mastersthesis"][]="title";
$bibtex_fields["mastersthesis"][]="school";
$bibtex_fields["mastersthesis"][]="year";
$bibtex_fields["mastersthesis"][]="type";
$bibtex_fields["mastersthesis"][]="address";
$bibtex_fields["mastersthesis"][]="month";
$bibtex_fields["mastersthesis"][]="note";
$bibtex_fields["mastersthesis"][]="key";
$bibtex_fields["mastersthesis"][]="url";
$bibtex_fields["mastersthesis"][]="keywords";
$bibtex_fields["mastersthesis"][]="abstract";
$bibtex_fields["misc"][]="author";
$bibtex_fields["misc"][]="title";
$bibtex_fields["misc"][]="howpublished";
$bibtex_fields["misc"][]="month";
$bibtex_fields["misc"][]="year";
$bibtex_fields["misc"][]="note";
$bibtex_fields["misc"][]="key";
$bibtex_fields["misc"][]="url";
$bibtex_fields["misc"][]="keywords";
$bibtex_fields["misc"][]="abstract";
$bibtex_fields["phdthesis"][]="author";
$bibtex_fields["phdthesis"][]="title";
$bibtex_fields["phdthesis"][]="school";
$bibtex_fields["phdthesis"][]="year";
$bibtex_fields["phdthesis"][]="type";
$bibtex_fields["phdthesis"][]="address";
$bibtex_fields["phdthesis"][]="month";
$bibtex_fields["phdthesis"][]="note";
$bibtex_fields["phdthesis"][]="key";
$bibtex_fields["phdthesis"][]="url";
$bibtex_fields["phdthesis"][]="keywords";
$bibtex_fields["phdthesis"][]="abstract";
$bibtex_fields["proceedings"][]="title";
$bibtex_fields["proceedings"][]="year";
$bibtex_fields["proceedings"][]="editor";
$bibtex_fields["proceedings"][]="volume";
$bibtex_fields["proceedings"][]="number";
$bibtex_fields["proceedings"][]="series";
$bibtex_fields["proceedings"][]="address";
$bibtex_fields["proceedings"][]="month";
$bibtex_fields["proceedings"][]="organization";
$bibtex_fields["proceedings"][]="publisher";
$bibtex_fields["proceedings"][]="note";
$bibtex_fields["proceedings"][]="key";
$bibtex_fields["proceedings"][]="url";
$bibtex_fields["proceedings"][]="keywords";
$bibtex_fields["proceedings"][]="abstract";
$bibtex_fields["techreport"][]="author";
$bibtex_fields["techreport"][]="title";
$bibtex_fields["techreport"][]="institution";
$bibtex_fields["techreport"][]="year";
$bibtex_fields["techreport"][]="type";
$bibtex_fields["techreport"][]="number";
$bibtex_fields["techreport"][]="address";
$bibtex_fields["techreport"][]="month";
$bibtex_fields["techreport"][]="note";
$bibtex_fields["techreport"][]="key";
$bibtex_fields["techreport"][]="url";
$bibtex_fields["techreport"][]="keywords";
$bibtex_fields["techreport"][]="abstract";
$bibtex_fields["unpublished"][]="author";
$bibtex_fields["unpublished"][]="title";
$bibtex_fields["unpublished"][]="note";
$bibtex_fields["unpublished"][]="month";
$bibtex_fields["unpublished"][]="year";
$bibtex_fields["unpublished"][]="key";
$bibtex_fields["unpublished"][]="url";
$bibtex_fields["unpublished"][]="keywords";
$bibtex_fields["unpublished"][]="abstract";
// ?>

## Bibtex Required/Optional - for your wiki

• Latex defines three types of fields:
• Required - always displayed
• Optional - usually not used
• Ignored - never used, can be arbitrary

 @article{citation_key,
   author = {}, title = {}, journal = {}, year = {},
   volume = {}, number = {}, pages = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @book{citation_key,
   author = {}, editor = {}, % author OR editor required
   title = {}, publisher = {}, year = {},
   volume = {}, number = {}, % volume OR number
   series = {}, address = {}, edition = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @conference{citation_key,
   author = {}, title = {}, booktitle = {}, year = {},
   editor = {}, pages = {}, organization = {}, publisher = {},
   address = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @inbook{citation_key,
   author = {}, editor = {}, % author OR editor
   title = {}, chapter = {}, pages = {}, % chapter AND/OR pages
   publisher = {}, year = {},
   volume = {}, number = {}, % volume OR number
   series = {}, type = {}, address = {}, edition = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @incollection{citation_key,
   author = {}, title = {},
   booktitle = {}, % booktitle should be exactly the same as title? Not sure.
   publisher = {}, year = {}, editor = {},
   volume = {}, number = {}, % volume OR number
   series = {}, type = {}, chapter = {}, pages = {},
   address = {}, edition = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @inproceedings{citation_key,
   author = {}, title = {},
   booktitle = {}, % booktitle should be exactly the same as title? Some kind of bug? Not sure.
   year = {}, editor = {},
   volume = {}, number = {}, % volume OR number
   series = {}, pages = {}, address = {}, month = {},
   organization = {}, publisher = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @manual{citation_key,
   title = {}, author = {}, organization = {}, address = {},
   edition = {}, month = {}, year = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @mastersthesis{citation_key,
   author = {}, title = {}, school = {}, year = {},
   type = {}, address = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @misc{citation_key,
   author = {}, title = {}, howpublished = {},
   month = {}, year = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @phdthesis{citation_key,
   author = {}, title = {}, school = {}, year = {},
   type = {}, address = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @proceedings{citation_key,
   title = {}, year = {}, editor = {},
   volume = {}, number = {}, % volume OR number
   series = {}, address = {}, month = {},
   organization = {}, publisher = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @techreport{citation_key,
   author = {}, title = {}, institution = {}, year = {},
   type = {}, number = {}, address = {}, month = {}, note = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

 @unpublished{citation_key,
   author = {}, title = {}, note = {}, month = {}, year = {},
   key = {}, url = {}, keywords = {}, abstract = {} }

## Types of Bibtex entries - for your wiki

• There are 14 available entry types.

• article - An article from a journal or magazine.
• book - A book with an explicit publisher.
• inbook - A part of a book, usually untitled; may be a chapter and/or a range of pages. Use with book to reference a set of pages.
• incollection - A part of a book with its own title.
• booklet - A work that is printed and bound, but without a named publisher or sponsoring institution.
• conference - DO NOT USE! Included for compatibility.
• manual - Technical documentation.
• mastersthesis - A Master's thesis.
• misc - Use this type when nothing else seems appropriate.
• phdthesis - A PhD thesis.
• proceedings - The proceedings of a conference.
• inproceedings - An article in the proceedings of a conference. Use with proceedings to reference a sub-paper or sub-section.
• techreport - A report published by a school or other institution, usually numbered within a series.
• unpublished - A document with an author and title, but not formally published.

## Bibtex - standard fields - for your wiki

• The available fields depend on which entry type is being used. Each entry type has required and optional arguments.

• address - Usually the address of the publisher or institution. For major publishing houses, omit it entirely or just give the city. For small publishers, you can help the reader by giving the complete address.
• annote - An annotation. It is not used by the standard bibliography styles, but may be used by other styles that produce an annotated bibliography.
• author - The name(s) of the author(s). Separate the names with 'and' (no quotes). If there are many names, list the prominent ones and the last one as 'et al' (no quotes). Most names can be entered "Last, First MI" or "First MI Last". Last names with two capitalized words need to be in the "Last1 Last2, First MI" format. If there is a Jr. in the name, use "Last, Jr., First". Accented letters should be enclosed in braces, {}. For example, "Kurt G{\"{o}}del".
• booktitle - The title of a book, a titled part of which is being cited. It is used only for the Incollection and Inproceedings entry types; use the title field for book entries.
• chapter - A chapter (or other sectional unit) number.
• crossref - The database key of the entry being cross-referenced.
• edition - The edition of a book -- for example, "Second". (The style will convert to lowercase if needed.)
• editor - The name(s) of editor(s), typed as indicated above. If there is also an author field, then the editor field gives the editor of the book or collection in which the reference appears.
• howpublished - How something strange was published.
• institution - The sponsoring institution of a technical report.
• journal - A journal name. Abbreviations may exist; see the Local Guide.
• key - Used for alphabetizing and creating a label when the author and editor fields are missing. This field should not be confused with the key that appears in the \cite{} command and at the beginning of the entry.
• month - The month in which the work was published or, for an unpublished work, in which it was written. Use the standard three-letter abbreviations.
• note - Any additional information that can help the reader. The first word should be capitalized.
• number - The number of a journal, magazine, technical paper, or work in a series. An issue of a journal or magazine is usually identified by its volume and number; the organization that issues a technical report usually gives it a number; books in a named series are sometimes numbered.
• organization - The organization that sponsors a conference or that publishes a manual.
• pages - One or more page numbers or ranges of numbers, such as 42--111 or 7,41,73--97.
• publisher - The publisher's name.
• school - The name of the school where a thesis was written.
• series - The name of a series or set of books. When citing an entire book, the title field gives its title and the optional series field gives the name of a series or multivolume set in which the book was published.
• title - The work's title. The bibliography style determines whether or not a title is capitalized; the titles of books usually are, titles of articles usually not. Always type the title as if it were capitalized. Always capitalize the first word of the title, the first word after a colon, and all other words except articles and unstressed conjunctions (and, or, if) and prepositions. BIBTEX will change case as needed. If BIBTEX should not change an uppercase to lowercase, then enclose it in braces {}. Example: "Out of {Africa}" and "Out of {A}frica" are equivalent.
• type - The type of a technical report - for example, "Research Note". It is also used to specify a type of sectional unit in an inbook or incollection entry and a different type of thesis in a mastersthesis or phdthesis entry.
• year - The year of publication or, for an unpublished work, the year it was written. It usually consists only of numerals, such as 1984, but could also be something like circa 1066.

## Bibtex Nonstandard / Optional Fields - for your wiki

• The available fields depend on which entry type is being used. Each entry type has required and optional arguments.

• affiliation - The author's affiliation.
• abstract - An abstract of the work.
• contents - A Table of Contents.
• copyright - Copyright information.
• ISBN - The International Standard Book Number.
• ISSN - The International Standard Serial Number. Used to identify a journal.
• keywords - Key words used for searching or possibly for annotation.
• language - The language the document is in.
• location - A location associated with the entry, such as the city in which a conference took place.
• LCCN - The Library of Congress Call Number.
• mrnumber - The Mathematical Reviews number.
• URL - The WWW Universal Resource Locator that points to the item being referenced. This often is used for technical reports to point to the ftp site where the postscript source of the report is located.

## Get rid of temporary files

Using proc_open (read and write pipes connected to the htmldoc process) you can get rid of temporary files. This also fixes a variable conflict ($link): jhoetzel

<?php
# Extension:PdfBook
# - Licenced under LGPL (http://www.gnu.org/copyleft/lesser.html)
# - Started: 2007-08-08

if (!defined('MEDIAWIKI')) die('Not an entry point.');

define('PDFBOOK_VERSION','0.0.12, 2008-06-22');

$wgPdfBookMagic                = "book";
$wgExtensionFunctions[]        = 'wfSetupPdfBook';
$wgHooks['LanguageGetMagic'][] = 'wfPdfBookLanguageGetMagic';

$wgExtensionCredits['parserhook'][] = array(
	'name'        => 'Pdf Book',
	'description' => 'Composes a book from articles in a category and exports as a PDF book',
	'url'         => 'http://www.mediawiki.org/wiki/Extension:Pdf_Book',
	'version'     => PDFBOOK_VERSION
);

class PdfBook {

	# Constructor
	function PdfBook() {
		global $wgHooks, $wgParser, $wgPdfBookMagic;
		$wgParser->setFunctionHook($wgPdfBookMagic, array($this, 'magicBook'));
		$wgHooks['UnknownAction'][] = $this;

		# Add a new pdf log type
		global $wgLogTypes, $wgLogNames, $wgLogHeaders, $wgLogActions;
		$wgLogTypes[]             = 'pdf';
		$wgLogNames  ['pdf']      = 'pdflogpage';
		$wgLogHeaders['pdf']      = 'pdflogpagetext';
		$wgLogActions['pdf/book'] = 'pdflogentry';
	}

	# Expand the book-magic
	function magicBook(&$parser) {
		# Populate $argv with both named and numeric parameters
		$argv = array();
		foreach (func_get_args() as $arg) if (!is_object($arg)) {
			if (preg_match('/^(.+?)\\s*=\\s*(.+)$/', $arg, $match)) $argv[$match[1]] = $match[2];
			else $argv[] = $arg;
		}
		return $text;
	}

	function onUnknownAction($action, $article) {
		global $wgOut, $wgUser, $wgTitle, $wgParser;
		global $wgServer, $wgArticlePath, $wgScriptPath, $wgUploadPath, $wgUploadDirectory, $wgScript;

		if ($action == 'pdfbook') {

			# Log the export
			$msg = $wgUser->getUserPage()->getPrefixedText() . ' exported as a PDF book';
			$log = new LogPage('pdf', false);
			$log->addEntry('book', $wgTitle, $msg);

			# Initialise PDF variables
			$layout  = '--firstpage toc';
			$left    = $this->setProperty('LeftMargin',  '1cm');
			$right   = $this->setProperty('RightMargin', '1cm');
			$top     = $this->setProperty('TopMargin',   '1cm');
			$bottom  = $this->setProperty('BottomMargin','1cm');
			$font    = $this->setProperty('Font',        'Arial');
			$size    = $this->setProperty('FontSize',    '8');
			$linkc   = $this->setProperty('LinkColour',  '217A28');
			$levels  = $this->setProperty('TocLevels',   '2');
			$exclude = $this->setProperty('Exclude',     array());
			if (!is_array($exclude)) $exclude = split('\\s*,\\s*', $exclude);

			# Select articles from members if a category or links in content if not
			$articles = array();
			$title    = $article->getTitle();
			$opt      = ParserOptions::newFromUser($wgUser);
			if ($title->getNamespace() == NS_CATEGORY) {
				$db     = &wfGetDB(DB_SLAVE);
				$cat    = $db->addQuotes($title->getDBkey());
				$result = $db->select( 'categorylinks', 'cl_from', "cl_to = $cat",
					'PdfBook',
					array('ORDER BY' => 'cl_sortkey')
				);
				if ($result instanceof ResultWrapper) $result = $result->result;
				while ($row = $db->fetchRow($result)) $articles[] = Title::newFromID($row[0]);
			}
			else {
				$text = $article->fetchContent();
				$text = $wgParser->preprocess($text, $title, $opt);
				if (preg_match_all('/^\\*\\s*\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m', $text, $links))
					foreach ($links[1] as $link) $articles[] = Title::newFromText($link);
			}

			# Format the article(s) as a single HTML document with absolute URL's
			$book          = $title->getText();
			$html          = '';
			$wgArticlePath = $wgServer . $wgArticlePath;
			$wgScriptPath  = $wgServer . $wgScriptPath;
			$wgUploadPath  = $wgServer . $wgUploadPath;
			$wgScript      = $wgServer . $wgScript;
			foreach ($articles as $title) {
				$ttext = $title->getPrefixedText();
				if (!in_array($ttext, $exclude)) {
					$article = new Article($title);
					$text    = $article->fetchContent();
					$text    = preg_replace('/<!--([^@]+?)-->/s', '@@'.'@@$1@@'.'@@', $text); # preserve HTML comments
					$text   .= '__NOTOC__';
					$opt->setEditSection(false);  # remove section-edit links
					$wgOut->setHTMLTitle($ttext); # use this so DISPLAYTITLE magic works
					$out     = $wgParser->parse($text, $title, $opt, true, true);
					$ttext   = $wgOut->getHTMLTitle();
					$text    = $out->getText();
					$text    = preg_replace('|(<img[^>]+?src=")(/.+?>)|', "$1$wgServer$2", $text);
					$text    = preg_replace('|@{4}([^@]+?)@{4}|s', '<!--$1-->', $text); # HTML comments hack
					$text    = preg_replace('|<table|', '<table border borderwidth=2 cellpadding=3 cellspacing=0', $text);
					$ttext   = basename($ttext);
					$html   .= utf8_decode("<h1>$ttext</h1>$text\n");
				}
			}

			# If format=html in query-string, return html content directly
			if (isset($_REQUEST['format']) && $_REQUEST['format'] == 'html') {
				$wgOut->disable();
				header("Content-Disposition: attachment; filename=\"$book.html\"");
				print $html;
			}
			else {
				# Send the file to the client via htmldoc converter
				$wgOut->disable();
				header("Content-Type: application/pdf");
				header("Content-Disposition: attachment; filename=\"$book.pdf\"");
				$cmd  = "--left $left --right $right --top $top --bottom $bottom";
				$cmd .= " --header ... --footer .1. --headfootsize 8 --quiet --jpeg --color";
				$cmd .= " --bodyfont $font --fontsize $size --linkstyle plain --linkcolor $linkc";
				$cmd .= " --toclevels $levels --format pdf14 --numbered $layout";
				$cmd  = "htmldoc -t pdf --charset iso-8859-1 $cmd -";
				putenv("HTMLDOC_NOCGI=1");
				$process = proc_open("$cmd", array(0 => array("pipe", "r"), 1 => array("pipe", "w")), $pipes);
				fwrite($pipes[0], $html);
				fclose($pipes[0]);
				fpassthru($pipes[1]);
				fclose($pipes[1]);
				proc_close($process);
			}
			return false;
		}

		return true;
	}

	# Return a property for htmldoc using global, request or passed default
	function setProperty($name, $default) {
		if (isset($_REQUEST["pdf$name"]))      return $_REQUEST["pdf$name"];
		if (isset($GLOBALS["wgPdfBook$name"])) return $GLOBALS["wgPdfBook$name"];
		return $default;
	}

	# Needed in some versions to prevent Special:Version from breaking
	function __toString() { return 'PdfBook'; }
}

# Called from $wgExtensionFunctions array when initialising extensions
function wfSetupPdfBook() {
	global $wgPdfBook;
	$wgPdfBook = new PdfBook();
}

# Needed in MediaWiki >1.8.0 for magic word hooks to work properly
function wfPdfBookLanguageGetMagic(&$magicWords, $langCode = 0) {
	global $wgPdfBookMagic;
	$magicWords[$wgPdfBookMagic] = array(0, $wgPdfBookMagic);
	return true;
}
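The setProperty() resolution order above (request parameter first, then a wgPdfBook* global, then the passed default) is what lets individual exports override the wiki-wide settings via the query string. A runnable sketch of the same logic, rewritten outside MediaWiki so the request and globals arrays are passed explicitly (`resolveProperty` is a name invented here for illustration):

```php
<?php
// Mirrors PdfBook::setProperty(): a "pdf"-prefixed request parameter wins,
// then a "wgPdfBook"-prefixed global, then the supplied default.
function resolveProperty($name, $default, $request, $globals) {
    if (isset($request["pdf$name"]))       return $request["pdf$name"];
    if (isset($globals["wgPdfBook$name"])) return $globals["wgPdfBook$name"];
    return $default;
}

// e.g. ...&action=pdfbook&pdfFontSize=10 overrides both the global and the default
echo resolveProperty('FontSize', '8', array('pdfFontSize' => '10'), array('wgPdfBookFontSize' => '9')); // 10
echo "\n";
echo resolveProperty('Font', 'Arial', array(), array()); // Arial
```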



## Unset variables and blank pdfs...

Ok, I know that this is the third post on this subject, but I am still having problems and haven't had much success in debugging the problem. I am on PDFBook version 1.0.0, MediaWiki 1.10.1

We are trying to use pdfbook on our company website. We use drupal for authentication (I don't know its effects on MediaWiki).

I was getting blank pdf and html documents. I see you pull several global variables on lines 70-71  

 global $wgOut, $wgUser, $wgTitle, $wgParser;
 global $wgServer, $wgArticlePath, $wgScriptPath, $wgUploadPath, $wgUploadDirectory, $wgScript;

I checked the variables and found that all of them are blank except: wgServer, wgArticlePath, wgScriptPath.

I could not find the others in the entire $GLOBALS variable... I'm not SUPER familiar with MediaWiki's structure and backend, but I would imagine that many of those (especially$wgUser) should be set.

Any ideas?
--Greg 18:23, 22 October 2008 (UTC)

## hiding numbering on headings and article title

great extension! is it possible to hide the numbered headings when printing as a book? i noticed that the extension disregards the user preference and __NONUMBEREDHEADINGS__. we are trying to PDF print a "book" of data entry forms and the heading numbers are not required. thanks --Erikvw 06:23, 18 November 2008 (UTC)

What i have done for now is to remove --numbered from the line

$cmd .= "$toc --format pdf14 --numbered $layout$width";


which seems to work fine.

## revision id does not appear on pdf

we are tracking revision information for the printed document using

{{REVISIONID}}-{{REVISIONTIMESTAMP}}


When printing the PDF, the REVISIONTIMESTAMP prints but REVISIONID does not. I noticed the same for Pdf_Export. Any ideas? thanks ----Erikvw 04:24, 19 November 2008 (UTC)

## some special characters and german umlauts result in empty pdf files

When we try to export categories with umlauts (e.g. "Übersicht") or special characters like "-" in the category name, the generated pdf file is empty. Everything else runs really fine and smooth. Great extension.
Any workaround or help regarding this problem would be appreciated. --Fydel 12:29, 15 December 2008 (UTC)

I found a simple workaround for that issue. I changed the line where htmldoc is called from
escapeshellcmd($cmd);  to passthru(escapeshellcmd($cmd));

--Fydel 09:10, 9 January 2009 (UTC)
Hello, I have the same problem. An umlaut in the middle of a heading is rendered correctly, but at the beginning of a heading the umlaut is not visible; there is nothing.
--141.35.213.221 09:24, 19 August 2011 (UTC)

## Page Limit in PdfBook

How many pages can be fetched using Extension:PdfBook? Is there any limit on that?

I found a copy that is hosted at sourceforge as part of the install for Flowchartwiki and it has extras like the checkPDFbook stuff.

Is there somewhere to get the latest version of the whole thing, not just the pdfbook.php file?

Cheers.

## ASHighlight

The bug with ASHighlight is probably to do with the way that ASHighlight embeds the 'highlight' function's CSS stylesheet output. It's a while since I've done anything with ASHighlight, but I remember this part of it being a bit hacky. Hope this helps. Jdpipe 07:27, 24 March 2009 (UTC)

One way to provide for "compatibility" between the two extensions is to "drop" <style> ... </style> tags before feeding the html to the htmldoc command. Could lead to some unexpected results ... but IWFM :

• In PdfBook.php, there is a main loop for # Format the article(s) as a single HTML document with absolute URL's
• just add there the following line among the other existing preg_replace control lines
$text = preg_replace( '|<style(.+?)</style>|s', ' <!-- <nostyle/> -->',$text );                  # Style CSS hack


--Eric Salomé (@ctx.net) 22:49, 30 August 2010 (UTC)
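The hack can be tried outside the wiki. A minimal runnable sketch of the same substitution, using a made-up HTML sample for illustration:

```php
<?php
// Same regex as the hack above: replace each <style>...</style> block with
// a placeholder comment (the "s" modifier lets "." match across newlines).
// The input HTML here is an invented sample for demonstration only.
$html = '<style type="text/css">h1 { color: red; }</style><h1>Title</h1><p>Body</p>';
$html = preg_replace('|<style(.+?)</style>|s', ' <!-- <nostyle/> -->', $html);
echo $html;
```

After this runs, no <style> block remains, so htmldoc never sees the embedded CSS that confuses it.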

## No such extension "PdfBook"

Try to download it and get "No such extension "PdfBook" ". How can I get this extension? --Robinson Weijman 10:39, 19 June 2009 (UTC)

You can download it from Subversion --Rius 14:20, 19 June 2009 (UTC)

Thanks for the tip. I cannot find it - do you have a link? --Robinson Weijman 09:45, 22 June 2009 (UTC)
I'm sorry, there is a link on this article page. --Robinson Weijman 09:48, 22 June 2009 (UTC)

"First Htmldoc needs to be installed [...].  Windows Binary can be found
here (v1.8.24) [...]."


Hi, I updated the link. Cheers --kgh 19:12, 5 December 2009 (UTC)

## Italian charset

I have a wiki in Italian and I have tried many charsets, but I just can't find the correct one. The apostrophe gets turned into a question mark any time the pdf gets rendered: Modo D'Uso becomes Modo D?Uso. I also have a wiki in English, and when I type can't it comes out can?t. It's not a special character, it's an apostrophe. I know it's something stupid, but I don't know which charset to use or a workaround. I usually use iso8859-1 with no problems. I just can't wrap my head around it. Thank you for your help in advance.

I believe I have resolved the problem. I changed my charset to utf-8 and, where the apostrophe is, I replace it with & acute ; (but with no spacing between the characters). It now comes out as an apostrophe every time.

## Disable printing of links

I created a parameter that can be defined in the wiki settings that will disable the printing of links when creating PDF Books. The parameter is $wgPDFBookIgnoreLink. By default it is set to false.

 34,35c34,35
 <
 < 	function PdfBook() {
 ---
 > 	var $ignoreLinks = false;
 > 	function PdfBook($ignoreLinks = false) {
 44a45,46
 >
 > 		$this->ignoreLinks = $ignoreLinks;
 53c55,56
 <
 ---
 > 		global $wgPDFBookIgnoreLink;
 >
 128a132,135
 > 					if ($this->ignoreLinks) {
 > 						$text    = str_ireplace('<a', '<span', $text);
 > 						$text    = str_ireplace('</a>', '</span>', $text);
 > 					}
 193c200,201
 < 	$wgPdfBook = new PdfBook();
 ---
 > 	global $wgPDFBookIgnoreLinks;
 > 	$wgPdfBook = new PdfBook($wgPDFBookIgnoreLinks);

J.saterfiel 14:51, 16 September 2009 (UTC)

The links (other than the TOC) in my PDFs refer back to the wiki, not to the location in the book. I want to preserve these internal links for online users. Is there a way to format the links to do that? User:Dlpetry:DlPetry 05:41, 09 September 2011 (UTC)

## Permission denied

I upgraded from mediawiki 1.12 to 1.15 and now I get invalid pdf files with this error inside when I open it with an editor (the upload directory is set to images):

 Warning: fopen(/home/.../mediawiki/images/pdf-book4aba1672a5b66) [<a href='function.fopen'>function.fopen</a>]: failed to open stream: Permission denied in /home/.../mediawiki/extensions/PdfBook/PdfBook.php on line 146
 Warning: fwrite(): supplied argument is not a valid stream resource in /home/.../mediawiki/extensions/PdfBook/PdfBook.php on line 147
 Warning: fclose(): supplied argument is not a valid stream resource in /home/.../mediawiki/extensions/PdfBook/PdfBook.php on line 148

I set both the contents of images and the PdfBook folder as executable with chmod 755, and they both have the same owner (root). Laquestianne, 23. September

## No Images

I have version 1.14 and since I upgraded from 1.12 I have no images anymore in PDFs. I already tried chmod 777 on ./images but it didn't help. Any help on this?

## Still broken?

I see a lot of people have problems with this extension, and I am one of them. Has anyone gotten a pdf that is longer than 3 bytes using MW 1.15 and PdfBook (Version 1.0.3, 2008-12-09)? I see no php errors in my error log.

## Problems with a few categories

I have the problem that pdfbook can't create pdfs from all categories. For a few categories it works without problems, and other categories don't work. For example, I have a category Server; if I want to create a pdf of that category, only a blank browser page opens.
I don't know where the problem is. The categories are not very big (about 20 pages), there are no special characters in the category title, and it's not a problem with htmldoc. For a test I created a new category with the name Servers and put the content of Server into it. The creation of a pdf for Servers works fine. So it seems to me that it's a problem with the name of the category and not with the content. Thank you for help!

## Empty Pdf, PDFBook Problem

Hi, I am trying to use MediaWiki with the pdfbook extension. I have put the extension into the extensions folder and included it in LocalSettings.php. I have installed htmldoc as well. I am using IIS 5.1. When I add &action=pdfbook to the URL it creates only an empty pdf. What can be wrong? I have only installed htmldoc; should I make more settings for it? Br, Zsolt

--Nwessel 08:57, 19 October 2010 (UTC) Please notice that PDFBook works with &action=pdfbook only on categories. When you want to create a pdf for a single page you need to add "&format=single".

I am also having the same issue. MediaWiki 1.29 on Windows IIS and MySQL. The pdf downloads as a 0-byte blank file. I have tried adding &format=single but I continue to get the same result. Is there something that has to be configured with HTMLDOC that I am missing?

## Valid fonts

Can anyone give me the complete list of fonts supported for the $wgPdfBookFont setting? Is this dependent on htmldoc or on the system fonts?

I checked the fonts that htmldoc supports, and their FAQ said:

HTMLDOC 1.8.20 and higher support embedding of the base Type 1 fonts: Courier, Helvetica, Symbol, and Times. HTMLDOC does not currently allow embedding of arbitrary fonts specified by the HTML FONT element.

But then I see that the default setting for $wgPdfBookFont is Arial, so I'm confused. I'm running MediaWiki on Ubuntu and have installed the ttf-mscorefonts-installer package; however, when I set $wgPdfBookFont to "Verdana" I get Times instead :(
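Given htmldoc's limited base-font support, a defensive option is to normalise whatever $wgPdfBookFont is set to onto a face htmldoc actually understands before it goes on the command line. This is only a sketch of a workaround, not part of the extension; the function name and the alias table are my own assumptions.

```php
<?php
// Sketch: map a configured font name onto one of htmldoc's standard
// families so an unsupported name like "Verdana" doesn't silently fall
// back to Times. The mapping below is an assumption, not htmldoc's own.
function pdfBookNormaliseFont( $font ) {
	$supported = array( 'Courier', 'Helvetica', 'Times', 'Symbol',
	                    'Monospace', 'Sans', 'Serif' );
	$aliases   = array( 'arial' => 'Helvetica', 'verdana' => 'Helvetica',
	                    'courier new' => 'Courier', 'times new roman' => 'Times' );
	$key = strtolower( trim( $font ) );
	if ( isset( $aliases[$key] ) ) return $aliases[$key];
	foreach ( $supported as $s ) {
		if ( strcasecmp( $s, $font ) == 0 ) return $s;
	}
	return 'Helvetica'; // fall back to a known sans-serif face
}

echo pdfBookNormaliseFont( 'Verdana' ), "\n"; // Helvetica
```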

## Altered version with new features

Some people have complained about not being able to use the diff code I provided in earlier notes. Because this project isn't being updated, I've put full docs on my user page (J.saterfiel) for the altered version. Here is the PdfBook.php I use on my MediaWiki installation (1.14). It's a little more advanced than the one currently available.

List of new features:

• Ability to remove links in the documents
• Ability for a printed Category (collection of articles) to have a cover page with its Category Name and date created printed on it
• Ability to have a "Download as PDF" link in the tool bar on any page without needing to explicitly place a link on a page you want to create pdfs on.
• Ability to change the date format used on the header page (http://us3.php.net/manual/en/function.date.php)
• Ability to change the information printed in each page header and footer (you will need to consult htmldoc at http://www.htmldoc.org/ for more info on the options; once it is installed, run htmldoc --help, since the full set of options is not listed on their website)
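The first feature in the list, removing links, comes down to rewriting anchors as spans before the HTML reaches htmldoc, as in the str_ireplace diff near the top of this page. A standalone sketch (the function wrapper is mine):

```php
<?php
// Sketch: strip hyperlinks by rewriting <a> elements as <span> elements,
// the same str_ireplace approach as the ignore-links diff above.
// Caveat: the naive '<a' match also rewrites tags such as <abbr>.
function pdfBookStripLinks( $html ) {
	$html = str_ireplace( '<a', '<span', $html );
	$html = str_ireplace( '</a>', '</span>', $html );
	return $html;
}

echo pdfBookStripLinks( '<a href="/wiki/Foo">Foo</a>' ), "\n"; // <span href="/wiki/Foo">Foo</span>
```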

--J.saterfiel 15:05, 27 May 2010 (UTC)

## Broken with MW 1.16

I recently upgraded from MW 1.11 to 1.16 and I'm having some trouble with this extension. The issue is that PDFs of categories do not work properly. The PDF is created fine, but it uses the name of the first page in the category for each entry in the PDF, unless you put a Heading 1 on the page. If you do put a Heading 1 on the page, it creates a page between each page with the name of the first page in the category.

Expected behavior (worked in MW 1.11), Fruit.pdf (Fruit is the name of the category):

• Apples
• Bananas
• Cantaloupe

Actual behavior, Fruit.pdf:

• Apples
• (stuff about apples, first page in the category)
• Apples
• (stuff about bananas, since bananas doesn't have a heading 1 saying bananas at the top of the page)
• Apples
• (completely blank page, since cantaloupe page below has a heading 1)
• Cantaloupe

Can anybody help?

This worked for me

I commented out line 129:

//$ttext =$wgOut->getHTMLTitle();

That worked, thanks!

For me it was line 124.

## Math rendering is very ugly

In pdfs, I'm getting some ugly rendering of mathematical expressions. Symbols are abnormally large, 3 times larger than normal text and resolution is poor. Is this normal? Is Pdfbook causing this? In the wiki they are rendered fine, it's just in pdfs that the problem occurs. For example, try:

$\langle T,\mu \rangle$

Running Pdfbook Version 1.0.4, 2010-01-05 Pgr94 08:39, 12 September 2010 (UTC)

## Nothing happens when action url entered

I am running MediaWiki 1.16.0 and PdfBook 1.0.4 and have htmldoc installed on the server. I installed the extension in the extensions directory and added the require line to LocalSettings.php, but when I enter the URL mywiki.com/wiki/index.php/Category:Software_Documentation&action=pdfbook, nothing happens except the stock wiki message "There is currently no text in this page." I tried a different browser, checked the Apache error logs, and made sure the /images directory is writable by the web server. None of this gave me any errors or a different response. Can someone please give me a push in the right direction here? Thanks, Melissa

That's not the right URL. Take another look at the instructions and follow the syntax there more closely. —Emufarmers(T|C) 23:23, 3 February 2011 (UTC)

You were right...
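For anyone hitting the same wall: the action has to arrive as a query-string parameter, which short /wiki/Page URLs don't deliver. A sketch of building a valid export URL (the host and script path are placeholders):

```php
<?php
// Sketch: build the export URL with an explicit query string. Appending
// "&action=pdfbook" to a pretty /wiki/Page URL never reaches the action
// handler; the action must be a real query parameter on index.php.
$base  = 'http://mywiki.com/wiki/index.php'; // placeholder host/path
$query = http_build_query( array(
	'title'  => 'Category:Software_Documentation',
	'action' => 'pdfbook',
) );
echo "$base?$query\n";
// For a single (non-category) page, also pass 'format' => 'single'.
```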

## How to export all articles to a single file

I can create PDF's containing a single category using the following url-call >> http://mywikibox/wikis/wiki_a/index.php?title=Category:CATNAME&action=pdfbook That’s fine so far in my MW 1.15.x setup using latest PdfBook trunk.

My question is simple: How to generate a PDF containing all articles of the entire wiki without creating a new category and adding all articles to that new category.
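One workaround, since PdfBook follows "* [[Title]]" bullets on a non-category page: generate a contents page that lists every article and export that page instead. A sketch of building the wikitext (the title list here is a placeholder; in practice it would come from the page table or a maintenance script):

```php
<?php
// Sketch: emit one "* [[Title]]" bullet per article. Saving this output
// as a wiki page and exporting that page with action=pdfbook pulls in
// every listed article, with no new category needed.
$titles   = array( 'Main Page', 'Help:Contents' ); // placeholder list
$wikitext = '';
foreach ( $titles as $t ) {
	$wikitext .= "* [[$t]]\n";
}
echo $wikitext;
```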

## Problem exporting pdfbook: all category titles (chapters) are the same name

See here for a solution. Cheers --[[kgh]] 16:15, 26 February 2011 (UTC)

## Patch filed: Error on single article with &action=pdfbook

Ahoy,

you probably have this problem often: you add the &action=pdfbook GET parameter in your web browser, but you get an empty PDF file because you have neither selected a category nor added the &format=single GET parameter. Annoying.

I don't want the &format=single GET parameter to be mandatory for single pages. This extension can figure out whether the selected page is a category page, so I added two lines and it is no longer necessary.

# svn diff
Index: PdfBook.php
===================================================================
--- PdfBook.php (Revision 82953)
+++ PdfBook.php (Arbeitskopie)
@@ -114,6 +114,8 @@
 				$text    = $wgParser->preprocess( $text, $title, $opt );
 				if ( preg_match_all( "/^\\*\\s*\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m", $text, $links ) )
 					foreach ( $links[1] as $link ) $articles[] = Title::newFromText( $link );
+				else
+					$articles = array( $title );
 			}
 		}

Voilà, here we go. Best regards, --Mquintus 21:00, 28 February 2011 (UTC)

## Vector Skin Sidebar Link for Pdf Print

Does anyone know how I can add the pdfbook link to the navigation/sidebar in the Vector skin, the same as in the Monobook skin? Has anyone got any ideas?

Found it from above under the section 'Requests':

# Create toolbox link
$wgHooks['SkinTemplateToolboxEnd'][] = 'fnPDFBookLink';

function fnPDFBookLink( &$vector ) {
	global $wgMessageCache, $wgPdfBookMessages;
	foreach( $wgPdfBookMessages as $lang => $messages ) {
		$wgMessageCache->addMessages( $messages, $lang );
	}
	$thispage = $vector->data['thispage']; // e.g. "Category:Wiki"
	$nsnumber = $vector->data['nsnumber']; // NS 14 is category
	if ( $nsnumber == 14 ) {
		echo "\n\t\t\t\t<li><a href=\"./$thispage?action=pdfbook\">";
		$vector->msg( 'pdf_book_link' );
		echo "</a></li>\n";
	}
	return true;
}


## Grabbing one more link level subpage to PDF (one more indent)

I needed to follow one more link level down to complete the product manual. It's working on the product page.

101a102
>						$articles[] = $title;
103c104,113
<						foreach ( $links[1] as $link ) $articles[] = Title::newFromText( $link );
---
>						foreach ( $links[1] as $link ) {
>							$articles[] = Title::newFromText( $link );
>							$subarticle = new Article( Title::newFromText( $link ) );
>							$text2      = $subarticle->fetchContent();

## FlaggedRevs and PDFBook

Hello, I have a problem: I want to print the current flagged version of my article, but PdfBook uses the last edited version (Article.php does not provide any other functions). So I tried to include the FlaggedRevs classes in PdfBook and got an error. I have no idea why it does not work.

Is there someone here who knows the problem? I want to use the FlaggedRevs classes to get the last flagged version.

--141.35.213.221 09:52, 16 September 2011 (UTC)

## Slightly different version

Excuse my lack of knowledge in how to update Wiki correctly, but I'm editing Boldly, so...

Some of the comments above revolve around:

• Not being able to use this extension on a historical document
• Other extensions not resolving

I've done a lot of work on this extension to modify it for my needs under v1.16. This version also incorporates many of the earlier comments and solutions, and of course it resolves the two issues I listed above.

My version has a lot of bespoke coding (strongly formatting documents based on their name, for example), so it's not reasonable to push all that into the mainstream.

So, for those who need it, the code is below.

<?php
/**
* PdfBook extension
* - Composes a book from articles in a category and exports as a PDF book
*
* See http://www.mediawiki.org/Extension:PdfBook for installation and usage details
* See http://www.organicdesign.co.nz/Extension_talk:PdfBook for development notes and discussion
*
* Started: 2007-08-08
*
* @package MediaWiki
* @subpackage Extensions
* @licence GNU General Public Licence 2.0 or later
*
*/
if (!defined('MEDIAWIKI')) die('Not an entry point.');

define('PDFBOOK_VERSION', '1.0.3, 2008-12-09');

$wgExtensionFunctions[] = 'wfSetupPdfBook';
$wgHooks['LanguageGetMagic'][] = 'wfPdfBookLanguageGetMagic';

$wgExtensionCredits['parserhook'][] = array(
	'name'        => 'PdfBook',
	'author'      => '[http://www.organicdesign.co.nz/nad User:Nad]',
	'description' => 'Composes a book from articles in a category and exports as a PDF book',
	'url'         => 'http://www.mediawiki.org/wiki/Extension:PdfBook',
	'version'     => PDFBOOK_VERSION
);

class PdfBook {

	function PdfBook() {
		global $wgHooks, $wgParser, $wgPdfBookMagic;
		global $wgLogTypes, $wgLogNames, $wgLogHeaders, $wgLogActions;
		$wgHooks['UnknownAction'][] = $this;

		# Add a new pdf log type
		$wgLogTypes[]             = 'pdf';
		$wgLogNames  ['pdf']      = 'pdflogpage';
		$wgLogHeaders['pdf']      = 'pdflogpagetext';
		$wgLogActions['pdf/book'] = 'pdflogentry';
	}

/**
* Perform the export operation
*/
	function onUnknownAction($action, $article) {
		global $wgOut, $wgUser, $wgTitle, $wgParser, $wgRequest;
		global $wgServer, $wgArticlePath, $wgScriptPath, $wgUploadPath, $wgUploadDirectory, $wgScript;
		if ($action == 'pdfbook') {

			$title   = $article->getTitle();
			$opt     = ParserOptions::newFromUser($wgUser);
			$oldpage = $wgRequest->getText('oldid');

# Log the export
			$msg = $wgUser->getUserPage()->getPrefixedText() . ' exported as a PDF book';
			$log = new LogPage('pdf', false);
			$log->addEntry('book', $wgTitle, $msg);

# Initialise PDF variables
			$format = $wgRequest->getText('format');

			# Set the format depending on the document title
			if (substr($title->getText(), 0, 2) == "FS")  $format = 'singlebook';
			if (substr($title->getText(), 0, 3) == "REQ") $format = 'singlebook';
			if (substr($title, 0, 9) == 'Category:') {
				$format  = 'book';
				$oldpage = 0;
			}
			# EOC

			$notitle = $wgRequest->getText('notitle');
			$layout  = $format == 'single' ? '--webpage' : '--firstpage c1';
			if ($format == 'singlebook') $layout = ' ';
			//$layout  = $format == 'single' ? ' ' : '--firstpage c1';
			$charset = $this->setProperty('Charset',      'iso-8859-1');
			$left    = $this->setProperty('LeftMargin',   '1cm');
			$right   = $this->setProperty('RightMargin',  '1cm');
			$top     = $this->setProperty('TopMargin',    '1cm');
			$bottom  = $this->setProperty('BottomMargin', '1cm');
			$font    = $this->setProperty('Font',         'Arial');
			$size    = $this->setProperty('FontSize',     '8');
			$linkcol = $this->setProperty('LinkColour',   '217A28');
			$levels  = $this->setProperty('TocLevels',    '2');
			$exclude = $this->setProperty('Exclude',      array());
			$width   = $this->setProperty('Width',        '');
			$width   = $width ? "--browserwidth $width" : '';
			if (!is_array($exclude)) $exclude = split('\\s*,\\s*', $exclude);

			# Select articles from members if a category, or links in content if not
			if ($format == 'single' || $format == 'singlebook') $articles = array($title);
			else {
				$articles = array();
				if ($title->getNamespace() == NS_CATEGORY) {
					$db     = wfGetDB(DB_SLAVE);
					$cat    = $db->addQuotes($title->getDBkey());
					$result = $db->select(
						'categorylinks',
						'cl_from',
						"cl_to = $cat",
						'PdfBook',
						array('ORDER BY' => 'cl_sortkey')
					);
					if ($result instanceof ResultWrapper) $result = $result->result;
					while ($row = $db->fetchRow($result)) $articles[] = Title::newFromID($row[0]);
				}
				else {
					$text = $article->fetchContent();
					$text = $wgParser->preprocess($text, $title, $opt);
					if (preg_match_all('/^\\*\\s*\\[{2}\\s*([^\\|\\]]+)\\s*.*?\\]{2}/m', $text, $links))
						foreach ($links[1] as $link) $articles[] = Title::newFromText($link);
				}
			}

			# Format the article(s) as a single HTML document with absolute URL's
			$book = $title->getText();
			$html = '';
			$titlehtml = '';
			$titledone = 0;
			$wgArticlePath = $wgServer . $wgArticlePath;
			$wgScriptPath  = $wgServer . $wgScriptPath;
			$wgUploadPath  = $wgServer . $wgUploadPath;
			$wgScript      = $wgServer . $wgScript;

			# Output some basic metadata for HTMLDOC:
			$html = "<html><head>";
			#if (substr($book, 0, 3) != "EST" && substr($book, 0, 2) != "FS") {
			#if (substr($book, 0, 3) != "EST") {
$html .= "<title>$book</title>";
#}
			foreach ($articles as $title) {
				$ttext = $title->getPrefixedText();
				if (!in_array($ttext, $exclude)) {
					$article = new Article($title);
					$text    = $article->fetchContent(strlen($oldpage) == 0 ? 0 : $oldpage);
					$text    = preg_replace('/<!--([^@]+?)-->/s', '@@'.'@@$1@@'.'@@', $text); # preserve HTML comments
					if ($format != 'single' && $format != 'singlebook') $text .= '__NOTOC__';
					$opt->setEditSection(false);  # remove section-edit links
					$wgOut->setHTMLTitle($ttext); # use this so DISPLAYTITLE magic works
					$text    = $wgParser->preprocess($text, $title, $opt, strlen($oldpage) == 0 ? 0 : $oldpage);
					$out     = $wgParser->parse($text, $title, $opt, true, true);
					//$ttext = $wgOut->getHTMLTitle();
					$text    = $out->getText();
					$text    = preg_replace('|(<img[^>]+?src=")(/.+?>)|', "$1$wgServer$2", $text);       # make image urls absolute
					$text    = preg_replace('|<div\s*class=[\'"]?noprint["\']?>.+?</div>|s', '', $text); # non-printable areas
					$text    = preg_replace('|@{4}([^@]+?)@{4}|s', '<!--$1-->', $text);                  # HTML comments hack
					//$text  = preg_replace('|<table|', '<table border borderwidth=2 cellpadding=3 cellspacing=0', $text);
					// Ignore Links code
					$text    = preg_replace('|<a|', '<span', $text);
					$text    = preg_replace('|</a>|', '</span>', $text);
					// EOC
					//$ttext = basename($ttext);
					$h1      = $notitle ? '' : "<h1>$ttext</h1>";
					if (strpos($text, '<!-- TOC -->') !== FALSE) {
						$titlehtml = utf8_decode(substr($text, 0, strpos($text, '<!-- TOC -->')));
						$text      = utf8_decode(substr($text, strpos($text, '<!-- TOC -->') + 12));
						$h1        = '';
					}

					if ($format != 'single' && $format != 'singlebook' && $titledone == 0) {
						$titlehtml = utf8_decode("$text\n");
						$titledone = 1;
					} else {
						if (stripos($ttext, "appendix") == true) {
							$html .= utf8_decode("$text\n");
						} else {
							if (substr($book, 0, 3) == "EST" || substr($book, 0, 2) == "FS" || substr($book, 0, 3) == "REQ") {
								$html .= utf8_decode("$text\n");
							} else {
								$html .= utf8_decode("$h1$text\n");
							}
						}
					}
				}
			}

			# Finish off the basic HTML for the production
			$html .= "</body></html>";

			# If format=html in query-string, return html content directly
			if ($format == 'html') {
				$wgOut->disable();
				header("Content-Disposition: attachment; filename=\"$book.html\"");
				print $titlehtml . $html;
			}
			else {
				# Write the HTML to a tmp file
				$titlefile = "$wgUploadDirectory/" . uniqid('pdf-book');
				$tfh = fopen($titlefile, 'w+');
				fwrite($tfh, $titlehtml);
				fclose($tfh);
				$file = "$wgUploadDirectory/" . uniqid('pdf-book');
				$fh = fopen($file, 'w+');
				fwrite($fh, $html);
				fclose($fh);

				$footer = $format == 'single' ? '...' : '../';
				//$footer = '../';
				$header = $format == 'single' ? '...' : '..t';
				$toc    = $format == 'single' ? '' : " --toclevels $levels --toctitle \"Contents\" --tocheader $header --tocfooter ..i";
				//$toc  = " --toclevels $levels --toctitle \"Contents\" --tocheader $header --tocfooter ..i";
				$cmd  = " --book";
				$cmd .= " --links --linkstyle plain --linkcolor $linkcol";
				$cmd .= " --title --titlefile $titlefile";
				$cmd .= " --size A4 --numbered";
				$cmd .= " --left $left --right $right --top $top --bottom $bottom ";
				$cmd .= " --header $header --header1 $header --footer $footer --nup 1";
				$cmd .= "$toc";
				$cmd .= " --portrait --color --no-pscommands --no-xrxcomments --compression=9";
				$cmd .= " --jpeg=75 --fontsize $size --fontspacing 1.1 --headingfont $font --bodyfont $font";
				$cmd .= " --headfootsize $size --headfootfont $font --charset $charset";
				$cmd .= " --no-embedfonts --pagemode document --pagelayout single $layout";
				$cmd .= " --permissions all";
				$cmd .= " --browserwidth 680 --no-strict --no-overflow";
				$cmd  = "htmldoc -t pdf14 $cmd $file";

				# Send the file to the client via htmldoc converter
				$wgOut->disable();
				header("Content-Type: application/pdf");
				header("Content-Disposition: attachment; filename=\"$book.pdf\"");
				putenv("HTMLDOC_NOCGI=1");
				passthru($cmd);
				@unlink($file);
				@unlink($titlefile);
			}
			return false;
		}
		return true;
	}

	/**
	 * Return a property for htmldoc using global, request or passed default
	 */
	function setProperty($name, $default) {
		global $wgRequest;
		if ($wgRequest->getText("pdf$name"))   return $wgRequest->getText("pdf$name");
		if (isset($GLOBALS["wgPdfBook$name"])) return $GLOBALS["wgPdfBook$name"];
		return $default;
	}

	/**
	 * Needed in some versions to prevent Special:Version from breaking
	 */
	function __toString() { return 'PdfBook'; }
}

/**
 * Called from $wgExtensionFunctions array when initialising extensions
*/
function wfSetupPdfBook() {
	global $wgPdfBook;
	$wgPdfBook = new PdfBook();
}

/**
* Needed in MediaWiki >1.8.0 for magic word hooks to work properly
*/
function wfPdfBookLanguageGetMagic(&$magicWords, $langCode = 0) {
	global $wgPdfBookMagic;
	$magicWords[$wgPdfBookMagic] = array($langCode, $wgPdfBookMagic);
	return true;
}

// Add-on for a print link in the toolbar menu
$wgHooks['SkinTemplateBuildNavUrlsNav_urlsAfterPermalink'][] = 'wfSpecialPdfNav';
$wgHooks['SkinTemplateToolboxEnd'][] = 'wfSpecialPdfToolbox';

function wfSpecialPdfNav( &$skintemplate, &$nav_urls, &$oldid, &$revid ) {
	$nav_urls['pdfprint'] = array(
'href' => $nav_urls['href'].'?action=pdfbook&format=single&notitle&oldid='.$oldid
);
return true;
}

function wfSpecialPdfToolbox( &$monobook ) {
	if ( isset( $monobook->data['nav_urls']['pdfprint'] ) )
		if ( $monobook->data['nav_urls']['pdfprint']['href'] == '' ) {
			?><li id="t-ispdf"><?php echo htmlspecialchars( $monobook->data['nav_urls']['pdfprint']['text'] ); ?></li><?php
		} else {
			?><li id="t-pdf"><?php
			?><a href="<?php echo htmlspecialchars( $monobook->data['nav_urls']['pdfprint']['href'] ) ?>"><?php
			echo htmlspecialchars( $monobook->data['nav_urls']['pdfprint']['text'] );
?></a><?php
?></li><?php
}
return true;
}


--217.158.90.2 12:56, 31 January 2012 (UTC)

## Corrupt PDF file when opened contains error message

The PDF file created from a page actually contains this:

HTMLDOC Version 1.8.27 Copyright 1997-2006 Easy Software Products, All Rights Reserved.
This software is based in part on the work of the Independent JPEG Group.

ERROR: No HTML files!

Usage:
htmldoc [options] filename1.html [ ... filenameN.html ]

Options:

--batch filename.book
--bodycolor color
--bodyfont {courier,helvetica,monospace,sans,serif,times}
--bodyimage filename.{bmp,gif,jpg,png}
--book
…


Any ideas how to proceed with this?

Thanks,

Gareth.

### Possible Solutions

#### Fix variable quoting

I am running MediaWiki on Windows (via a Bitnami installation), and the default path contains a space (the "Program Files" part). PdfBook does not quote the path when executing the htmldoc command, so the command line is not parsed correctly and the above error is what appears in the PDF file. The solution is to change line 73 in PdfBook.php from:

	$cmd = "htmldoc -t pdf --charset $charset $cmd $file";


to this:

	$cmd = "htmldoc -t pdf --charset $charset $cmd \"$file\"";
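An alternative to hand-quoting, sketched below: let PHP quote the argument with escapeshellarg(). The path here is a made-up example containing a space; only the quoting behaviour is the point.

```php
<?php
// Sketch: quote the temp-file path with escapeshellarg() so spaces in
// the install path ("Program Files", etc.) cannot split the argument.
$file = '/tmp/Program Files/pdf-book-demo'; // hypothetical path with a space
$cmd  = "htmldoc -t pdf --charset iso-8859-1 " . escapeshellarg( $file );
echo $cmd, "\n";
```

On POSIX systems escapeshellarg() wraps the path in single quotes; on Windows it uses double quotes, so it also covers the Bitnami case above.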


#### Fix SELinux labels

I have just installed PdfBook on a Centos 7 machine, and got the same error. In my case the reason was SELinux preventing httpd from writing to the images directory. Fixing the labels for /path/to/mediawiki/install/images(/.*)? as described in SELinux fixed the issue for me. --217.253.60.186 23:52, 19 January 2016 (UTC)

## $wgPdfBookFormat I wanted to control in LocalSettings.php whether to have format=single or not. This is what I came up with (add$wgPdfBookFormat = "single"; to your LocalSettings.php or not), but it leaks memory, maybe someone with actual PHP knowledge has a better idea. Thanks, 67.164.57.135 04:18, 8 June 2012 (UTC)

--- PdfBook-svn/PdfBook.hooks.php.orig  2012-06-07 23:59:31.142185353 +0200
+++ PdfBook-svn/PdfBook.hooks.php       2012-06-08 06:05:41.806193614 +0200
@@ -143,12 +143,14 @@ class PdfBookHooks {
*/
public static function onSkinTemplateTabs( $skin, &$actions) {
 		global $wgPdfBookTab;
+		global $wgPdfBookFormat;
+		if ( $wgPdfBookFormat == "single" ) { $format = "&format=single"; } else { $format = ""; }
 		if ( $wgPdfBookTab ) {
 			$actions['pdfbook'] = array(
 				'class' => false,
 				'text' => wfMsg( 'pdfbook-action' ),
-				'href' => $skin->getTitle()->getLocalURL( "action=pdfbook&format=single" ),
+				'href' => $skin->getTitle()->getLocalURL( "action=pdfbook$format" ),
);
}
return true;
@@ -160,12 +162,14 @@ class PdfBookHooks {
*/
public static function onSkinTemplateNavigation( $skin, &$actions ) {
 		global $wgPdfBookTab;
+		global $wgPdfBookFormat;
+		if ( $wgPdfBookFormat == "single" ) { $format = "&format=single"; } else { $format = ""; }
 		if ( $wgPdfBookTab ) {
 			$actions['views']['pdfbook'] = array(
 				'class' => false,
 				'text' => wfMsg( 'pdfbook-action' ),
-				'href' => $skin->getTitle()->getLocalURL( "action=pdfbook&format=single" ),
+				'href' => $skin->getTitle()->getLocalURL( "action=pdfbook$format" ),
);
}
return true;
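The diff above duplicates the same if/else in both hooks; it could be factored into one helper. A sketch only: the function name is mine, and $wgPdfBookFormat is the setting proposed above, not an official extension option.

```php
<?php
// Sketch: compute the tab's query string once from the proposed
// $wgPdfBookFormat setting, instead of repeating the if/else per hook.
function pdfBookTabQuery() {
	global $wgPdfBookFormat;
	$fmt = ( isset( $wgPdfBookFormat ) && $wgPdfBookFormat == 'single' )
		? '&format=single' : '';
	return "action=pdfbook$fmt";
}

$wgPdfBookFormat = 'single';
echo pdfBookTabQuery(), "\n"; // action=pdfbook&format=single
```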


## htmldoc binaries location

I installed htmldoc on an up-to-date Debian 6, but the binaries were not in /usr/local/bin as shown in the setup command; they were in /usr/bin instead. It works since I fixed the path, but I'm not sure I did the right thing.

Shimegi (talk) 07:03, 3 July 2012 (UTC)

This is correct. Different distributions use different paths --Pastakhov (talk) 07:26, 3 July 2012 (UTC)

## TOC numbering when exporting a category

When I'm exporting a category, I have a problem:

There are pages A and B in the category.

Pages A and B each have two levels of headers.

What I get in the resulting PDF is:

1) Title of Page A
2) Header Level 1 of Page A
2.1) Header Level 2 of Page A
3) Another Header Level 1 of Page A
4) Title of Page B
5) Header Level 1 of Page B
6) Another Header Level 1 of Page B
6.1) Header Level 2 of Page B


whereas the more correct result IMHO would be:

1) Title of Page A
1.1) Header Level 1 of Page A
1.1.1) Header Level 2 of Page A
1.2) Another Header Level 1 of Page A
2) Title of Page B
2.1) Header Level 1 of Page B
2.2) Another Header Level 1 of Page B
...
2.2.1) Header Level 2 of Page B


You get the idea...

### Attempt to fix

This fix increments the level of each header in a document by one (in a plain way).

 foreach ( $articles as$title ) {
$ttext =$title->getPrefixedText();
if ( !in_array( $ttext,$exclude ) ) {
$article = new Article($title );
$text =$article->fetchContent();
$text = preg_replace( '/<!--([^@]+?)-->/s', '@@'.'@@$1@@'.'@@', $text ); # preserve HTML comments if ($format != 'single' ) $text .= '__NOTOC__';$opt->setEditSection( false );    # remove section-edit links
$wgOut->setHTMLTitle($ttext );   # use this so DISPLAYTITLE magic works
$out =$wgParser->parse( $text,$title, $opt, true, true );$ttext   = $wgOut->getHTMLTitle();$text    = $out->getText();$text    = preg_replace( '|(<img[^>]+?src=")(/.+?>)|', "$1$wgServer$2",$text );       # make image urls absolute
$text = preg_replace( '|<div\s*class=[\'"]?noprint["\']?>.+?</div>|s', '',$text ); # non-printable areas
 		$text    = preg_replace( '|@{4}([^@]+?)@{4}|s', '<!--$1-->', $text ); # HTML comments hack
 		$text    = preg_replace( '|<table|', '<table border borderwidth=2 cellpadding=3 cellspacing=0', $text );
 		# JM 2012-07-26
 		$text = preg_replace( '/<h5/', '<h6', $text );
 		$text = preg_replace( '/<h4/', '<h5', $text );
 		$text = preg_replace( '/<h3/', '<h4', $text );
 		$text = preg_replace( '/<h2/', '<h3', $text );
 		$text = preg_replace( '/<h1/', '<h2', $text );
 		$text = preg_replace( '|</h5|', '</h6', $text );
 		$text = preg_replace( '|</h4|', '</h5', $text );
 		$text = preg_replace( '|</h3|', '</h4', $text );
 		$text = preg_replace( '|</h2|', '</h3', $text );
 		$text = preg_replace( '|</h1|', '</h2', $text );
 		# end JM
 		$ttext   = basename( $ttext );
 		$h1      = $notitle ? '' : "<center><h1>$ttext</h1></center>";
 		$html .= utf8_decode( "$h1$text\n" );
 	}
 }

Note the preg_replace calls in between the comments. By the way, if you'd like to keep the page breaks, you can replace

$text = preg_replace( '/<h1/', '<h2', $text );

by

$text = preg_replace( '/<h1/', '<!-- NEW PAGE --><h2', $text );

## Problems with Htmldoc (1.9.0)

Hello, with the new version of htmldoc (the one used by Arch Linux), I got all my content on one single line (I mean all the content of all pages on only one line!). To fix this, I added a <body> tag around the HTML content in PdfBook.hooks.php:

$html = "<body>" . $html . "</body>";

I hope it may help other users! Mathieu

## Is this extension actively maintained?

I'm interested in porting this extension to use the wkhtmltopdf library, which does a superior job of HTML-to-PDF conversion than htmldoc. Has anybody already done any work on this? Is the extension being actively maintained and developed? Andrujhon (talk)

## Maintenance

Aran Dunkley seems to be off to Brazil. I just sent him a Facebook message to find out whether he's going to maintain the PdfBook extension. There seem to be a few modified versions out there already; please add any other links. I'm volunteering to set up a new maintained version on GitHub if Aran doesn't want to continue to work on the svn version. -- Seppl2013 (talk) 15:54, 22 September 2013 (UTC)

## Got it working!

I somehow got it working. I'm not exactly sure which steps are the ones that helped, so here are some things I did:

1. I did not use the .php files listed here. Instead, I used the ones User:Seppl20 pointed to at github.
2. Installed HTMLDOC into the Apache (I'm using XAMPP) cgi-bin folder.
3. I don't know anything about Apache PATH settings, so I tried to follow the instructions here to open Apache's path.
4. Finally, I tried following the HTMLDOC manual instructions for a little bit, but I kept getting stuck with an empty .pdf. I went to my host computer, tried the same change to the URL (?title=Category:blah&action=pdfbook), and found that I was getting the missing LIBEAY32.DLL error.
5. Blindly copied/pasted the htmldoc.exe, LIBEAY32.DLL, MSVCR.dll, SSLEAT32.DLL files into every possible directory from the server down to the wiki. I can't be certain which folder was the one that needed it. I want to say it was /htdocs/mywiki, but again... blind pasting everywhere.

I read a lot of comments about commenting out lines or adding in small amounts of text, but my final product did not entail any of that. Good luck! Hollymollybobolly (talk) 22:35, 20 December 2013 (UTC)

## Set href format=single if page is not a category (wgPdfBookTab = true)

I modified the code so that format=single is used if the page is not a category.

+++ PdfBook/PdfBook.hooks.php	2014-03-19 11:19:05.000000000 +0100
@@ -162,11 +162,20 @@
 		global $wgPdfBookTab;

 		if ( $wgPdfBookTab ) {
-			$actions['views']['pdfbook'] = array(
-				'class' => false,
-				'text' => wfMsg( 'pdfbook-action' ),
-				'href' => $skin->getTitle()->getLocalURL( "action=pdfbook&format=single" ),
-			);
+			if ( $skin->getTitle()->isContentPage() ) {
+				$actions['views']['pdfbook'] = array(
+					'class' => false,
+					'text' => wfMsg( 'pdfbook-action' ),
+					'href' => $skin->getTitle()->getLocalURL( "action=pdfbook&format=single" ),
+				);
+			}
+			else {
+				$actions['views']['pdfbook'] = array(
+					'class' => false,
+					'text' => wfMsg( 'pdfbook-action' ),
+					'href' => $skin->getTitle()->getLocalURL( "action=pdfbook" ),
+				);
+			}
}
return true;
}



## Compatibility with MediaWiki v1.23

I'm currently on branch wmf/1.23wmf19 and I found the headings (h2, h3, etc.) wouldn't show up correctly in the PDF. It turns out that the headings now contain some extra span tags that don't go well with HTMLDOC. I wrote a hack in Perl to remove the unwanted elements. My PHP isn't up to par to implement it well in PdfBook, so I hope someone will find this useful and implement it properly. Here are the Perl regexps I used.

$content =~ s/<span class="mw-headline".*?>(.*?)<\/span>/$1/g;
$content =~ s/<span class="mw-editsection-bracket">(.*?)<\/span>//g;
$content =~ s/<span class="mw-editsection">(.*?)<\/span>//g;


In PHP, add

$text = preg_replace( '/<span class="mw-headline" id="(.*?)">(.*?)<\/span>/', "$2", $text );

after

$text    = preg_replace( "|<div\s*class=['\"]?noprint[\"']?>.+?</div>|s", "", $text ); // non-printable areas

## How to hide some items identified by CSS id or class when exporting to PDF?

I would like the table of contents (of the wiki page) and some tables identified by the class "wikitable" not to appear in the PDF file. Thanks, cheers. Nicolas NALLET (talk) 13:49, 23 April 2014 (UTC)

## sidebar customization in MW >= 1.18

$wgMessageCache is deprecated since 1.18, so the solution shown in a previous topic is no longer working.

Preferably the link should be dynamic, so that for single pages it would contain "&format=single", whereas for categories it need not.

I would like to have a PDF file or two downloadable from the description page for this extension. If one could see a selected category on the one side and the resulting PDF on the other, it would be much easier to get an idea of how it works and what one can expect it to achieve. Maybe a simple example with about 10 pages, and a bigger one with sub-categories, pictures, tables, and about 100 pages? --Manorainjan (talk) 14:46, 2 November 2014 (UTC)

## Missing $wgOut->setHTMLTitle line in PdfBook.php

According to http://www.mediawiki.org/wiki/Extension:PdfBook you should edit line 122 of PdfBook.php, but my PdfBook.php (downloaded from the snapshot for release 1.23) has only 48 lines. Downloads for other branches give the same. I also noticed that version 1.1.0 is shown under Special:Version, whereas the above-mentioned link states that the latest release is 1.2.3. Has there been some hacking done? At least I got it working now.

## How to resize images in MediaWiki 1.24 with PdfBook 1.1.0, 2014-04-01

I have downloaded PdfBook v1.1.0 and created a PDF of a single page with a large image on it. The image is not resized to fit the PDF and is cut off. I tried following these instructions: http://www.mediawiki.org/wiki/Extension_talk:PdfBook#Hacks_to_change_PDF_output_.28v._0.6.29 but updating PdfBook.hooks.php instead of PdfBook.php, as that appears to be the right file since version 0.6 of the extension. However, it doesn't work. If I include the full line:

 $cmdext = " --browserwidth 800 --titlefile $wgUploadDirectory/PDFBook.html";

the PDF cannot be opened and gives an error that it is either not supported or corrupted. If I set the line to:

 $cmdext = " --browserwidth 800";


the PDF generates, but the image is not resized. Is there a way to have images resized in version 1.1.0?

### Fix for the image problem

Generally, all the images look larger than they do on the wiki page, and some images don't even fit on the page. There is a third problem: all the images have an approx. 40px indent, so they are not in line with the text. Here is my fix, which I've been using for a long time:

Put the following code between these lines:

 113:     $text = preg_replace( "|(<img[^>]+?src=\"$imgpath)(/.+?>)|", "<img src=\"$wgUploadDirectory$2", $text );
          ....fix goes here....
 114: }
 115: if( $nothumbs == 'true' ) $text = preg_replace( "|images/thumb/(\w+/\w+/[\w\.\-]+).*\"|", "images/$1\"", $text );

And here is the fix:

 // If any image is wider than this, it will be resized to this width
 $max_image_size = 650;

 // By default, every image in the PDF has a 40px indent. If this is true,
 // the images will be in line with the text
 $remove_image_indent = true;

 // Generally, the images in the PDF are larger than they were in the wiki page.
 // If this flag is true, all the images whose width wasn't already capped to
 // 'max_image_size' will be resized by the percentage defined in 'resize_percentage'.
 $resize_small_images = false;
 $resize_percentage = 80;

 if ( $remove_image_indent ) {
     $text = preg_replace( "|<dl><dd>(<a .+><img.+\/><\/a>)<\/dd><\/dl>|", "$1", $text );
 }
 if ( preg_match_all( "/<img[^>]+?width=\"([0-9]+)\"/", $text, $matches ) ) {
     for ( $i = 0; $i < count( $matches[1] ); $i++ ) {
         $w = $matches[1][$i];
         if ( preg_match( "/<img[^>]+?width=\"$w\"\s+height=\"([0-9]+)\"/", $text, $matches2 ) ) {
             $h = $matches2[1];
             if ( $w > $max_image_size ) {
                 $w2 = $max_image_size;
                 $h2 = round( $h * ( $w2 / $w ) );
                 $text = preg_replace( "|width=\"$w\"\s+height=\"$h\"|", "width=\"$w2\" height=\"$h2\"", $text );
             } else if ( $resize_small_images ) {
                 $w2 = $w * ( $resize_percentage / 100 );
                 $h2 = round( $h * ( $w2 / $w ) );
                 $text = preg_replace( "|width=\"$w\"\s+height=\"$h\"|", "width=\"$w2\" height=\"$h2\"", $text );
             }
         }
     }
 }

## Can't Generate PDF From Category (MediaWiki 1.24 with PdfBook 1.1.0)

I tried to generate a PDF using a category as per this example:

 http://www.foo.bar/wiki/index.php?title=Category:foo&action=pdfbook

but the PDF couldn't be opened, with an error that the file isn't supported or is corrupted. This is the log output for the request:

 Start request GET /index.php/?title=Category:My%20Category&action=pdfbook
 HTTP HEADERS:
 HOST: xxx
 USER-AGENT: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0
 ACCEPT: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 ACCEPT-LANGUAGE: en-US,en;q=0.5
 ACCEPT-ENCODING: gzip, deflate
 COOKIE: wikiUserName=Admin; base-viewstate=opened; flexiskin-viewstate=opened; wiki_session=ac690952f6cac677a345f2168db15abc; wikiUserID=1
 VIA: 1.1 xxx:8080 (squid/2.6.STABLE21)
 X-FORWARDED-FOR: xxx, xxx
 CACHE-CONTROL: max-age=259200
 CONNECTION: keep-alive
 [caches] main: EmptyBagOStuff, message: SqlBagOStuff, parser: SqlBagOStuff
 [caches] LocalisationCache: using store LCStoreDB
 Fully initialised
 Connected to database 0 at localhost
 Connected to database 0 at localhost
 MessageCache::load: Loading en... got from global cache
 Title::getRestrictionTypes: applicable restrictions to Main Page are {edit,move}
 [ContentHandler] Created handler for wikitext: WikitextContentHandler
 User: cache miss for user 1
 User: loading options for user 1 from database.
 User: logged in from session
 Unstubbing $wgLang on call of $wgLang::_unstub from ParserOptions::__construct
 User: loading options for user 1 from override cache.
 [deprecated] Use of wfMsg was deprecated in MediaWiki 1.21. [Called from PdfBookHooks::onUnknownAction in /var/www/mediawiki-1.24.1/extensions/PdfBook/PdfBook.hooks.php at line 18]
 [deprecated] Use of wfMsgReal was deprecated in MediaWiki 1.21. [Called from wfMsg in /var/www/mediawiki-1.24.1/includes/GlobalFunctions.php at line 1479]
 [deprecated] Use of wfMsgGetKey was deprecated in MediaWiki 1.21. [Called from wfMsgReal in /var/www/mediawiki-1.24.1/includes/GlobalFunctions.php at line 1577]
 DatabaseBase::query: Writes done: INSERT INTO logging (log_id,log_type,log_action,log_timestamp,log_user,log_user_text,log_namespace,log_title,log_page,log_comment,log_params) VALUES (NULL,'X')
 Unstubbing $wgParser on call of $wgParser::preprocess from PdfBookHooks::onUnknownAction
 Parser: using preprocessor: Preprocessor_DOM
 LoadBalancer::reuseConnection: this connection was not opened as a foreign connection
 Request ended normally

### Fixed Itself

I tried again, and this time the PDF generated properly. I didn't make any changes to the extension config, so I am not sure what happened.

## How it worked for me

Hi everybody, I post this topic because this extension is the only one of its kind actually working on my WAMP server. This is how:

I use MediaWiki 1.25.2 on a local server (as a personal wiki) with Wampserver on Windows 10.

- Installed htmldoc 1.8.28, binary from www.paehl.com, following the setup.txt included in the download. Note: you must install into c:\Program Files\HTMLDOC because this path is hardcoded somewhere.
- Installed PdfBook 1.1.0.
- Set $wgPdfBookFont = "Times New Roman"; otherwise some characters (e.g. single quotes) are not rendered.

And it works well, even for category export. Images are rendered inline regardless of their position in the wiki text. Thanks to the developers. Phcalle (talk) 11:38, 5 April 2016 (UTC)

## Fullurl parser function link automatically to every category page

It is written in Extension:PdfBook: "In order to include Fullurl parser function link automatically to every category page, add it to the Mediawiki:Categoryarticlecount page"

I tried this, but it didn't work.

## Headings not printed to PDF

I'm running MediaWiki 1.25 on a Linux VM. I installed htmldoc-1.8.28-4.el7.x86_64.rpm, put PdfBook in the extension folder, and updated LocalSettings.php with the line of code specified in the installation instructions. However, when I print a PDF, all the headings are omitted: any text within <h></h> tags is left out. The text from the headings does appear in the table of contents, but that is the only place it is visible.

Does anyone have any troubleshooting ideas for this problem?

## Blank page/Error 500 when exporting

I have an issue when trying to export (whether from a category or a single page): the page is totally blank in Mozilla, and returns an error 500 in IE. I reinstalled the prerequisite HTMLDOC and the extension, and I still have the problem. The MediaWiki version is 1.19, on a Red Hat OS. Has anyone encountered this problem?

Hi, an HTTP 500 error ("internal server error") will usually leave some info in your web server logs. You may find something like a PHP stack trace there that provides more information about the cause of the error. --Remco de Boer 18:52, 8 October 2016 (UTC)

Thanks a lot. We have this Apache error:

[Mon Oct 10 14:11:29 2016] [error] [client <ip>] PHP Fatal error:  Call to undefined method WikiPage::getContent() in <apache_URL>/extensions/PdfBook/PdfBook.hooks.php on line 74, referer: https://<server>/<wiki>/index.php/Accueil


I finally succeeded by using an older version of PdfBook. No more PHP errors.
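For context: WikiPage::getContent() only exists in MediaWiki 1.21+, so PdfBook builds that call it fatal on MW 1.19, which predates the ContentHandler API. Besides downgrading the extension, a hedged alternative is a small compatibility guard around the failing call in PdfBook.hooks.php — this is only a sketch, assuming $page is the WikiPage instance used at line 74:

```php
<?php
// Sketch of a version guard for PdfBook.hooks.php (illustrative, untested).
// $page is assumed to be the WikiPage object the extension reads text from.
if ( method_exists( $page, 'getContent' ) ) {
	// MediaWiki 1.21+: ContentHandler API
	$text = ContentHandler::getContentText( $page->getContent() );
} else {
	// MediaWiki 1.19/1.20: pre-ContentHandler accessor
	$text = $page->getRawText();
}
```

The method_exists() check keeps one copy of the extension working on both old and new MediaWiki instead of pinning it to a single release.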

## Article limit ?

Hello, I just installed the PdfBook extension on my MediaWiki, and it globally works fine (exporting a single article, exporting the articles of a category...). But when I tried to export a larger category (approximately 180 articles), I get this error: Fatal error: Maximum execution time of 30 seconds exceeded in <Wiki_directory>/includes/parser/Preprocessor_DOM.php on line XXXX (the line differs each time). Is there a way to solve this, please? Thanks in advance
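That fatal error is PHP's max_execution_time limit, not an article cap in PdfBook: rendering 180 articles simply takes longer than the default 30 seconds. One common workaround (a sketch, not part of the extension; the 300-second value is just an example) is to raise the limit in LocalSettings.php:

```php
<?php
// LocalSettings.php — raise PHP's execution limit so large category
// exports have time to render (example value; 0 means no limit).
ini_set( 'max_execution_time', 300 );
```

Note that ini_set() only works if the host allows overriding this directive; the same limit can also be raised globally in php.ini or the web server configuration.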

## Missing PDFBook.php?

I am attempting to install the master branch version on MediaWiki 1.27, running from a Turnkey Linux appliance. I've installed HTMLDOC using apt-get, with no errors. I went to download PdfBook and there seems to be no PdfBook.php file, resulting in nothing working when I add the lines to LocalSettings.php. Any advice on where to get this file? Alternatively, I tried the 1.24 branch, and while there is a PdfBook.php included in that, it does not work in my MediaWiki. Thanks in advance!

## Printing multiple copies of specific pages?

Not sure how to accomplish this, but say a category has a dozen pages I want to turn into a PDF, and I want 3 copies of one of those pages. I work at a site that has a fair amount of handwritten paperwork used daily, with some pages printed out 6-8 times at once for different stations. I'd like to automate this printing somehow.

Jhollinden (talk) 13:55, 28 March 2017 (UTC)

## Images too large (not resized)

I have an issue with the images (whatever the type, JPG, PNG...): they do not fit the page, and the big ones are cut off on the right. I found that when the option "--browserwidth" is set to a high value, large images fit entirely in the output PDF, but the smallest images become too small because of the reduction. Does someone have an idea, please?
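One compromise, building on the $cmdext hack discussed earlier on this page, is to pick a --browserwidth close to the width your wiki pages are actually authored for, so that only genuinely oversized images get scaled down. This is a hedged sketch (the 1024 value is an assumption to tune per wiki), placed wherever PdfBook.hooks.php assembles the htmldoc command:

```php
<?php
// Hypothetical tweak in PdfBook.hooks.php where the htmldoc command is built.
// htmldoc scales images relative to --browserwidth, so a value near the real
// content width shrinks oversized images without over-shrinking small ones.
$cmdext = " --browserwidth 1024";
```

If most images still render too large or too small, adjusting this one number up or down is usually cheaper than per-image resize hacks.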

## Problem with table borders

The borders of the tables are not visible in the output PDF (only the values in the cells). Is there an option or something to modify? Thanks.

## Why move from GitHub to GitLab?

Hi, why did you delete your GitHub source code? Users were using it, and not all of us like using GitLab. I understand you may have moved to GitLab due to Microsoft buying GitHub. But why? Microsoft is a very different company under their new CEO than they were before. Please give them a chance before giving up on GitHub. A lot of submodules are broken now (there was little warning).

Also, the GitLab project is private, so you have to log in to view it. Paladox (talk) 17:08, 6 June 2018 (UTC)

I have deleted everything from GitHub as a form of protest about what has happened. I understand that others may not feel the same way as me, or not as strongly, but Organic Design will not have its code managed by a Microsoft product. Feel free to fork it and maintain a repo on GitHub if you like. --Nad (talk) 19:56, 6 June 2018 (UTC)
p.s. Sorry it wasn't supposed to be private, it's public now. --Nad (talk) 20:01, 6 June 2018 (UTC)
I don't know why that would be; it's public now and I can go to the extensions repo and download a zip or clone it with no login required... --Nad (talk) 21:28, 6 June 2018 (UTC)
I am getting the same note. Admittedly I did not try to clone and see if at least this works. Cheers --[[kgh]] (talk) 21:30, 6 June 2018 (UTC)
does this work https://gitlab.com/OrganicDesign/extensions/tree/master/MediaWiki/PdfBook/ logged out? Paladox (talk) 21:31, 6 June 2018 (UTC)
Also per Microsoft, they are keeping GitHub separate so it will still be the same old GitHub (open) :). Paladox (talk) 21:31, 6 June 2018 (UTC)

Well, the user name was changed from OrganicDesign to Aranad. Fixed now. Cheers --[[kgh]] (talk) 21:37, 6 June 2018 (UTC)
The move to GitLab would have been an ideal occasion to separate the extensions you created. When downloading a zip etc., we still get all the software you created. Well, it's accessible, which is the main thing in the end. --[[kgh]] (talk) 21:41, 6 June 2018 (UTC)
Yeah, I thought about that, but I still need to go through a lengthy process in order to preserve the history. I have made some progress though, as I've documented the process for separating a specific folder out and nuking everything else including its history, so I think I'll be splitting them all out soon :-) --Nad (talk) 21:53, 6 June 2018 (UTC)
Great to read this. I guess this will make things much easier for novice users of your software. Well, for others, too. :) Cheers --[[kgh]] (talk) 21:56, 6 June 2018 (UTC)
I've just tested the process and, after a few tweaks, got it working; I'll update the links again. Here is PdfBook by itself with its history intact. I'll do the other most-used extensions soon. --Nad (talk) 22:37, 6 June 2018 (UTC)
That is really cool. Thanks a lot for doing this! Cheers --[[kgh]] (talk) 06:51, 7 June 2018 (UTC)

## Some problems, seem doesn't work...

Hi guys, I have some problems with this extension. I installed it and installed htmldoc, but if I visit a link such as:

it returns:

Why?

The link format is wrong: the first & should be a ?, and to be more independent of server configuration you should really use index.php?title=PAGENAME&action.... --Nad (talk) 12:57, 18 June 2018 (UTC)
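As an illustration of Nad's point (the host name is hypothetical and PAGENAME stands in for a real title), a sketch of building a correct export link from the standard MediaWiki globals:

```php
<?php
// The first query-string separator must be "?", later ones "&":
//   wrong: https://example.org/wiki/index.php&title=PAGENAME&action=pdfbook
//   right: https://example.org/wiki/index.php?title=PAGENAME&action=pdfbook
$url = "$wgServer$wgScriptPath/index.php?title=" . urlencode( $title ) . "&action=pdfbook";
```

Using index.php?title=... also works on wikis without short-URL rewrite rules, which is why it is more robust than the pretty-URL form.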

## Print as PDF tab only available for logged-in users

Hello,

I am migrating a MediaWiki installation from version 1.27 to 1.31 and so far so good, apart from one small thing regarding this extension.

I have enabled the "Print as PDF" tab, but it only appears if I am logged in. In my company we want everyone to be able to export articles to PDF without having to log in. We, the IT department, are pretty much the only editors of the wiki.

Is it standard behaviour now that the link is only available for logged-in users, or have I misconfigured something? In MediaWiki 1.27 with an older version of this extension, it used to work as desired.