Extension talk:PdfHandler

→ PdfHandler Talk Archive

Previous page history was archived for backup purposes at Extension talk:PdfHandler/LQT Archive 1 on 2015-06-25.

"0 × 0 pixel"

One comment • 20:08, 6 April 2024

School4schools (talkcontribs)

I have all the required packages installed (on AlmaLinux 8.9.0), including poppler-utils.

Software:

MediaWiki  1.41.0
PHP        8.1.27 (fpm-fcgi)
ICU        69.1
MariaDB    10.6.17-MariaDB
Lua        5.1.5
Pygments   2.16.1

Running "which gs convert pdfinfo pdftotext" returns:

/bin/gs
/bin/convert
/bin/pdfinfo
/bin/pdftotext

Uploaded PDFs still show "0 × 0 px". However, on the server, running

/pdfinfo "path/filename.pdf"

reports the page size correctly ("612 x 792 pts (letter)").

I have run the suggested maintenance scripts ("refreshImageMetadata.php -f" and "rebuildImages.php").

What am I doing wrong?
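
When debugging this symptom, it can help to confirm what pdfinfo itself reports for the file, independently of MediaWiki. The following is only a minimal sketch, assuming pdfinfo prints a line of the form "Page size: 612 x 792 pts (letter)". The function name page_size is made up for illustration and is not part of PdfHandler.

```shell
# Hypothetical helper: read pdfinfo output on stdin and print "WIDTH HEIGHT"
# (in points) taken from the first line that starts with "Page size:".
page_size() {
    awk '/^Page size:/ { print $3, $5; exit }'
}

# Usage (the path is a placeholder):
# pdfinfo "path/filename.pdf" | page_size
```

If this prints sensible numbers when run as the web server user, the tool itself is probably fine and the stored file metadata is the more likely suspect.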

Edited 20:08, 6 April 2024

/bin/bash: Permission denied

One comment • 07:11, 2 April 2024

Bcmpinc (talkcontribs)

I'm getting an error during thumbnail generation. This is the error shown on the file page: Error creating thumbnail: sh: 1: /bin/bash: Permission denied

Something goes wrong in doTransform in PdfHandler.php, but I don't understand what. I suspect the command line gets mangled somewhere. I've replaced $err = wfShellExecWithStderr( $cmd, $retval ); in that function with exec($cmd, $err, $retval);. This seems to be a functional workaround.

Edited 07:11, 2 April 2024

Installing xpdf-utils

3 comments • 23:19, 25 March 2024

TelePointHistory (talkcontribs)

I am trying to use PdfHandler to show thumbnails, but I know I don't have the prerequisites.

Running the command in SSH: which gs convert pdfinfo pdftotext

shows that I have ghostscript but no packages for xpdf-utils

/bin/gs
/bin/convert
/usr/bin/which: no pdfinfo etc
/usr/bin/which: no pdftotext etc

Looking at this page https://www.xpdfreader.com/download.html I don't know what file to download or where to put it once it's downloaded. I am on Siteground hosting with MW 1.35.3, if that makes a difference. I'm hoping someone can give me a basic rundown of how it's meant to work. Thanks.
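
The check described above can be wrapped in a small function that lists only the tools that are missing. This is a rough sketch, and the function name missing_tools is invented for illustration. On many Linux distributions pdfinfo and pdftotext come from the poppler-utils package rather than from the Xpdf download page, though on shared hosting installing packages usually requires the hosting provider.

```shell
# Print the name of each given tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage:
# missing_tools gs convert pdfinfo pdftotext
```

Empty output means all four tools were found for the user running the command. The web server user may have a different PATH.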

02:04, 26 September 2020

School4schools (talkcontribs)

Did you ever get this extension working with an Xpdf-utils / XpdfReader installation?

23:19, 25 March 2024

Kghbln (talkcontribs)

I guess Siteground has to install this on the server for you. The best way is to contact their support.

09:31, 26 September 2020

No PDF/thumbnail, issue executing pdfinfo/pdftotext, Windows Server 2012 R2, IIS 8.5, MW 1.31

6 comments • 10:39, 27 November 2023

Tommyheyser (talkcontribs)

MW 1.31.1 running on Windows Server 2012 R2 IIS 8.5

I'm getting the following error (from the $wgDebugLogFile output log) for every execution of pdfinfo and pdftotext.

[exec] Error running "pdfinfo" "-enc" "UTF-8" "-meta" "C:/inetpub/wwwroot/w/images/f/f4/Phone_List.pdf": 'pdfinfo" "-enc" "UTF-8" "-meta" "C:' is not recognized as an internal or external command, operable program or batch file.

I'm not sure if this is the result of the new Shell framework introduced in 1.30, Manual:Shell framework, which replaces wfShellExec(). The debug log line before the error is:

[exec] MediaWiki\Shell\Command::execute: "pdfinfo" "-enc" "UTF-8" "-meta" "C:/inetpub/wwwroot/w/images/f/f4/Phone_List.pdf"

Edited 00:06, 7 November 2018

Tommyheyser (talkcontribs)

This seems to be related to this discussion Topic:Ugtwpe4lyuly6q98 regarding PHP, Windows and Shell.

00:09, 7 November 2018

Tommyheyser (talkcontribs)

In case someone else is not seeing PDFs and is running MW 1.31 on Windows Server 2012 R2, here is what worked for me.

I added the path to pdfinfo.exe and pdftotext.exe to the system Path variable (mine was C:\Program Files\xpdf-tools-win-4.00\bin64).
Then I edited the function retrieveMetaData in {mediawiki install path}/extensions/PdfHandler/includes/PdfImage.php.

a. Replacing:

$cmdMeta = [
	$wgPdfInfo,
	'-enc', 'UTF-8', # Report metadata as UTF-8 text...
	'-meta',         # Report XMP metadata
	$this->mFilename,
];

with

$cmdMeta = "pdfinfo.exe -enc UTF-8 -meta " . $this->mFilename;

b. Replacing

$cmdPages = [
	$wgPdfInfo,
	'-enc', 'UTF-8', # Report metadata as UTF-8 text...
	'-l', '9999999', # Report page sizes for all pages
	$this->mFilename,
];

with

$cmdPages = "pdfinfo.exe -enc UTF-8 -l 9999999 " . $this->mFilename;

c. Replacing

$cmd = [ $wgPdftoText,  $this->mFilename, '-' ];

with

$cmd = "pdftotext.exe " . $this->mFilename;

It's a bit of a hack, but it works. This should last until the issue is properly fixed.

Edited 22:15, 7 November 2018

173.77.3.157 (talkcontribs)

Thank you for the information. I have modified the code further to keep pdfinfo and pdftotext from hanging. If anyone needs PdfHandler on Windows Server, you can download the modified extension at https://github.com/SeongMoon/MW-v1.31.1-PdfHandler-Windows-Server

16:15, 26 January 2019

TomRamm (talkcontribs)

Since the source code has changed considerably in the meantime, this approach no longer works. I have done the following to make it work for me:

I created a new file in the scripts subfolder:

scripts/retrieveMetaData.cmd

@echo off

if NOT "%PDFHANDLER_INFO%" == "" call:runInfo
if NOT "%PDFHANDLER_TOTEXT%" == "" call:runToText

EXIT /B %ERRORLEVEL%

:runInfo
	call "%PDFHANDLER_INFO%" -enc UTF-8 -meta file.pdf > meta
	call "%PDFHANDLER_INFO%" -enc UTF-8 -l 9999999 file.pdf > pages
EXIT /B 0

:runToText
	call "%PDFHANDLER_TOTEXT%" file.pdf - > text
	echo %ERRORLEVEL% > text_exit_code

EXIT /B 0

In includes/PdfImage.php, in the function retrieveMetaData, I changed how the script is called depending on the operating system. On Linux the original code is used. On Windows the .cmd script is called instead of the .sh script, and it is invoked directly rather than being passed as a parameter to the shell.

if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') {
	# 'This is a server using Windows!'
	$result = $command
		->params( 'scripts/retrieveMetaData.cmd' )
		->inputFileFromFile(
			'scripts/retrieveMetaData.cmd',
			__DIR__ . '/../scripts/retrieveMetaData.cmd' )
		->inputFileFromFile( 'file.pdf', $this->mFilename )
		->outputFileToString( 'meta' )
		->outputFileToString( 'pages' )
		->outputFileToString( 'text' )
		->outputFileToString( 'text_exit_code' )
		->environment( [
			'PDFHANDLER_INFO' => $wgPdfInfo,
			'PDFHANDLER_TOTEXT' => $wgPdftoText,
		] )
		->execute();
} else {
	# 'This is a server not using Windows!'
	$result = $command
		->params( $wgPdfHandlerShell, 'scripts/retrieveMetaData.sh' )
		->inputFileFromFile(
			'scripts/retrieveMetaData.sh',
			__DIR__ . '/../scripts/retrieveMetaData.sh' )
		->inputFileFromFile( 'file.pdf', $this->mFilename )
		->outputFileToString( 'meta' )
		->outputFileToString( 'pages' )
		->outputFileToString( 'text' )
		->outputFileToString( 'text_exit_code' )
		->environment( [
			'PDFHANDLER_INFO' => $wgPdfInfo,
			'PDFHANDLER_TOTEXT' => $wgPdftoText,
		] )
		->execute();
}

10:39, 27 November 2023

Mwgbell (talkcontribs)

I had a similar problem using ImageMagick 7.1.0-19 Q16-HDRI with MediaWiki 1.37.1 on Windows 11. To fix it, edit extensions\PdfHandler\includes\PdfHandler.php.

Change this:

$cmd .= " | " . wfEscapeShellArg(
	$wgPdfPostProcessor,
	"-depth",
	"8",
	"-quality",
	$wgPdfHandlerJpegQuality,
	"-resize",
	$width,
	"-",
	$dstPath
);

to this (i.e. move the "-" so that it comes directly after the $wgPdfPostProcessor line):

$cmd .= " | " . wfEscapeShellArg(
	$wgPdfPostProcessor,
	"-",
	"-depth",
	"8",
	"-quality",
	$wgPdfHandlerJpegQuality,
	"-resize",
	$width,
	$dstPath
);

17:32, 25 January 2022

Previous/Next Page functionality

One comment • 21:58, 6 September 2023

47.186.29.164 (talkcontribs)

On the file upload page for the PDF I see navigation buttons for next/previous pages, but I see no such navigation buttons on the page where the file is displayed. What am I doing wrong?

21:58, 6 September 2023

Direct linking to PDF page, When clicking to direct media

2 comments • 13:38, 17 July 2023

Gmillerd (talkcontribs)

Does anyone have a modification of the extension that makes clicking the PDF go to the specified page when a page is given? That is, one that changes the link from

/mediawiki/index.php?title=File:Filename.pdf&page=25

to the following, to make the browser skip to the specified page?

/images/0/0b/Filename.pdf#page=25

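I am able to do it in JavaScript, but the PHP eludes me.

// Append "#page=N" to the full-size file link when the URL carries a page parameter.
var page = new URLSearchParams(location.search).get("page");
if (page) $("#file.fullImageLink").find("a:first").attr("href", function (i, href) {
    return href + "#page=" + page;
});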

Edited 21:19, 4 July 2018

212.59.13.226 (talkcontribs)

Use "#page=25" instead of "...&page=25".

08:07, 26 November 2018

No PDF images displayed

4 comments • 16:44, 2 July 2023

Darlig Gitarist (talkcontribs)

PDFHandler extension is supposed to allow viewing of pdf files. However, this does not appear to be working as advertised.

We've gone through the troubleshooting section for this extension on MediaWiki and double-checked the paths to the PDF converters. We re-ran the maintenance scripts for images and image metadata. We checked the logs.

There is no indication of errors other than the images not showing up.

MediaWiki 1.35.5
PHP 7.4.27 (fpm-fcgi)
MySQL 5.7.37-0ubuntu0.18.04.1-log
ICU 60.2
Lua 5.1.5
PDF Handler – (16eda4b) 20:58, 2022 January 23

Any help or suggestions would be appreciated.

Edited 21:55, 12 February 2022

Cboltz (talkcontribs)

Wild guess: Some Linux distributions (for example openSUSE) have disabled rendering of PDF files in their default ImageMagick config because it has been a steady source of security issues (for example "ImageTragick"). In openSUSE, you'd need to install the ImageMagick-config-7-upstream package to enable rendering of PDF files.

Note: I don't know if Ubuntu did something similar with the ImageMagick config.

If unsure, test if converting a PDF to an image in the shell works: convert foo.pdf foo.png

16:58, 14 February 2022

Drewsaur (talkcontribs)

I have done this, and still can't get the extension to work. Any other ideas?

15:06, 14 January 2023

Michele.Fella (talkcontribs)

Check /etc/ImageMagick-<your_version>/policy.xml.

If it contains <policy domain="coder" rights="none" pattern="PDF" />, convert is not allowed to process PDF files.

You can change this to rights="read | write",

but you should be aware of, and take responsibility for, the security risks this might bring (as Cboltz mentioned).
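
The check described above can be sketched as a small shell function. This is only an illustration under stated assumptions: the name pdf_blocked is made up, the policy is assumed to sit on a single line with pattern="PDF" spelled exactly like that, and lines containing "<!--" are treated as commented out. Policy files that group several formats into one pattern would need a different match.

```shell
# Hypothetical helper: succeed (exit status 0) if the given policy.xml has an
# active line that grants no rights for the PDF coder.
pdf_blocked() {
    grep -v '<!--' "$1" | grep 'pattern="PDF"' | grep -q 'rights="none"'
}

# Usage (substitute your own version directory):
# pdf_blocked /etc/ImageMagick-<your_version>/policy.xml && echo "PDF is blocked"
```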

16:44, 2 July 2023

PDFHandler not working. Still displays File Link

3 comments • 16:43, 2 July 2023

199.27.199.51 (talkcontribs)

PdfHandler is confirmed as installed on Special:Version, which returns all the required directories, and no extensions could be interfering with the install. What's the issue?

22:26, 20 December 2022

Drewsaur (talkcontribs)

I am having this issue too. I have changed the settings in ImageMagick so that PDFs are able to be converted; verified this at the command line; verified that all 4 related utilities are working at the command line; run all the maintenance scripts; and...nothing.

15:05, 14 January 2023

Michele.Fella (talkcontribs)

Check /etc/ImageMagick-<your_version>/policy.xml.

If it contains <policy domain="coder" rights="none" pattern="PDF" />, convert is not allowed to process PDF files.

You can change this to rights="read | write",

but you should be aware of, and take responsibility for, the security risks this might bring (see the post below from Cboltz).

16:43, 2 July 2023

Timeout

One comment • 22:35, 23 January 2023

87.165.252.36 (talkcontribs)

Error creating thumbnail: limit.sh: timed out executing command "('/usr/bin/gs' '-sDEVICE=jpeg' '-sOutputFile=-' '-sstdout=%stderr' '-dFirstPage=1' '-dLastPage=1' '-dSAFER' '-r150' '-dBATCH' '-dNOPAUSE' '-q' '/opt/mediawiki/images/7/7e/A.K.2023.pdf' | '/usr/bin/convert' '-depth' '8' '-quality' '95' '-resize' '800' '-' '/tmp/transform_96151e2ec90e.jpg')"

I have already changed some settings, but obviously not the right ones. What can I do to get it to work on a PDF file with a very large number of pixels in each direction?
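
The limit.sh prefix indicates that MediaWiki's shell time limit was hit. That limit is set by $wgMaxShellTime in LocalSettings.php (180 seconds by default, as far as I know). To find out how long the rendering step really takes, the Ghostscript command can be run by hand under a time limit. The sketch below assumes the coreutils timeout command is available. The function name run_limited is invented for illustration, and the usage line is an untested adaptation of the command in the error message, with the output sent to /dev/null.

```shell
# Hypothetical helper: run a command with a time limit (in seconds) and report
# whether it finished. Relies on coreutils "timeout", which exits with
# status 124 when the limit is reached.
run_limited() {
    limit=$1; shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s"
    else
        echo "finished with exit status $status"
    fi
    return "$status"
}

# Usage, with the Ghostscript arguments taken from the error message:
# run_limited 180 /usr/bin/gs -sDEVICE=jpeg -sOutputFile=/dev/null -dFirstPage=1 \
#     -dLastPage=1 -dSAFER -r150 -dBATCH -dNOPAUSE -q /opt/mediawiki/images/7/7e/A.K.2023.pdf
```

If the manual run finishes only with a much larger limit, raising $wgMaxShellTime is one option. If it never finishes in a reasonable time, the file itself may simply be too large to render at this resolution.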

22:35, 23 January 2023

Any workaround for phabricator T220680/T211754

One comment • 15:30, 14 October 2022

Pspviwki (talkcontribs)

I tried to achieve the same functionality as, for example, the Commons page File:PDF metadata.pdf: a PDF browser within a single wiki page, where you can browse a PDF file page by page by entering a page number, built with the gallery tag and PdfHandler. Unfortunately, it does not work; it always shows only the first page. The end result was T220680 for PdfHandler and gallery, which went to T211754. The hack suggested on the Russian wiki does not work. This is a real blocker: the functionality works on the file page but not with the gallery tag. Since it works on the file page, how can it be achieved elsewhere? Is there any workaround? The various PDFembed extensions are unusable. Thanks.

15:30, 14 October 2022