Talk:Reading/Web/PDF Functionality/2019

From mediawiki.org

About giving feedback

Update: (15 July 2019) We’ve launched the new PDF renderer. We’re looking at feedback, but haven't so far seen any significant issues. We might incorporate some suggestions, but want to note that this is not an ongoing project with continuous development. In other words, now that it's deployed and proven to work, the new renderer is entering maintenance mode. This page won’t be abandoned, but it could take a while before anyone reacts, simply because everyone's got so much else to do.  

In terms of books, we've left it in the hands of volunteer developers and PediaPress. We'll be glad to reach out to them with questions, but we're not planning any involvement in terms of the technical implementation.

need help

Our students need to know many things. Please restore the download system quickly. Please! 103.87.234.9 (talk) 10:37, 7 January 2019 (UTC)Reply

You should be able to download PDFs for individual articles now. That might help your students get started.
But I do agree that the book creator needs stability checks, chapter headings added and rolling out ASAP. Cleaner presentation can follow later. Steelpillow (talk) 11:38, 7 January 2019 (UTC)Reply
Hi,
with
mediawiki2latex -u https://en.wikipedia.org/wiki/Book:River_martin -o out.pdf -k
you can get PDFs of books. If you tell me which ones you need I can generate them for you next weekend. Yours Dirk Dirk Hünniger (talk) 18:39, 7 January 2019 (UTC)Reply
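For anyone who wants to run the command above over several books, here is a minimal Python sketch. It only assembles the argument lists. The flags `-u`, `-o` and `-k` are copied as-is from the comment above; the helper names and the output-naming rule are assumptions made for illustration, not part of mediawiki2latex.

```python
import shlex


def output_name(book_url):
    """Derive a PDF file name from a book URL (hypothetical naming rule)."""
    # Take the last path segment and drop the "Book:" namespace prefix if present.
    title = book_url.rstrip("/").rsplit("/", 1)[-1]
    if title.startswith("Book:"):
        title = title[len("Book:"):]
    return title + ".pdf"


def build_command(book_url, output_pdf):
    """Return the argument list for one mediawiki2latex run."""
    # -u, -o and -k are used exactly as in the comment above.
    return ["mediawiki2latex", "-u", book_url, "-o", output_pdf, "-k"]


# Example list; extend it with the books you need.
books = ["https://en.wikipedia.org/wiki/Book:River_martin"]
commands = [build_command(url, output_name(url)) for url in books]

for command in commands:
    # Print a shell-safe version of each command for review.
    print(shlex.join(command))
```

If mediawiki2latex is installed locally, each list could then be passed to `subprocess.run(command, check=True)`, one book at a time.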

Upgrade 1.20->1.31.1 Create a Book says "Book Creator is undergoing changes" - Confused

I posted this to the project support desk page and did not get a reply so I'm trying here.

I am upgrading my MediaWiki to

MediaWiki  1.31.1

PHP        7.2.10 (apache2handler)

MariaDB    10.3.11-MariaDB

When I try and start the book creator I get a page that says:

"Book Creator is undergoing changes"

However, that page links to (here):

https://www.mediawiki.org/wiki/Reading/Web/PDF_Functionality

for details.  This page seems to indicate that the book generator is supposed to be operational, but I cannot tell so I don't know what should work and what does not.

What is the current status?  Is the issue that "Download as PDF" is not working, while creating a PDF via PediaPress should work?  Neither is working for me, but I am familiar with the process as I have extensively used book creation via "Download as PDF" in the past.

Thank you.  Brent 96.3.195.68 (talk) 14:10, 9 January 2019 (UTC)Reply

Uh, good question. I don't think the book-to-PDF creation should have been included in 1.31.1 but normal PDF creation should, but I'm guessing here. Do you know, @OVasileva (WMF)? Johan (WMF) (talk) 14:37, 9 January 2019 (UTC)Reply
The information in the article does seem to be unclear. PediaPress are currently involved in two different ways:
1. You can create a book, upload it to the PediaPress web site and order print-on-demand physical copies.
2. PediaPress are also rewriting Wikipedia's own PDF book renderer, and while they are doing this it is not possible to create or print a PDF softcopy wikibook. This is the main subject of the recent update posts.
But I do not know what functionality is included in the 1.31.1 build.
Hope this helps a little. Steelpillow (talk) 14:52, 9 January 2019 (UTC)Reply
The "Download as PDF" option is greyed out (it shows on the page but is unavailable for use).
The "Preview with PediaPress" is "available". So I gather it is supposed to work?
If there is a way for me to get on the inside track for enabling/testing an alpha or beta "Download as PDF", please let me know.
Thank you for your replies. Brent
p.s. I might suggest that the www.mediawiki.org/wiki/Reading/Web/PDF_Functionality be more clear about what users can expect to work/not-work in the application as-of a specific MediaWiki release. Brentl999 (talk) 15:23, 9 January 2019 (UTC)Reply
The "Preview with PediaPress" is the proprietary print-on-demand option, which is fully working and so is not greyed out.
There is an "alpha"-ish test build of the open-source "Download as PDF" book creator, though it does not yet have the Chapter headings wrapper or anything, at https://pediapress.com/collector Steelpillow (talk) 16:06, 9 January 2019 (UTC)Reply
That's very much aimed at users of the Wikimedia wikis, yes. I'll see what we can figure out. Johan (WMF) (talk) 15:31, 9 January 2019 (UTC)Reply
I see there is more discussion about this at Talk:Reading/Web/PDF Functionality/2018/12#h-New_Render_Server_for_PDF_generation-2018-12-19T17:23:00.000Z. (www.mediawiki.org/wiki/Talk%3AReading/Web/PDF%20Functionality/2018/12#h-New_Render_Server_for_PDF_generation-2018-12-19T17%3A23%3A00.000Z)
I'm just pasting this here so if someone is reading this thread in the future, hopefully, it will save them time. Brentl999 (talk) 17:56, 9 January 2019 (UTC)Reply
with http://mediawiki2latex.wmflabs.org/ you can get a PDF Dirk Hünniger (talk) 18:19, 9 January 2019 (UTC)Reply
Yes I found it! Thank you Dirk :) Brentl999 (talk) 18:29, 9 January 2019 (UTC)Reply
Just to close the loop on this thread in case someone else stumbles across this. My solution ultimately landed on getting MediaWiki 1.27 up with a version of Collection extension supporting book creation operational. Versions of MediaWiki after 1.27 do not appear to be compatible with the Collection extension that supports book creation. Brentl999 (talk) 22:55, 28 February 2019 (UTC)Reply
Hi Brent,
nice to hear that you found a solution. But consider that 1.27 will reach end of life in June 2019, which is four months from now.
See
https://www.mediawiki.org/wiki/Version_lifecycle
Yours Dirk Dirk Hünniger (talk) 06:40, 1 March 2019 (UTC)Reply
Hi all, apologies for the late reply. Unfortunately due to low usage, we will only be supporting PDF book creation via PediaPress for Wikimedia projects in the future, which is the renderer that PediaPress are currently working on. That said, the book creation process of the collections extension (everything outside the actual PDF download) is still supported and functional, as is the PDF creation from individual articles. It was a difficult decision for us to make, but fixing the book creator after retiring the OCG rendering service proved to be very complex and to require a lot of technical support in the future. As the usage of the feature was very low, we decided to continue providing the functionality on a smaller scale, and focus on rendering individual PDFs of articles instead. OVasileva (WMF) (talk) 10:08, 11 March 2019 (UTC)Reply
Hi,
thanks for the update. I feel a bit honoured that I do provide a feature ( http://mediawiki2latex.wmflabs.org/ ) in my free time which needs so much programmer time that the WMF decided that they cannot afford to implement it, especially since WMF got a budget of more than $90 million. Furthermore I implemented significant parts of the software when I was shaving cows in a cowshed for approx 5 EUR / hour and could not find any other job.
As a physicist this makes me laugh, like looking at measured data that surely does not resemble reality. If you see something like that, something has gone terribly wrong with the way you set up the experiment or the way you analysed the data. So yes the prophecy of the sect of purely functional programmers is true.
Quoting from the paper http://www.cse.chalmers.se/~rjmh/Papers/whyfp.pdf from 1984: "Functional programmers argue that there are great material benefits - that a functional programmer is an order of magnitude more productive than his conventional counterpart, because functional programs are an order of magnitude shorter."
Yours Dirk Dirk Hünniger (talk) 18:38, 11 March 2019 (UTC)Reply
When I learn that a book conversion which used to take five minutes on Wikipedia takes Dirk's system a time measured in hours, I wonder how much of that is down to hardware, how much to the chosen programming language, and how much to the code design. Maybe one day Dirk can install the forthcoming Wikimedia solution and see how fast it runs? Steelpillow (talk) 19:16, 11 March 2019 (UTC)Reply
I think it's because I am using HTTP to get the data and I only do one request at a time, since the servers might not respond otherwise. So you could get a significant speedup if you connected to the database directly. But there is one large cost, which is the compilation with LaTeX: it has to be done 4 times in order to get the references right, and that takes at least 20% of the runtime, likely more. So it will still be hours for large books, independent of the programming language and hardware used; in particular, you cannot parallelize the LaTeX run. You can get the LaTeX source from the mediawiki2latex web server and do your own measurements. You could do it without LaTeX, but you will not get the typographic quality that way. Dirk Hünniger (talk) 22:21, 11 March 2019 (UTC)Reply
So I consulted some documentation. The largest book I compiled was roughly 5000 pages; it is here
https://de.wikipedia.org/wiki/Benutzer:M2k~dewiki/B%C3%BCcher/Ausgew%C3%A4hlte_Beitr%C3%A4ge_und_Bearbeitungen
and here
https://drive.google.com/file/d/1SA6TEKWrdpXAxDyHZe-umBa2cJ5Ya77X/view?usp=sharing
It took about 9 hours to compile. From other documentation I found that 31% of the runtime is due to the LaTeX compile step on average.
So such a book will take about 3 hours to compile at minimum, independent of any software used to prepare the LaTeX source. Dirk Hünniger (talk) 22:40, 11 March 2019 (UTC)Reply
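The lower bound above is simple arithmetic, sketched here in Python. The 9 hours and the 31% share are the figures quoted in the comment; the function name is hypothetical, and the sketch assumes the average share also applies to this particular book.

```python
def latex_floor_hours(total_hours, latex_share):
    """Hours spent in the LaTeX step, which faster preprocessing cannot remove."""
    return total_hours * latex_share


# Figures from the comment above: 9 hours in total, 31% of it in LaTeX.
floor_hours = latex_floor_hours(9.0, 0.31)
print(f"LaTeX step alone: about {floor_hours:.1f} hours")
```

Rounded to whole hours this matches the "about 3 hours" stated above.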
Perhaps that LaTeX processing overhead is why the Proton project tried to do it with html/mathml via headless Chrome? That was the mistake - it proved the wrong core to build books on. Even the single-article renderer is still warning of compositing problems. With hindsight, refreshing OCG would have been a faster and lower-risk strategy (evolution not revolution, as they say). It will be interesting to see how the PediaPress code performs. Steelpillow (talk) 08:40, 12 March 2019 (UTC)Reply

Book Creation -> Cached Books

Hi,

I could generate PDF versions of all community-maintained books in the English Wikipedia and store them in a cloud accessible via SFTP. I could update each PDF once a year. We could link from the book template to the cloud with Lua. Are you interested?

Yours Dirk Dirk Hünniger (talk) 17:46, 9 January 2019 (UTC)Reply

So, I started a minimal setup now: a 5-year-old dual-core laptop next to my fridge, with one process at a time. The first three resulting PDFs are available here https://drive.google.com/drive/folders/17g5Ey6jauKd3CLMDNBOnV3RYKcJN0QZu?usp=sharing . I need about 1 hour per PDF and each has a size of 20 MByte on average. Since I have roughly 6000 PDFs to make, this will take about 250 days and use about 120 GByte of disk space. Dirk Hünniger (talk) 13:11, 2 June 2019 (UTC)Reply
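The batch estimate above can be reproduced with a few lines of Python. The inputs (6000 PDFs, 1 hour and 20 MByte each) come from the comment; the `workers` parameter is a hypothetical addition to show how parallel processes would change the duration, and it assumes perfect scaling.

```python
def batch_estimate(pdf_count, hours_per_pdf, mbyte_per_pdf, workers=1):
    """Return (days, gbyte) needed to render a batch of PDFs."""
    # Total wall-clock time, spread evenly over the parallel workers.
    days = pdf_count * hours_per_pdf / (24 * workers)
    # Disk usage in decimal GByte (1 GByte = 1000 MByte).
    gbyte = pdf_count * mbyte_per_pdf / 1000
    return days, gbyte


days, gbyte = batch_estimate(6000, 1, 20)
print(f"{days:.0f} days, {gbyte:.0f} GByte")
```

With these inputs the function returns 250 days and 120 GByte, the same figures as in the comment.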
Sounds like a great candidate for a rapid grant for a dedicated faster computer. Sj (talk) 16:55, 10 June 2019 (UTC)Reply
I think the computer is not so much of a problem at the moment. I can afford the necessary hardware and I am quite happy if I don't have to do more paperwork. But what I actually think I need is a community decision on the English Wikipedia that says that I can run a bot to upload the pdfs on the English Wikipedia. I started the discussion here: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(idea_lab) . I think the 20 PDF files I generated up to now are enough to come to a decision on that. Dirk Hünniger (talk) 17:21, 10 June 2019 (UTC)Reply

MediaWiki2LaTeX rendering Server for large documents

Hi,

During the weekend I was able to reduce the memory consumption of mediawiki2latex significantly. I set up a new server with 4 hours max runtime per request. Now nearly every community-maintained book on the English Wikipedia should compile.

http://mediawiki2latex-large.wmflabs.org

Yours Dirk Dirk Hünniger (talk) 17:16, 14 January 2019 (UTC)Reply

Scam.

These 'developments' are scams. The head developer is affiliated with pediapress.com (a for profit company).

The longer these "developments" take place, the more people will use their paid services!

Conflict of interest. No free stuff, people will pay!

Dear pediapress.com, tell me that your sales did not quadruple since the PDF service was shut down?

Corrupt. Aliasabuhanifah (talk) 14:26, 19 January 2019 (UTC)Reply

That is childishly silly. The project spent several years failing badly, before PediaPress offered to help us out. The new code is open licensed. By doing this they are actually reducing their potential to make money off their PoD service. They are good people. To be perfectly clear to you, I have no business relationship with PediaPress, other than buying a few printed books off them. I hope they did make a little beer money! Steelpillow (talk) 16:07, 19 January 2019 (UTC)Reply
Hi,
I kind of see that there is a conflict-of-interest problem with that too. And mixing computer science with w:Token economy generally seems a bad choice to me. So I keep http://mediawiki2latex.wmflabs.org/ updated to provide an alternative export to PDF, EPUB and ODT, fully open source and available on Debian. It will be interesting to see how both projects will evolve in the future.
Yours Dirk Dirk Hünniger (talk) 20:30, 19 January 2019 (UTC)Reply
a. Concepts like "assume good faith" and "civility" are at the core of what we do for a reason. Please don't poison the discussion climate. It is perfectly possible to discuss potential conflict of interest in a way that is in line with Wikimedia norms and expected behavior.
b. A fair chunk of the development time lies at the feet of the WMF, when the old solution was breaking down, the new renderer couldn't effectively handle collections and the Foundation, looking at the number of people who were using the books-to-PDF solution, couldn't justify taking people away from other projects to work on it.
c. I suspect you vastly overestimate the long-term financial viability of discouraging the use of collections if one's business model is printing collections of articles. The typical reaction to not being able to generate a PDF in the way one had hoped is to not generate the PDF. Printing a book is rarely a reasonable alternative to downloading a PDF. PediaPress stepped in because they want this to work.
d. The developers have to do other work that's actually putting food on the table. Johan (WMF) (talk) 09:56, 20 January 2019 (UTC)Reply
TL;DR: This isn't a scam, it's the result of WMF prioritising working on more widely used functions leaving this to volunteer developers related to PediaPress. Johan (WMF) (talk) 10:01, 20 January 2019 (UTC)Reply

What problems??

Now for the 3rd time I have downloaded and printed a DeWP article as PDF without any problems. The download felt a bit faster than the average of other PDF files of the same size. The page formatting leaves nothing to be desired. (However, it would be an advantage to specify a page number on each page.) The marked links in the text also work. When printing, I get exactly what I see on the screen. (Win7 prof. 64-bit, Firefox 56.0.2 64-bit, Adobe Acrobat Reader DC) Bestoernesto (talk) 03:13, 22 January 2019 (UTC)Reply

Why don't you block the PDF button if it does not work?
I tried it several times, but in vain. This function simply does not work, so avoid pushing the PDF button! 2003:D2:F717:3747:4D7D:7773:2A44:5960 (talk) 15:24, 5 February 2019 (UTC)Reply
you may download your pdf from http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 19:03, 5 February 2019 (UTC)Reply

Kerning and spacing update

The headline task was closed last summer. Is this issue fully resolved, or is some other task still open for it? Steelpillow (talk) 04:32, 22 January 2019 (UTC)Reply

Just bringing this query back to the top. Steelpillow (talk) 10:44, 31 January 2019 (UTC)Reply
Well, as you probably already know, in http://mediawiki2latex.wmflabs.org/ the PDF generation is done by LaTeX and thus does kerning and spacing correctly. Still, we had to disable microtypography since it is supported by pdflatex but not by xelatex, which we now have to use for Unicode reasons.
Yours Dirk Dirk Hünniger (talk) 06:50, 2 February 2019 (UTC)Reply

Barbarian - cannot download as pdf

https://en.wikipedia.org/wiki/Barbarian

"[...]AppData\Local\Temp\xYkAqxS7.pdf.part could not be saved, because the source file could not be read."

anon392+1 88.114.88.225 (talk) 23:12, 24 January 2019 (UTC)Reply

Noted, thanks for reporting. Johan (WMF) (talk) 06:18, 25 January 2019 (UTC)Reply
Seems to have been fixed. I couldn't download it a couple of hours ago but I can now. Steelpillow (talk) 16:55, 25 January 2019 (UTC)Reply
This is the message that I get "C:\Users\GABRIE~1\AppData\Local\Temp\o7THyHln.pdf.part could not be saved, because the source file could not be read.
Try again later, or contact the server administrator." GabeJoe55 (talk) 13:31, 29 January 2019 (UTC)Reply

Nearly none of the pages can be exported as a pdf

Hi there,

None of the pages that I tried to get as PDF today worked, e.g.

https://en.wikipedia.org/wiki/Fibonacci_number

https://de.wikipedia.org/wiki/Nikola_Tesla

https://de.wikipedia.org/wiki/Indien


Are you guys aware of this problem or is it the same problem Barbarian described?

Thank you 83.79.6.33 (talk) 13:35, 25 January 2019 (UTC)Reply

The problem with Barbarian seems to be a problem with rogue code in the page. But it is proving hard to diagnose, as when I edit a page the PDF renderer still renders the old version for quite some time. Clearing the main cache makes no difference, so there is evidently some other cache being used by the renderer. I wonder if the downloader is caching the PDF, not picking up on the fact that it is out of date, and just serving it without thinking. Steelpillow (talk) 15:35, 25 January 2019 (UTC)Reply
with http://mediawiki2latex.wmflabs.org/ you can get a PDF of Fibonacci. Feel free to test the others
Yours Dirk Dirk Hünniger (talk) 07:58, 26 January 2019 (UTC)Reply
Hi Dirk,
Thank you for this option. I tried it online, but had no success, as it always shows an error message of not having enough resources available. I will try to download the local version . 151.248.201.162 (talk) 19:50, 26 January 2019 (UTC)Reply
Hi,
you could also try the alternative server http://mediawiki2latex-large.wmflabs.org/
But you are right open source programming is a bit like real communism: bananas are not available every day.
Yours Dirk Dirk Hünniger (talk) 08:41, 27 January 2019 (UTC)Reply

https://de.wikipedia.org/wiki/Israel - Not possible to download as PDF

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


https://de.wikipedia.org/wiki/Israel Haluk Uluhan (talk) 19:53, 26 January 2019 (UTC)Reply

Noted. Thank you. Johan (WMF) (talk) 02:01, 27 January 2019 (UTC)Reply
Hi,
with http://mediawiki2latex.wmflabs.org/ I do get a result for Israel.
Yours Dirk Dirk Hünniger (talk) 08:58, 27 January 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

PDF download mistake in Chinese language

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

Using Chrome's "Save as PDF" print option instead

Why don't we use the "Save as PDF" option in Chrome's print dialog instead? Click "Printable option", then click "Destination printer", and choose "Save as PDF".

But we must fix the "Download as PDF" one. This one is a temporary option. 42.113.252.255 (talk) 06:03, 31 January 2019 (UTC)Reply

Not everybody uses Chrome. Different browsers offer different print features, so there is little point in tailoring MediaWiki for any of them. Steelpillow (talk) 10:42, 31 January 2019 (UTC)Reply
all browsers print to PDF! So much work for this feature when it's not that crucial 125.168.98.111 (talk) 13:06, 24 February 2019 (UTC)Reply
Download as PDF is part of "Book" creation which is not just about printing one page in a Wiki to PDF. The full fledged book solution prints collections of pages as a "publication" with table of contents, chapters, etc and it is crucial functionality that unbelievably has gone unprovided for 2 years. Brentl999 (talk) 22:48, 28 February 2019 (UTC)Reply

Poor download service

I am trying to download Pdf's which are probably several pages long. Absolutely no chance of doing so. Instead I receive a Word download which will not open properly and produces streams of code.

I also receive a screen note saying ''network error''

As this download problem has been in existence for some time, could someone find an answer ASAP? 90.255.95.165 (talk) 15:40, 14 February 2019 (UTC)Reply

The new PDF renderer (for single articles) will hopefully be in production within a couple of weeks; I'll write an update here then. Johan (WMF) (talk) 15:51, 14 February 2019 (UTC)Reply
try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 08:41, 15 February 2019 (UTC)Reply

Letters of a word are mapped over each other

For some words, the letters were rendered on top of each other so that the word became unreadable. Safety1st rail (talk) 15:59, 14 February 2019 (UTC)Reply

Thanks for reporting! Could you tell us which article you had a problem with? Johan (WMF) (talk) 16:55, 14 February 2019 (UTC)Reply
If it was a specific article. Johan (WMF) (talk) 16:55, 14 February 2019 (UTC)Reply
try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 08:40, 15 February 2019 (UTC)Reply

Downloading Wikipedia article as .pdf

Hello,

There is nothing wrong with the feature itself, but the print is too small.

Regards,


J.P. (Jan) clifford

~~~~


145.129.136.48 (talk) 16:33, 14 February 2019 (UTC)Reply

Thanks for your comment. Johan (WMF) (talk) 08:00, 17 February 2019 (UTC)Reply

Very professional, illustrative, efficient, educational, etc. The wiki is the best in the world!

Thank you! 181.225.228.219 (talk) 15:19, 15 February 2019 (UTC)Reply

Thank you. Johan (WMF) (talk) 07:59, 17 February 2019 (UTC)Reply

Topic: Auto Diagnostic Code article

The article is very complex and sadly at first glance I could not find any info on misfiring issues. Will read it again later. 32.211.214.62 (talk) 20:58, 16 February 2019 (UTC)Reply

Well, if everything works, it works. (: Johan (WMF) (talk) 07:59, 17 February 2019 (UTC)Reply

New problem with downloading Wikipedia article as PDF

Hello, I rarely use PDF file creation. The penultimate time was half a year ago. Until then everything was perfect. Has anything been changed in the software in the meantime? Today I created another PDF file and discovered the following problem. Sometimes two normally consecutive letters appear printed on top of each other. Sometimes the overlap is only 10 percent, sometimes 30, sometimes 60, 70, 90 or 100 percent. Occasionally, even the letter after next is partially covered. In addition, there are sometimes even overlaps between two letters between which there is actually a space. The frequency varies a lot: occasionally it happens twice in a line, but sometimes not at all on a whole page. In between everything is possible. No pattern is recognizable. I use exactly the same setup as half a year ago: Win7 (64bit), FF56.0.2, AdobeReader.--Bestoernesto (talk) 01:50, 18 February 2019 (UTC)Reply

try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 18:30, 18 February 2019 (UTC)Reply

Problem with downloading PDFs. They won't.

Tried downloading Solar System, Uranus and Neptune pages each day this week 19th Feb 2019.  No joy. 171.33.194.55 (talk) 12:52, 19 February 2019 (UTC)Reply

try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 19:17, 19 February 2019 (UTC)Reply
Nope, nice try but it says 'not enough resources to process'.
The wiki PDF problem seems only to have arisen in the past 10 days. Any knowledge of when it might be fixed? 5.150.98.4 (talk) 12:45, 20 February 2019 (UTC)Reply
you need to try again later. Alternatively you can try the second server
http://mediawiki2latex-large.wmflabs.org/ Dirk Hünniger (talk) 18:21, 20 February 2019 (UTC)Reply
I cannot download the PDF files. Please help me. Thank you. 223.233.6.40 (talk) 09:30, 27 February 2019 (UTC)Reply
which of the servers did you try?
http://mediawiki2latex.wmflabs.org/
or
http://mediawiki2latex-large.wmflabs.org/
Could you post a link to the article you tried to convert? Dirk Hünniger (talk) 18:09, 27 February 2019 (UTC)Reply

Further to "Problem with downloading PDFs. They won't" below

I too tried converting the wiki Solar system page and it would not work.


I got a message saying 'could not read source code'


But converting wiki/Alcubierre_drive to a PDF did. No problem.


Clearly there is some thing wrong with some pages.


The folk at wikipedia need to look at this. They might start by comparing the source code of both pages! 171.33.194.55 (talk) 13:56, 26 February 2019 (UTC)Reply

try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 17:36, 26 February 2019 (UTC)Reply

Updates

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


It is now six months since the last update on the main page and a year since the last update at the head of this talk page. Please could somebody take the time out to update these pages and inform us what, if anything, has genuinely happened since last time? Even "Progress stalled again" would be better than no update. Steelpillow (talk) 15:45, 26 February 2019 (UTC)Reply

Noted. Johan (WMF) (talk) 17:41, 26 February 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

Single-page PDFs

We're seeing some problems with the rendering of single-page PDFs. We're aware of this problem. We're launching the new renderer any day now(tm), which will hopefully solve them. Johan (WMF) (talk) 17:41, 26 February 2019 (UTC)Reply

Can you please clarify, what is the currently deployed renderer and what will the new one be? The current one on en.Wikipedia appears to be produced by Skia/PDF m58 and created by (presumably headless) Chromium. I have found a page on this MediaWiki for Proton, but it is not informative about rollout status. Also, what is Skia/PDF m58 and is it about to change? Steelpillow (talk) 11:39, 14 March 2019 (UTC)Reply
Hi!
Skia is an open-source graphics library, it's been relevant for us for quite some time so it's not a new thing:
https://en.wikipedia.org/wiki/Skia_Graphics_Engine
https://www.mediawiki.org/wiki/Proton is indeed the new renderer. It's been running in the background for testing for weeks, but we haven't actually made the switch yet. (Which I thought we would have. My apologies for saying "soon!" a bit too often. This thing that always seems to be two weeks away.) Johan (WMF) (talk) 14:23, 15 March 2019 (UTC)Reply
https://phabricator.wikimedia.org/tag/proton/ is the tag for all related Phabricator tasks. Johan (WMF) (talk) 14:23, 15 March 2019 (UTC)Reply
OK, thanks. Further to that I have now found out that the current renderer is Electron, which also uses Chromium (though whether headless or not I don't know). I take it that it will remain unchanged until the relevant two weeks for Proton finally make their appearance. Steelpillow (talk) 17:41, 15 March 2019 (UTC)Reply
Yes, sorry, should have clarified that as well. Johan (WMF) (talk) 20:58, 15 March 2019 (UTC)Reply

Replace PDF System

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


You can download and use "Libreoffice". As the Latin name implies, it is free. Available for the trash (aka microsoft), Linux, Unix, Mac,

VMS and all other systems. 2001:579:811C:322:D9E9:5FD4:E555:C45E (talk) 17:00, 5 March 2019 (UTC)Reply

They're not really the same use cases, though. Part of the point of the PDF is reliable, consistent presentation in a reasonably aesthetically pleasing manner. Johan (WMF) (talk) 09:03, 6 March 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

Update on books

Will a new update on books be coming out at some point? @Johan (WMF) @OVasileva (WMF) PhotographerTom (talk) 20:23, 5 March 2019 (UTC)Reply

We have been putting it off because we don't have that much new to say (although some comments have been left here on the talk page), but we should probably update just to give an idea of what's happening even if it's not much. This should come once https://phabricator.wikimedia.org/T186748 has properly taken over single-page PDF rendering (which should be soon), so that we can do both at one time. (: Johan (WMF) (talk) 09:02, 6 March 2019 (UTC)Reply
Well, I still don't understand why you stopped the book creation and failed to fix it for such a long period. There are no precedents or parallels in all the industry of software. You also did not describe WHAT the issue was, and I gather many readers would be able to understand. I suspect some other reason, and we're not ready to see a new version. 2A01:E35:2FF4:4170:1CE6:2DF:EE04:CC0F (talk) 22:26, 27 March 2019 (UTC)Reply
Having worked in software development outside of the Wikimedia Foundation, I'd like to point out that it's actually fairly common for a feature to disappear and potentially come back much later because some dependency had changed and it was difficult to fix there and then. However, this is usually done in a discreet manner to save face and avoid complaints; the explanation is rarely "eh, it broke down and we couldn't (assign the resources to) fix it right now" even when that's what actually happened.
The short answer is that the old software was breaking down, and we had to replace it (hurriedly, without proper time to come up with a perfect solution). The new solution turned out to work for individual articles, not for books, because rendering books is far more difficult simply because the files are bigger.
Additionally, we simply don't have the resources – the number of hands writing or directing code – to do everything we'd like to do, and having taken a look at the numbers of book-to-PDF downloads it was decided that it was difficult to defend spending as much time and resources on it from the Wikimedia Foundation as would have been necessary, given that it would have taken that time and those resources from other projects. PediaPress agreed to take over, but they do it in their spare time. Johan (WMF) (talk) 01:44, 28 March 2019 (UTC)Reply
Let us say it's a consistent answer, but I am still surprised. Anyway, like most digital books, the result is totally unreadable because reading intensively on a screen is wearisome. It was, though, an astute feature allowing one to get rather complete documentation. For instance I am an amateur in history, and it was fun to click on all the hyperlinks and create a small book. Still I am not entirely convinced by your answer. I fail to understand why it is so complicated in code writing to make a sequence of several hyperlinks. Maybe, though, there are aspects which I miss. But then why keep the feature altogether if you can't or won't fix it? Nobody will really miss it, or will they? 2A01:E35:2FF4:4170:F9:F876:BCF0:E72A (talk) 05:16, 30 March 2019 (UTC)Reply
The difficulty lies in performance issues of rendering the PDFs with the chosen method.
There are two reasons we've kept this up.
a) First, we initially hoped our new renderer would work for books as well. It was only when we reached the point of performance testing it on collections of articles that we had to realise this wouldn't work.
b) Now, PediaPress are working on a solution. Johan (WMF) (talk) 12:28, 30 March 2019 (UTC)Reply
For now you can use my hobby project, which I also work on in my spare time: http://mediawiki2latex-large.wmflabs.org/ Dirk Hünniger (talk) 07:10, 28 March 2019 (UTC)Reply

No download possible


Dear Sir or Madam,

The following error message appears when I try to download the PDF.

C:\Users\INSPIR~2\AppData\Local\Temp\OtwFWVVK.pdf.part could not be saved, because the source file could not be read.

Try again later, or contact the server administrator.

Any advice?


2003:D8:B3DD:4500:5054:1DAC:A895:20F6 (talk) 08:29, 9 March 2019 (UTC)Reply

Try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 09:24, 9 March 2019 (UTC)Reply
Thank you very much for this article, it helped me a lot with my presentation, thanks a million! 109.214.160.197 (talk) 17:57, 9 March 2019 (UTC)Reply

computer available


My brother just sold me his old gaming machine. It scores about 7000 PassMark points and has 4 cores and max. 32 GByte of RAM, and it's silent enough that I could run it 24 hours a day. mediawiki2latex currently needs 12 GByte for a 5000-page book. So it could do 3 or 4 books in parallel. So if you need to convert anything or want another mediawiki2latex server to be made available, just let me know. Yours Dirk Dirk Hünniger (talk) 14:02, 10 March 2019 (UTC)Reply

Thank you Dirk for your work and dedication. Br shadow (talk) 04:52, 29 April 2019 (UTC)Reply

Python 2 end of life 1st January 2020


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


see https://python3statement.org/

This is 9 months from now. Please make sure that the new renderer and all required libraries work with Python 3. In particular, make sure this also holds for mwlib, or ensure that mwlib is not used as part of the rendering process.

Yours Dirk Dirk Hünniger (talk) 20:58, 14 March 2019 (UTC)Reply

This goes beyond my familiarity with the technical specs and planned future work, but I've pinged folks to give you a proper answer. Johan (WMF) (talk) 20:09, 15 March 2019 (UTC)Reply
Hi,
I have now been informed about what is going on. Indeed mwlib is not involved. The new renderer is going to be Proton: https://www.mediawiki.org/wiki/Proton. It would be nice if someone with better English skills than me could write that as an update on the wiki page this discussion page belongs to (that is, the page you see when you click on the page tab at the top of this discussion page).
Yours Dirk Dirk Hünniger (talk) 20:01, 17 March 2019 (UTC)Reply
For anyone else following the conversation:
https://lists.wikimedia.org/pipermail/wikitech-ambassadors/2019-March/002101.html
https://lists.wikimedia.org/pipermail/wikitech-ambassadors/2019-March/002103.html
I'll take a look at the wiki page (but not a late Sunday evening). Johan (WMF) (talk) 22:45, 17 March 2019 (UTC)Reply
@OVasileva (WMF) is writing a new update. Johan (WMF) (talk) 16:09, 18 March 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

pdf download isn't working!


pdf download isn't working!


Efes34 (talk) 00:25, 15 March 2019 (UTC)Reply

try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 07:01, 15 March 2019 (UTC)Reply
While MediaWiki2Latex is a pretty cool solution (kudos) it does not address the issue reported here. It however appears to be a great alternative to what WMF should probably provide as it used to in the past. Keeping fingers crossed for a working solution.
Anyways I have tried it and it tells me that the conversion completed but it does not allow me to retrieve the PDF document. So for me this alternative was a dead end, too. [[kgh]] (talk) 08:57, 15 March 2019 (UTC)Reply
Ah probably a Firefox issue. I just tried with Chrome and here I am being presented with a file. [[kgh]] (talk) 09:11, 15 March 2019 (UTC)Reply
yes you have to consult the documentation of Firefox. As soon as the pdf has been downloaded the down pointing arrow in the upper right corner turns light blue. You have to click on that arrow in order to access the downloaded pdf file. You are already the second user reporting this issue to me, but I think I am the wrong person to fix it, and it should be solved by the Firefox team.
Yours Dirk Dirk Hünniger (talk) 17:50, 15 March 2019 (UTC)Reply
Hi,
as soon as the conversion is finished the following message is displayed.
"Conversion Finished. Click on the arrow in the right upper corner of your browser in order to view the result."
This has been implemented and deployed on the servers.
Yours Dirk Dirk Hünniger (talk) 18:24, 15 March 2019 (UTC)Reply
Indeed. Firefox silently downloads here. Somehow I have the feeling that it is different from other downloads with Firefox.
> "Conversion Finished. Click on the arrow in the right upper corner of your browser in order to view the result."
Yeah, this will help a lot! [[kgh]] (talk) 22:41, 15 March 2019 (UTC)Reply

new PediaPress PDF renderer


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Hi,

does anybody know where the source code repository of the new PediaPress open source PDF renderer is hosted? I would like to take a look at the source code.

Yours Dirk Dirk Hünniger (talk) 10:49, 6 April 2019 (UTC)Reply

The PediaPress renderer is based on PrinceXML and not Open Source. Ckepper (talk) 22:19, 11 April 2019 (UTC)Reply
Thank you. It's good to know that it is closed source. Dirk Hünniger (talk) 06:08, 12 April 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


In the March 18, 2019 announcement it is stated: "We're getting close to the deployment of our new renderer, Proton, with only a few tasks remaining" -- that link points to the science article about the subatomic particles called protons. There does not appear to be an actual WP article about the "new renderer, Proton", so to avoid confusion, that "renderer, Proton" article should be written and/or the link corrected to point to the appropriate article (or removed if not applicable). Thank you! 50.32.142.125 (talk) 23:49, 8 April 2019 (UTC)Reply

The link should be to this wiki, as Proton. Steelpillow (talk) 10:03, 9 April 2019 (UTC)Reply
Probably not worth changing old announcements retroactively. Tgr (WMF) (talk) 10:31, 8 June 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

References, Citations, Notes blocked


The downloaded PDF version is fine, but the page itself has crap blocks that prevent direct reading and/or copying. To work around this it is possible to edit the HTML code, but that would be an unnecessary waste of time. Please just remove this detrimental blocking (in the particular and fundamental sections "References, Citations, Notes") from the HTML source code of the article pages. 187.119.227.134 (talk) 17:10, 9 April 2019 (UTC)Reply

:I mean, this crap blocking in the source code of the articles was introduced fairly recently. This didn't happen before.

187.119.237.33 (talk) 17:15, 9 April 2019 (UTC)Reply
Can you please link to a specific page, when you talk about things you observe ? —TheDJ (Not WMF) (talkcontribs) 11:01, 10 April 2019 (UTC)Reply
Hello "TheDJ".  (@187...) is speaking of any article. That is, it seems that all the articles (or any article if you prefer) present the issue that he explained above. To fix it, it is necessary to remove from the HTML code the blocking function placed on these sections. By the way, the "Further reading" and "External Links" sections are blocked too. Indeed the body of the articles is OK, but every section below the body (all of them) behaves like a picture (i.e. a graphic image) when copied, instead of showing "byte-characters" as it should. 177.79.27.126 (talk) 20:25, 10 April 2019 (UTC)Reply
OK, for me using this article resulting in this pdf, opened using the Preview PDF application on Mac OS X 10.14.3 I can copy paste the text in References and External links just fine.
Please read this and try again. —TheDJ (Not WMF) (talkcontribs) 20:31, 10 April 2019 (UTC)Reply
: Perhaps this subject needs some clarification. The problem pointed out is not really about the PDF available from the link in the articles. The wiki PDF is more or less okay; nevertheless a number of apps that convert the wiki articles in pdf now feature the problems detailed above. Try using them on your Android system or PC and the described issue will be there. This is a downgrade in the articles' quality that shouldn't occur, and didn't until a few weeks ago. But I didn't test them on iPhones or Macs, so I can't tell about these. I'd suggest carefully re-reading the above users' notices as well as the suggestions made.
:Having said this, we can now comment on another issue. The pdf created by now in by Wikipedia, shows several text characters merged, in particular those displayed in the sections below. I can't tell if this juxtaposition is present for all articles, but apparently it is for most of them. Liongreet91 (talk) 14:57, 11 April 2019 (UTC)Reply
“a number of apps that convert the wiki articles in pdf now” ok so the problem is not with the software that this talk page discusses ? have you tried contacting the developers of “a number of apps”.. ? —TheDJ (Not WMF) (talkcontribs) 19:02, 11 April 2019 (UTC)Reply
> The pdf created by now in by Wikipedia, shows several text characters merged,
This is a known issue with the Electron renderer (or rather a specific version of Chrome which Electron uses). Should hopefully be fixed in the Proton release which is coming up 'real soon now' —TheDJ (Not WMF) (talkcontribs) 19:13, 11 April 2019 (UTC)Reply
https://phabricator.wikimedia.org/T220648#5105930 has the latest definition of Real Soon Now(tm). Johan (WMF) (talk) 01:18, 12 April 2019 (UTC)Reply

WTFlip?


This is getting truly ridiculous. I am embarrassed for this Foundation from Jimbo down. Once again we have heard absolutely nothing - nothing - save a few false promises of announcements that never get made, of deadlines that are as believable as the flying spaghetti monster and pass into history quicker than a mayfly in summer. What is going on with this PDF book renderer, then? Where is the publicly accessible repos of this supposedly open-source code? Where is the opportunity for other developers to do the open-source thing and help the code along? Will somebody please, please, please tell us WTFlip! Steelpillow (talk) 06:39, 10 April 2019 (UTC)Reply

LMGTFY: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/chromium-render/+/master Jdforrester (WMF) (talk) 17:06, 10 April 2019 (UTC)Reply
Thanks for the kind thought but the Chromium solution has been abandoned for books (being kept only for articles) in favour of the PediaPress solution. If you can find a link for that, you will indeed be my hero. Steelpillow (talk) 18:16, 10 April 2019 (UTC)Reply
That's interesting, and the first I've heard of this. Can you provide a link? Jdforrester (WMF) (talk) 18:59, 10 April 2019 (UTC)Reply
https://www.mediawiki.org/wiki/Reading/Web/PDF_Functionality#Update_on_books,_April_2018 TheDJ (Not WMF) (talkcontribs) 20:35, 10 April 2019 (UTC)Reply
Oh, curious. Jdforrester (WMF) (talk) 20:43, 10 April 2019 (UTC)Reply
As detailed on the page linked above, a new open source PDF renderer is going to be provided by PediaPress. I would like to look at the source code to see if there is a dependency on the mwlib library (which is also developed by PediaPress). Such a dependency would cause a stoppage of security updates due to the decommissioning of Python 2 on 1st January 2020, rendering the system undeployable. Dirk Hünniger (talk) 20:49, 10 April 2019 (UTC)Reply
Last we spoke to PediaPress, they were aiming for end of the calendar year. I've pinged them in email to let them know questions are being asked about the books-to-PDF functionality. Johan (WMF) (talk) 14:23, 11 April 2019 (UTC)Reply
Thank you Johan. However this is not my first request, or your first acknowledgement, since the end of that calendar year. Could you also ask them to give details of the repos for their new renderer, so that we can confirm it is open source and can see what is going on without pestering the developer unduly? Dirk has already asked in another thread below here, but has not been answered. Steelpillow (talk) 14:42, 11 April 2019 (UTC)Reply
Yes, I mentioned that too. Johan (WMF) (talk) 15:35, 11 April 2019 (UTC)Reply
But I'd like to remind everyone that if you've got a beef with how long this is taking, the only sensible target of that is our (the WMF) priorities in not assigning resources to this particular functionality, not volunteer developers.
(We think it makes sense, of course, looking at what we'd have had to defund otherwise, or we wouldn't have prioritised that way. But still.) Johan (WMF) (talk) 15:41, 11 April 2019 (UTC)Reply
My beef with the current developer is not the length of time, which they give freely, but the lack of visibility of what is supposed to be an open-source initiative. They really ought to be giving that visibility freely, too. Both periodic communications, even if only "Sorry, I've been busy elsewhere this quarter", and sight of the repository would help a lot. For example a developer going quiet usually means a developer not making progress. And sight of the architecture is pretty darn important if the WMF want to de-risk yet a third fiasco in a row. Do you? Steelpillow (talk) 18:46, 11 April 2019 (UTC)Reply
I am not sure if I missed a previous post but I don't have any intention to be secretive on purpose. The past quarter was indeed really busy and the next two will most likely be as well. Nevertheless, I intend to continue and finish the project. Part of the project relies on 10+ year old PediaPress infrastructure that needs to be upgraded before doing the next steps. One of the servers was already upgraded two weeks ago but we need to setup an additional new render server for the project and stabilize the whole render process. This needs time for investigation and fixing and I don't know when I will find this time. I am sorry about the delays. Ckepper (talk) 22:29, 11 April 2019 (UTC)Reply
@ckepper Thank you for the update. Is the code that you have written open-source? Is the repository accessible to third parties? If not, are there any plans for that? Steelpillow (talk) 07:42, 12 April 2019 (UTC)Reply
No, the code is not open source. Since we are using a commercial rendering component and plan to run the service on our own infrastructure, we didn't really see a need to open source it. But this might change of course - if people have a long-term interest in contributing to the project or if PediaPress could no longer operate it...
Also, PediaPress potentially might offer customized PDF rendering for non-WMF projects (think enterprise wikis) as a commercial product. Open sourcing this project would expose many of our ideas and eventually make us lose our edge in this field. However, open source projects in particular often work very successfully with such a model. As you can hear, this is not an easy decision for me and I have to think about it.
What are your reasons for asking to open source the project? Ckepper (talk) 09:16, 12 April 2019 (UTC)Reply
This on the home page of the Wikimedia Foundation: "From site reliability to machine learning, our open-source technology makes Wikipedia faster, more reliable, and more accessible worldwide." and on the linked Technology page: "We keep Wikimedia projects fast, reliable, and available to all." I do not understand how this vow can be honoured if the book creation function is closed source.
Also of course, if you guys shut up shop for any reason (these things happen), then who is to maintain our book renderer, and how?
Perhaps one of our WMF participants such as Jdforrester (WMF) or Johan (WMF) could answer these points? Steelpillow (talk) 12:46, 12 April 2019 (UTC)Reply
I'm entirely uninvolved in this project. You showed up spreading profanity and I tried to help you, but I failed because you hadn't explained your concern and I was answering the wrong question. Sorry. Jdforrester (WMF) (talk) 17:54, 12 April 2019 (UTC)Reply
It seems that you have no intention of clarifying the WMF's position. Let us hope that somebody a little more civil will be able to. Steelpillow (talk) 19:49, 12 April 2019 (UTC)Reply
@Steelpillow, as @Jdforrester (WMF) says, he doesn't work on this project, and that means he's got about the same level of access to information as you have. He's not obfuscating. We're not a monolith. There's no internal collection of decisions.
I've pinged the person best suited to reply to you and pointed them to this thread. Johan (WMF) (talk) 11:35, 16 April 2019 (UTC)Reply
See this comment. Johan (WMF) (talk) 14:40, 17 April 2019 (UTC)Reply
Hi,
to my understanding the task of creating PDF versions (as well as a few other file formats) of Wikipedia Books is currently solved by mediawiki2latex. It is currently functional; it is open source and only depends on open source components, which can be seen from the fact that it is part of Debian; it may be used online, but may also be installed locally on the most common operating systems; and it can easily be verified by looking at the source code that it does not depend on Python 2.
Furthermore it is not clear whether or not a functional alternative will be developed, if it is going to be open sourced, if it will be possible to install it locally, if it will depend on any non open source components, or if it will depend on Python 2, since the source code is currently not available.
So to me it is clear that it is necessary for me to keep on developing mediawiki2latex, although I know about other things I could do in my free time.
Yours Dirk Dirk Hünniger (talk) 17:32, 12 April 2019 (UTC)Reply
Thank you Dirk. When we consider that your Haskell implementation was refused due to the suggested problem of finding programmers able to provide alternative support, the adoption by WMF of a closed-source core for which alternative support is genuinely impossible, becomes less easy to understand. Could the WMF please explain their thinking on this a little more fully than they have done in the past? Steelpillow (talk) 17:38, 12 April 2019 (UTC)Reply
Causation and correlation. Both are expensive for the foundation. One just a little more so.
I think that is pretty much the point here. Rendering HTML to books at the level of quality our community would expect is just not something the foundation wishes to spend its money on. That's a valid POV, even if a few people disagree with it. Considering PediaPress also doesn't seem too keen to throw lots of money at it again, that isn't totally crazy. —TheDJ (Not WMF) (talkcontribs) 10:25, 13 April 2019 (UTC)Reply
OK, so we now know that the books project is on a very slow train to a proprietary solution that cannot be maintained by the community, and the WMF are happier with that than any other option open to them.
I am surprised to find that the maintainability issue, which killed both the old OCG and adoption of Dirk's Haskell solution, is no longer seen as an issue worth addressing. Thank you for making that plain.
My thanks too to all those who have given constructive replies to the questions I raised. Steelpillow (talk) 17:20, 13 April 2019 (UTC)Reply
"the WMF are happier with that than any other option open to them" well, happy is overstating it, I think. The foundation has limitations and needs to make choices. See also why they halted much of the work going into Maps and Graphs at some point. There is considerable over-asking and an eternal backlog of problems and wishes that could be worked on. To handle all of it, you could probably hire 2,500 people and still only barely keep up. Contrast this with the fact that people already complain that the WMF spends too much money on developers and programs, and you have identified the primary problem from my point of view. —TheDJ (Not WMF) (talkcontribs) 13:59, 17 April 2019 (UTC)Reply
@Ckepper, hey listen. It's important you try your best. Thank you for the good work you do, and please don't be discouraged if people get frustrated. We're all only human.
I appreciate this response! :) –MJLTalk 01:04, 12 April 2019 (UTC)Reply

When will this be ready?


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I'm just wondering if anyone knows about when this will be ready for use on Wikipedia? ARZ100 (talk) 17:36, 10 April 2019 (UTC)Reply

Mañana Steelpillow (talk) 18:17, 10 April 2019 (UTC)Reply
Mañana? Google translate says that means "morning" in spanish. If that is true what do you mean by "morning"? ARZ100 (talk) 18:26, 10 April 2019 (UTC)Reply
"Mañana" is a traditional reply when meaning to imply "I have no idea but probably not for a long time, if ever." Steelpillow (talk) 21:08, 10 April 2019 (UTC)Reply
is a what? ARZ100 (talk) 21:10, 10 April 2019 (UTC)Reply
See above.
See also https://www.mediawiki.org/w/index.php?title=Talk%3AReading/Web/PDF%20Functionality/2019#h-WTFlip%3F-2019-04-10T06%3A39%3A00.000Z Steelpillow (talk) 21:13, 10 April 2019 (UTC)Reply
Depends on whether you mean the book-to-PDF function or the single-page PDF renderer, @ARZ100. The book renderer? Depends entirely on the volunteer developers who took over. The Foundation looked at number of daily downloads and pretty much decided that it couldn't defend assigning the resources necessary at the expense of other projects when the solution for single-page PDFs didn't work, which meant that PediaPress took over, and they're handling it in their spare time. The single-page PDF renderer? That's a different thing. Johan (WMF) (talk) 22:16, 10 April 2019 (UTC)Reply
Ok thanks @Steelpillow and @Johan (WMF) ARZ100 (talk) 22:24, 10 April 2019 (UTC)Reply
@ARZ100, honestly, I have told you nothing of substance. But are you interested in the single-page PDF renderer (that is, just for one article) or the books-to-PDF renderer (that is, when you take more articles and put them together and then make a PDF out of them)? Johan (WMF) (talk) 23:05, 10 April 2019 (UTC)Reply
@Johan (WMF) I am not @ARZ100, but I will say that I am exclusively interested in the books-to-PDF renderer. I've assumed watchlisting Reading/Web/PDF Functionality would be the best way to get updates on that. Am I right about that? –MJLTalk 01:11, 11 April 2019 (UTC)Reply
@Johan (WMF) you have answered my question. I don't need any more information. ARZ100 (talk) 02:23, 11 April 2019 (UTC)Reply
@MJL Yes. Johan (WMF) (talk) 11:44, 11 April 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

The benefits to PediaPress of Going Open Source


@Ckepper asked about the potential benefits of PediaPress going open source with this project in this thread. I wanted to give them some good takeaways to bring back to their company as it relates to this specific project.

Commercialization of a given project is an important reason to want to keep it closed source. However, I believe it would impede the success of the renderer long term were it to remain closed source. As things stand, I do not think it will be as successful as Extension:Collection because, for starters, it would not be free to install for most. Small wikis would not be able to afford much in licensing. Open source also instills a kind of trust that any large company, nonprofit, or single individual can rely on. It shows you are so confident with your product that you will extend that to showing it for the world to see in its most basic form: code.

On a different note, as is reported on the company's website, "[you] offer consulting, customization, and support for advanced document transformation solutions." This is nothing small right here, and I am confident in that business model. If, however, you believe otherwise, there are still ways to protect copyright without going closed source. In this case, I would look to Chromium for guidance on what potential path you can take. Not every one of your ideas needs to be included in an open source repository, so you can still keep the parts you want secret or just to yourselves.

The principal rendering service should, however, be available to the public to do bug tests and the like. It's a win-win. Consulting and customization are where the real money is anyways. You could also branch into hosting this rendering service for others similar to how you already offer print-on-demand books to any mediawiki-wiki. Wikis will always need to pay for this if they want the product beyond what is already offered out there.

Finally, it is a strong selling point for a company with such strong ties to the open-source movement! I hope this helps you make the right decision on this matter. –MJLTalk 03:45, 13 April 2019 (UTC)Reply

Thank you for your comment. After talking with colleagues and other stakeholders, we have made the decision to release mwlib.html as open source when the project is sufficiently mature. This should help to ensure its long-term viability.
Also, I enabled the new render server so that rendering on https://pediapress.com/collector should work again (and be more stable). Ckepper (talk) 10:01, 17 April 2019 (UTC)Reply
That's awesome news! Major thanks goes out to your organisation for its willingness to do that. If there is anything you all need from the community (like press releases*, bug testing, etc.) please reach out! I just tried the collector on Simple:Spooky Scary Skeletons, and I think it really looks great!! Very elegant! :D
*I run Wikisource News (en) now, so I can help with publishing and writing it! –MJLTalk 15:21, 17 April 2019 (UTC)Reply
I have added Wikibooks (en) and Wikisource (en) to the test renderer. The output is still far from perfect but PediaPress was never able to generate PDFs from those sites before. Ckepper (talk) 16:18, 17 April 2019 (UTC)Reply
Hi @Ckepper, is it possible for you to add Wikipedia (ar), Wikibooks (ar) and Wikisource (ar). It's going to be a good test for right-to-left issues.

Helmoony (talk) 10:56, 27 April 2019 (UTC)Reply
Hi @Helmoony, I have added Wikipedia (ar). A few years ago (for Wikimania Haifa 2011) we created an RTL export with our old PDF renderer and that was really painful, especially since no one on our team knew Hebrew. You can start playing around with the export, but this is definitely not a priority for us right now. Ckepper (talk) 19:15, 30 April 2019 (UTC)Reply
Thank you Ckepper, I tested the version, it's not working great. When it doesn't show "Failed to load PDF document.", errors are mainly: text format should start from right, wikidata-based infoboxes are not showing wikidata data including OSM-based map, some terms need to be translated (e.g. Image Sources, Licenses and Contributors). But at least we know what we need to do now. Helmoony (talk) 12:54, 5 May 2019 (UTC)Reply
The render worked for me just now for arwiki. There are still some RTL issues, but I didn't see the "Failed to load" issue.
Is the source available so that we can contribute? MarkAHershberger(talk) 22:31, 30 October 2019 (UTC)Reply
Not yet, I'd like to clean it up a little bit before making it available. Ckepper (talk) 22:32, 30 October 2019 (UTC)Reply
I hope you can release it soon. The book functionality is needed! Thank you for your quick reply! MarkAHershberger(talk) 22:38, 30 October 2019 (UTC)Reply
I hear you. Maybe I don't do the full cleanup to publish it sooner. Ckepper (talk) 22:40, 30 October 2019 (UTC)Reply
That would be awesome! Ugly code that works is better than no code. MarkAHershberger(talk) 22:42, 30 October 2019 (UTC)Reply
There are, among those of us who use MediaWiki to run KM systems outside of Wikipedia, some absolutely essential extensions whose code is hideous.
I'm glad you want clean code, but I would hope that you can release the code as soon as possible and then clean up the code later. MarkAHershberger(talk) 22:49, 30 October 2019 (UTC)Reply
Yes, absolutely. A buggy alpha release v0.01 is better than no release at all. Thank you so much for keeping on with this work. Steelpillow (talk) 08:30, 31 October 2019 (UTC)Reply
@Steelpillow, agree completely. Is there any progress? The current situation as of April 11, 2020, on opening Book Creator, is: "Due to severe issues with our existing system, the Book Creator will no longer support saving a book as a PDF." A collaborative work "will always remain freely distributable and reproducible" only if I can export it into another free file format, such as the most common book formats PDF or ODT. Charis (talk) 12:50, 11 April 2020 (UTC)Reply
@Charis I have not been following progress lately. There is a test server at https://pediapress.com/collector/ which you can try. Otherwise, Ckepper is the best one to ask, as they have been the voice of PediaPress here. Steelpillow (talk) 16:24, 11 April 2020 (UTC)Reply
I also posted this on Extension talk:Collection but the failure page when trying to Download to PDF on my wiki lands here, so cross-posting.
Our wiki is running on MediaWiki 1.31.7 and using Collection 1.7.0 (af3a0b8) 14:23, 15 April 2018. The Download as PDF is constantly failing and directing the user to Reading/Web/PDF Functionality, which doesn't specifically address the reason for the "Book rendering failed". Reading through Talk:Reading/Web/PDF Functionality doesn't clear up the situation much either. It does seem to indicate there is a new render server available at https://pediapress.com/collector but that doesn't seem to work for non-Wikipedia sites. The existing render server https://tools.pediapress.com/mw-serve/ does still seem to be active.
Is the functionality via this extension dead for low traffic sites that don't need or cannot install (i.e. shared hosting) their own PDF server? Peculiar Investor (talk) 16:09, 1 May 2020 (UTC)Reply
I got my mediawiki2latex package into Ubuntu 20.04 (GPL). PDF generation seems to work fine. Furthermore I have my own rendering server, which also works with non-Wikimedia sites.
https://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 14:08, 3 May 2020 (UTC)Reply
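For anyone who would rather run the packaged tool locally than use the web server, here is a minimal sketch in Python that only assembles the command line quoted earlier on this page (`mediawiki2latex -u ... -o out.pdf -k`). The helper name `build_mediawiki2latex_command` is hypothetical, and the `-k` flag is simply copied from that earlier command as it was used for a book; please check the mediawiki2latex documentation before relying on it for other kinds of pages.

```python
import shlex


def build_mediawiki2latex_command(page_url, output_pdf="out.pdf"):
    """Return the argument list for a local mediawiki2latex run.

    Hypothetical helper: it mirrors the invocation quoted earlier on this
    page and does not start the conversion itself.
    """
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("page_url must be an absolute http(s) URL")
    # -u: URL of the wiki page or book, -o: output file.
    # -k is copied unchanged from the quoted command (assumption: it is
    # the switch wanted for Book: pages).
    return ["mediawiki2latex", "-u", page_url, "-o", output_pdf, "-k"]


if __name__ == "__main__":
    cmd = build_mediawiki2latex_command(
        "https://en.wikipedia.org/wiki/Book:River_martin"
    )
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the package is installed; keeping the arguments as a list avoids shell quoting problems with unusual page titles.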
The last LTS version of MediaWiki that supported the Collection extension was 1.30.x. According to https://www.mediawiki.org/wiki/Version_lifecycle it has been decommissioned. WMF have stopped any development of the Collection extension according to https://www.mediawiki.org/w/index.php?title=Talk%3AReading/Web/PDF%20Functionality/2019#c-Steelpillow-2019-04-13T17%3A20%3A00.000Z-Ckepper-2019-04-11T22%3A29%3A00.000Z Dirk Hünniger (talk) 14:18, 3 May 2020 (UTC)Reply
I'm still confused, sorry, because that doesn't seem to agree with Extension:Collection which shows
MediaWiki 1.34+
as does Special:Version both here and on Wikipedia, both of which are running on
MediaWiki 1.35.0-wmf.30 (6d5d990)

12:06, 4 May 2020

Reading between this discussion and the Extension:Collection and its associated talk page doesn't help clarify the status of the extension but more importantly whether there is a render server that low traffic wiki sites can use so that the Download to PDF functionality works. Peculiar Investor (talk) 21:33, 6 May 2020 (UTC)Reply
As ever, there is confusion between the collection extension or Book Creator and the rendering service. The old rendering service, the Offline Content Generator, has been pulled and the promised PediaPress replacement interminably delayed. Development of the collection extension/Book Creator also stopped, but it remains in use. It still generates a trickle of bug reports and issues, so periodically gets looked at to see if anything can be fixed. But this is pure volunteer effort and there seem to be no low-hanging fruit any more. Hope this helps. Steelpillow (talk) 06:10, 7 May 2020 (UTC)Reply
what else is there
pandoc: also GPL but might require some lua or haskell programmer to make it work for your case
bluespice: from 2900 EUR per year. Dirk Hünniger (talk) 14:33, 3 May 2020 (UTC)Reply

Long times or timeout with dl as pdf...


I've complained often that dl as pdf either times out or takes up to 5 minutes from the time I click dl as pdf to the save dialog appearing. It's doing it again.

It's great that you folks are working on books and so on but why not fix - once and for all - the basic dl as pdf so it works quickly and it works every time.

As a matter of priorities it seems to me a bugless "dl as pdf" should be at the top of the list. If this were a commercial site it would be. If Amazon had a bug in such a basic functionality Walmart would get a big bump in business... 68.98.170.156 (talk) 13:49, 26 April 2019 (UTC)Reply

How come it doesn't work


How come it doesn't work? 2806:10AE:9:911F:DC71:B46F:825E:2C66 (talk) 16:32, 4 May 2019 (UTC)Reply

What doesn't work? Johan (WMF) (talk) 17:04, 4 May 2019 (UTC)Reply

Ready for single-page PDF Render Function


I am very interested in the single-page PDF render function!! This is of great importance to me and several of my friends. We have been reading about this and trying the button offered on the main pages. I love Open Source !! I have been using Open Source operating systems and tools since 1995. I helped the "fight" for the opening-up of what is now FireFox. This is directed toward PediaPress: There are many advantages to going Open Source. I know RedHat has made a LOT of money, even before the $34 billion IBM purchase. ChEbama87 (talk) 22:59, 5 May 2019 (UTC)Reply

I have found that my Firefox browser makes a better job of rendering single articles than the current Wikipedia renderer. The content is *exactly* the same, with the extraneous wrapper and other in-page unprintables stripped out, but it is more cleanly laid out by Firefox. (Be warned that at the time of writing, many Firefox add-ons have been disabled for the last couple of days, while a fix for a security certificate blunder is sought).
I begin to wonder whether adopting and supporting their rendering engine might not be a better route than constantly reinventing unsatisfactory ones. Steelpillow (talk) 09:35, 6 May 2019 (UTC)Reply
yes, might be a good idea. Especially since it is an open source one and the Mozilla foundation tries to keep it open source. But if you prefer something commercial you could also use http://www.unipublishing.com/wb2pdf/ . But of course it is much more fun if you develop one yourself. Dirk Hünniger (talk) 18:23, 6 May 2019 (UTC)Reply

Nothing


Nothing downloads for me, what a scam. Moreover, when it "downloads", an image of an "i" appears and the information does not. I am going to report you, and I am not saying that as a joke. 85.155.60.126 (talk) 16:25, 16 May 2019 (UTC)Reply

I'm not really sure what you refer to here, I'm afraid. Johan (WMF) (talk) 00:57, 17 May 2019 (UTC)Reply
try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 11:46, 18 May 2019 (UTC)Reply

Maps with Pins not printed


Normally in each description there is a map with multi-coloured pins, which point to the sights in the city / country.

This map is not printed in the PDFs, only a white (empty) rectangle where the map should be drawn is visible.

It would be great if this map can be also shown in the PDF. - Thanks a lot!

Example: https://en.wikivoyage.org/wiki/London/City_of_London in chapter "Get around" 2001:7F0:400C:0:0:0:0:5 (talk) 15:42, 20 May 2019 (UTC)Reply

The map in question uses a bunch of templates called Template:mapshape and suchlike. There is an [Edit GTX] link which pops open a load of XML. I have no idea how it all works.
I can get a PDF with the mapful of pins if I: 1) enable javascript for the site and 2) print to file direct from the Firefox menu. But if I either disable javascript or try to use the Wikivoyage download, both leave an empty rectangle. Certainly, the real-soon-now renderer should play nicely if we have javascript enabled and frankly it ought to if we don't. (But please, don't tell PediaPress about this, we don't want more delays to the basic book service). Steelpillow (talk) 17:04, 20 May 2019 (UTC)Reply
Thanks for reporting! Johan (WMF) (talk) 23:04, 20 May 2019 (UTC)Reply

Incorrect format in pdf


I downloaded 'softmax function' as a pdf file. A lot of the mathematical equations were not displayed properly - I had to abandon the file. First time for me. 88.18.153.101 (talk) 14:44, 28 May 2019 (UTC)Reply

try http://mediawiki2latex.wmflabs.org/
It will take a few minutes due to high load on the server, but the formulas will come out correctly. Dirk Hünniger (talk) 15:50, 28 May 2019 (UTC)Reply
Presumably you mean Softmax function. Can you be more specific about the error? The PDF is a bit ugly (the formula font is too heavy) but I don't see anything incorrect. (It would be surprising as Proton uses Chromium's rendering functionality so it is very rare for something to look differently from how it looks in the browser.) Tgr (WMF) (talk) 10:36, 8 June 2019 (UTC)Reply

No way to download historical version


There is no way to download historical version for PDF.

That is to say, only current version of pedia is permitted to download. 45.125.2.1 (talk) 13:07, 30 May 2019 (UTC)Reply

Yes, Haskell can do that. You can just fill in a URL to the old version like https://en.wikipedia.org/w/index.php?title=Homomorphism&oldid=256679 into the web form on http://mediawiki2latex.wmflabs.org/ . I never thought that anybody might need such a feature so I didn't implement it, but it seems that the Haskell compiler generated the code by itself when trying to cover the most general case. Dirk Hünniger (talk) 13:29, 30 May 2019 (UTC)Reply
(Noted.) Johan (WMF) (talk) 04:20, 31 May 2019 (UTC)Reply
The related task is T213369. Tgr (WMF) (talk) 10:29, 8 June 2019 (UTC)Reply
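For anyone scripting this, here is a minimal sketch of how an old-revision URL of the kind quoted above can be assembled. The function name `revision_url` and its parameters are made up for illustration; only the `title` and `oldid` query parameters and the `/w/index.php` path come from the example URL in this thread.

```python
from urllib.parse import urlencode


def revision_url(title: str, oldid: int, host: str = "en.wikipedia.org") -> str:
    """Build the index.php URL of one specific (historical) revision.

    Hypothetical helper: it only mirrors the URL pattern quoted in the
    thread (title=...&oldid=...), nothing more.
    """
    # Page titles use underscores instead of spaces in URLs.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": oldid})
    return f"https://{host}/w/index.php?{query}"


if __name__ == "__main__":
    # Prints the same URL as the example in the comment above.
    print(revision_url("Homomorphism", 256679))
```

The resulting string is what would be pasted into the web form mentioned above; whether a given renderer accepts it is up to that renderer.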

Downgrade instead of upgrade?


This has been taking a very, very long time now. Wouldn't it be better to revert to the old, working version? Tilio (talk) 00:28, 3 June 2019 (UTC)Reply

Unfortunately there is no old, working version. We need a new version because the old one no longer works. Johan (WMF) (talk) 15:08, 3 June 2019 (UTC)Reply
There is, after all, the work of a cattle keeper: http://mediawiki2latex.wmflabs.org/ . Of course it would be nice if at some point there were official software again that could solve this problem. I fear that software cannot really be developed well for money. That does not work with mathematics either. I cannot commission a company to prove Fermat's theorem or to solve the P=NP problem. In that spirit, continued success! Dirk Hünniger (talk) 17:04, 3 June 2019 (UTC)Reply
Thanks for the information. Tilio (talk) 21:45, 3 June 2019 (UTC)Reply

Download as PDF - not working


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Waiting takes forever and still no PDF downloads. Tried again and again. 173.132.21.237 (talk) 02:53, 4 June 2019 (UTC)Reply

for now you can try http://mediawiki2latex.wmflabs.org/ Dirk Hünniger (talk) 06:04, 4 June 2019 (UTC)Reply
Have you tried printing the article to a pdf file in your web browser? Firefox gives better results than the Wiki tool and I have heard good things about Chromium/Chrome too. Steelpillow (talk) 09:23, 4 June 2019 (UTC)Reply
Do you have a specific page where you are experiencing this ? Because for me it works almost instantly on most pages.
Thanks for the feedback. —TheDJ (Not WMF) (talkcontribs) 10:28, 4 June 2019 (UTC)Reply
Can you provide any specifics? (Which wiki, which article, have you tried other articles?) Tgr (WMF) (talk) 10:30, 4 June 2019 (UTC)Reply
As others have pointed out elsewhere, this report was made a few hours before the switchover, and so presumably refers to some problem with the old PDF rendering service, Electron. Since we are not using it anymore, it's not actionable. Robustness (ie. avoiding situations where the service locks down and does not respond at all) was actually the main reason for the switch, so hopefully this is better now. Tgr (WMF) (talk) 11:24, 8 June 2019 (UTC)Reply
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

New renderer


The elusive new renderer for single-page PDFs has finally been deployed. Please tell us if you've got any problems with creating PDFs from single articles. Johan (WMF) (talk) 10:49, 4 June 2019 (UTC)Reply

Just did a few test articles and rendering went fine and fast. While waiting (for the last year) on pdf books, I've gone my own direction with PDFCrowd as the web page renderer, and my own Windows code for book assembly. Book assembly does not seem that big of a problem, but making it user-proof probably is. Is there (or will there be?) an API-class URL to invoke the pdf rendering on a single article? -- Howard HNRSoftware (talk) 13:53, 4 June 2019 (UTC)Reply
The API is GET /rest_v1/page/pdf/{title}, the service behind it is Proton. Tgr (WMF) (talk) 14:52, 4 June 2019 (UTC)Reply
Thanks for the link -- one of the key problems with the "interweb" is figuring out where to look for something. Howard HNRSoftware (talk) 15:01, 4 June 2019 (UTC)Reply
I have been trying to create a book with single articles and multiple articles but on each occasion the Download section is all greyed out, am I doing something wrong? 82.39.122.209 (talk) 15:33, 4 June 2019 (UTC)Reply
The book download was decommissioned. A replacement is hopefully in hand, but don't wait for it. Steelpillow (talk) 19:44, 4 June 2019 (UTC)Reply
But you can get PDF from the German federal institute of applied cow care
http://mediawiki2latex-large.wmflabs.org/ Dirk Hünniger (talk) 20:23, 4 June 2019 (UTC)Reply
Um... the rest api documentation does not show a "page/pdf/" entry or an entire URL. A few trial-and-errors of things like "https://en.wikipedia.org/rest_v1/page/pdf/A_E_van_Vogt" don't get me in the right direction. Other google searches seem to lead off in various odd directions. It is probably something stupid.... Sorry - Howard HNRSoftware (talk) 15:54, 4 June 2019 (UTC)Reply
sticking in "api/" seems to execute better- I think it is my problem now... H. HNRSoftware (talk) 16:06, 4 June 2019 (UTC)Reply
Yeah, sorry, forgot to put /api/ in there. The doclink does work for me, though. What browser are you using to look at it? Tgr (WMF) (talk) 08:26, 5 June 2019 (UTC)Reply
Sorry, the doc link works perfectly, I just didn't read it right.... This morning I got my retrieval working fine with
a url of "https://en.wikipedia.org/api/rest_v1/page/pdf/Robert_Heinlein"
typical retrieval time for each pdf of a selection of articles was 4-5 seconds. Perfectly fine.

HNRSoftware (talk) 12:55, 5 June 2019 (UTC)Reply
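For readers who want to script the retrieval described above, a minimal sketch follows. The endpoint path `/api/rest_v1/page/pdf/{title}` is the one given in this thread; the helper names `pdf_url` and `download_pdf`, the timeout, and the User-Agent string are assumptions made up for illustration, and nothing here is claimed to be how HNRSoftware's own tool works.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def pdf_url(title: str, host: str = "en.wikipedia.org") -> str:
    """Return the single-article PDF URL for a page title (hypothetical helper)."""
    # Titles use underscores instead of spaces; everything else is percent-encoded.
    normalized = title.strip().replace(" ", "_")
    return f"https://{host}/api/rest_v1/page/pdf/{quote(normalized, safe='')}"


def download_pdf(title: str, out_path: str, host: str = "en.wikipedia.org") -> None:
    """Fetch the rendered PDF and write it to out_path (requires network access)."""
    # The User-Agent value is a placeholder; replace it with your own contact details.
    request = Request(pdf_url(title, host), headers={"User-Agent": "pdf-example/0.1"})
    with urlopen(request, timeout=60) as response, open(out_path, "wb") as out_file:
        out_file.write(response.read())


if __name__ == "__main__":
    # Prints the URL quoted in the comment above.
    print(pdf_url("Robert Heinlein"))
```

Calling `download_pdf("Robert Heinlein", "Robert_Heinlein.pdf")` in a loop over a list of titles would reproduce the per-article retrieval described above; book assembly is a separate step not covered here.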

Funding needed?


Looking down this thread, I see complaints about lack of funding. How much is needed? and how can I contribute some so that it goes to this project and not just into WMF coffers? HNRSoftware (talk) 13:55, 4 June 2019 (UTC)Reply

I mean, "thank you" sounds so insufficient whenever I see posts like these.
But let's be honest here: it's not that the WMF lacks money but that it feels it needs to spend it on other things. That there are always things we'd like to do but we can't because our resources are limited.
But our resources are always going to be limited, and there's simply no way to donate money to areas that aren't prioritised, because there's an organisational cost to doing things, keeping track of things. The new single-page renderer was deployed today! So if that's what you're looking for, hopefully we've solved the main issues. If you're talking about books, my suggestions would be to get involved with the volunteer developers working on that and see how/if you can help them, but money to the WMF will go to general support for and development of the Wikimedia wikis. Johan (WMF) (talk) 14:21, 4 June 2019 (UTC)Reply
There are really two projects here.
The single-page pdf download has just been replaced by a new tool and is at least functional, though improvements would be nice.
The book rendering is being written on an unpaid volunteer basis by PediaPress. They are a commercial company and it remains unresolved as to how open the codebase will be. Progress has stopped for several months now. One way to move it forward might be to contact them direct and offer to pay for the work, however watch that licensing. Another way might be to support the WMF but they don't seem to care about the licensing, and as you can see targeting funds will be hard. Or you could contact Dirk Hunniger who has written an independent and open-source renderer of his own which appears to be functional but he has no funds to set up a sufficiently powerful server. Dirk will probably read this and reply.
I hope this helps. Steelpillow (talk) 14:27, 4 June 2019 (UTC)Reply
Kind-of what I had guessed. Every so often I contribute to WMF, just because it is a great concept and (eventually) the whole internet may work more cooperatively. I'm a retired software engineer and don't have serious web skills, so I doubt if I could contribute skill-wise to the pdf book project. One thing I do not see here is a design document for pdf functionality. Scanning down this thread, I can derive most of the decision points, but it is really not clear precisely what the vision is. Yes, wiki articles to pdfs and assembling pdfs into printable "books", but the varying styles of related articles will make it very difficult to integrate the articles into a smooth book without manual intervention at some point in the process.
My personal interest is consolidating SciFi author (of interest to me, not all possible authors) articles, and some other things like Raspberry Pi microcomputers and similar things. These would probably be better served with a bookmark organizer than a pdf book creator, although consolidating to pdf has a lot of attraction. One of the key significances to pdf articles is that the links are "live", and I can actually use the pdf as a starting point for further reading. HNRSoftware (talk) 14:54, 4 June 2019 (UTC)Reply
Well, I cannot accept any funding either. I am a government employee and not supposed to have any other income. If you want to set up a server for yourself, it's just as easy as installing Ubuntu and
sudo apt-get install mediawiki2latex
mediawiki2latex -s 80
The reason I don't set up a publicly reachable server is not so much a lack of funding but more the laws made by the government I am working for. There is quite a high risk that I spend the rest of my life in prison if I do so. The law is called Vorratsdatenspeicherung. Dirk Hünniger (talk) 16:58, 4 June 2019 (UTC)Reply
If you really got too much money and want to spend it on PDF Tool development, you could hire Henning Thielemann (PhD). He is an experienced Haskell freelancer and has worked on mediawiki2latex before. He recently told me, that he is still interested in working on it, provided that funding is available. Dirk Hünniger (talk) 17:55, 5 June 2019 (UTC)Reply
Sounds to me as though Henning could apply for a rapid grant to get a dedicated server to run book-to-pdf batches every month. Then if that works, he could apply for a normal project grant for extensions/expansion. Sj (talk) 16:57, 10 June 2019 (UTC)Reply

Proton workout


First, a huge thank you to everybody who stuck with it and made this happen.

Top-level note to take home: Firefox is still ahead but the gap is closing. A little more work and Proton can do better!

Summary: styles and hyperlinks need attention. Everything else looks good.

I looked at w:Wing configuration, w:Euler characteristic and w:Supermarine Spitfire prototype K5054. I did not test much in the way of weird boxouts - or overlong equations as editors seem to break them up on the page. I looked at the pages both in the new renderer and in my Firefox browser Print to file option, which goes via Cairo. Where I do not mention Firefox below, it did OK.

I saw no obvious kerning issues. We have nice tables! wa-hey! Basic infoboxes and templates generally render OK, except for certain diagrams - see below. Stuff that should be stripped out mostly is, except as below.

The page title and Contents and top-level headings are in serif font, subheadings are sans. The main paragraph text is in serif font but the bulleted and numbered lists, indented text and everything else except headings are in sans.

This makes a particular visual clash in the Spitfire article, where w:Template:Aircraft specs invokes w:Template:Big and then bolds it to create pseudo-subheadings for bulleted lists: the pseudo-subheadings are in serif and the lists in sans, it looks horrible.

The main text font is too small. Now I know coders here have a fondness for small fonts but while this size might be just OK for multi-column layout it is hopeless for the single-column layout Wikipedia uses. A balance has to be struck between that and the "12 to 14 pt for younger, older and less capable readers" brigade. We should not force extremes, we should be average. A slightly larger font would mean that the white space between lines could be reduced, leaving overall line spacing the same.

By contrast, indented text using : wikitext, as well as text in table cells, are both a bit too large.

Equations are rendered in heavy bold. This is very bad because bold fonts are used extensively in mathematics to distinguish things from symbols in standard weight. Everything in the pdf is such a mess that it is hard to tell and it looks awful anyway. Firefox has a similar issue, though individual details differ as to which is the worse.

Image captions are correctly styled per bullet lists but are colored gray, which I find mannered and unlikely to remain fashionable for long: see also below.

Internal links have been retained and underlined. I am sorry, I just do not get the value of downloading an article from Wikipedia in page-based format just to click back to the online version of the next article. And it has disadvantages:

1. It looks a real mess when printed.

2. It cannot work when offline and is pretty darn insane when online too, it confuses the heck out of my apps as well as me.

Oddly enough the Firefox PDF renderer (using Cairo) adds the same underlining but strips the links themselves out: the underlining totally loses its user value. Getting rid of the underlining altogether would be a good solution for both renderers.

Images remain with clickable links. These should be removed. An unintended consequence for the diagrams in the tables is that they all have a grey hyperlink "underlining" running across the middle, per the above. Oddly, the hyperlinked images in thumbnails do not have such a line.

References are in gray with the hyperlinks in black. At best this is the wrong way round, it looks weird again. It is more usual to color the hyperlinks, especially in academic materials which is what most downloads seem to be used for. Oddly, the stylistic gimmick is also seen in the Firefox output but please do not let that stop you from doing it properly.

Hope this helps. Steelpillow (talk) 19:37, 4 June 2019 (UTC)Reply

Disagree about your points regarding links. I definitely want to keep them. Anyway, with the exception of a few font issues (see bold math), most things print exactly the same for me using MacOS Print-to-PDF, as from this.
The fontsize is 13.3pt btw. but it seems that the renderer uses a dynamic viewport size, which seems to affect sizing more than the defined font-size. —TheDJ (Not WMF) (talkcontribs) 23:07, 4 June 2019 (UTC)Reply
If links must stay, please at least get rid of the "underlining" right across the middles of diagrams. I assume it is a transparent background to the diagram that lets the thing through. But you know, if you are clicking through to Wikipedia anyway then why not just read it on Wikipedia in the first place? What is your use case or user story here?
Oh, FFS the dancing font sizes! My Google Chrome on Android renders different paragraphs in the same text block or items in a list in different sizes, it's hideous. This is not quite as bad here but ISTR there is some provision for overriding it in CSS. If not then I have to say that design for mobile and design for A4 page are not the same thing, the wrong choice of rendering engine was made and Proton can never make the grade; Cairo would have been a better bet. But I'll try and find the time in a day or two to check out that CSS half-memory. Steelpillow (talk) 05:42, 5 June 2019 (UTC)Reply
> If links must stay, please at least get rid of the "underlining" right across the middles of diagrams.
Agreed: I filed https://phabricator.wikimedia.org/T225093 —TheDJ (Not WMF) (talkcontribs) 13:41, 5 June 2019 (UTC)Reply

Problem with the PDF download


Very simple: you just have to go to the "printable version" and ask for the text to be printed as a PDF. That works without any trouble. On a Mac, at any rate... 2A01:CB00:865C:AE00:B9:47D3:8F37:7153 (talk) 07:17, 11 June 2019 (UTC)Reply

-Extended periodic table (detailed cells)- download PDF not working well


Extended periodic table (detailed cells): https://en.wikipedia.org/wiki/Extended_periodic_table_(detailed_cells)

When I download the PDF, the periodic table is so large that it goes off the screen.

Is there any way to fix this? Andy181209 (talk) 11:50, 13 June 2019 (UTC)Reply

In a word, No. The table is ridiculously huge and cannot possibly be condensed into a meaningful page-based format.
Or, to put it another way, the table would have to be broken down at source into a multi-page friendly format first. Steelpillow (talk) 12:43, 13 June 2019 (UTC)Reply
I tried with http://mediawiki2latex.wmflabs.org/ and the result looks really funny. If you have the Okular PDF viewer you can zoom into the table at 1600 % and on a UHD screen you can actually read some elements, but it still prints over the margin of the page. So yes, I tried my best, but a satisfactory result seems unlikely to be achievable. Dirk Hünniger (talk) 20:42, 13 June 2019 (UTC)Reply

Thank you for this information


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I adore you 185.10.166.2 (talk) 13:07, 17 June 2019 (UTC)Reply

The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

Please install additional font on the pdf services for CJK characters.


Default font looks terrible. Source Han Sans + Source Han Serif would be a great option. All it takes is to install the font set. Viztor (talk) 10:51, 21 June 2019 (UTC)Reply

Tracked in T226633. Tgr (WMF) (talk) 12:22, 26 June 2019 (UTC)Reply

EPUB format for export


Hello, will it be possible to export to EPUB? Thanks 83.55.198.197 (talk) 09:57, 27 June 2019 (UTC)Reply

Right now, https://mediawiki2latex-large.wmflabs.org/ can do that for you. Johan (WMF) (talk) 13:35, 27 June 2019 (UTC)Reply

Rendering Collections fast


Hi,

I propose to use wkhtmltopdf to get PDFs fast. I wrote a Java program to demonstrate that:

https://sourceforge.net/p/wb2pdf/git/ci/master/tree/src/Tachyon.java

The demo runs on the following book:

https://en.wikipedia.org/wiki/Book:Ancient_Egyptian_Titulary

sudo apt-get install wkhtmltopdf
javac Tachyon.java
time java Tachyon
real 0m5.013s
user 0m3.509s
sys  0m0.146s
evince output.pdf

Yours Dirk Dirk Hünniger (talk) 18:01, 13 July 2019 (UTC)Reply

Looks good. I do not have time to test it myself, but it is crystal clear that the WMF need to wake up and smell the coffee.
Your main code may be written in haskell, but how is that worse than a proprietary volunteer working on a (very) occasional basis and keeping their code closed source? Steelpillow (talk) 18:55, 13 July 2019 (UTC)Reply
I just found out that there seems to be a problem with wkhtmltopdf on large documents. I also found that you can easily solve it by directly using qtwebkit from C++. I put the sources online here: https://sourceforge.net/p/wb2pdf/git/ci/master/tree/src/qt/ . I was able to create a PDF of https://en.wikipedia.org/wiki/Book:3D_bone_printing within a few minutes. Dirk Hünniger (talk) 06:53, 14 July 2019 (UTC)Reply

Failed rendering


https://newtools.pediapress.com/?command=download&writer=html&collection_id=44f8093d2dc84800 for w:en:Wikipedia:Recent additions/2019/June. Using Firefox 68.0 on


Omotecho (talk) 22:24, 15 July 2019 (UTC)Reply

Book maker formats in Persian language


Hi. I noticed that the book maker is not working well in Farsi.

For example, instead of this

سلام

it writes:

م ا ل س

With respect. Mobin2008 (talk) 18:38, 17 July 2019 (UTC)Reply

Wikipedia Books discussion


There is a proposal to Suppress rendering of Template:Wikipedia books on the English Wikipedia.

The accompanying discussion is suggesting that support for the longstanding PediaPress print-on-demand service should be removed because it is a pay-for service. I am sure that the WMF have some arrangement with PediaPress about this, and the last I heard PediaPress were also working on a replacement PDF renderer. Then there is Dirk Huenniger's MediaWiki2LaTeX which is currently hosted by wmflabs at both http://mediawiki2latex.wmflabs.org/ and https://mediawiki2latex-large.wmflabs.org/ . Although the current proposal does not affect them immediately, it is also being argued that no support for any external services should be given. That would reduce the current Wikipedia Books function to nothing more than passive reading lists.

You are invited to join the discussion. Steelpillow (talk) 10:28, 6 September 2019 (UTC)Reply

The book created from https://en.wikisource.org/wiki/A_Simplified_Grammar_of_the_Swedish_Language is hardly legible, with almost none of the body text, when previewed on pediapress.com, and currently cannot be previewed as PDF from en.wikisource.org directly. 2001:16B8:5C64:FA00:C0BE:7F14:48C8:A226 (talk) 10:05, 25 September 2019 (UTC)Reply

However, the EPUB book download is legibly complete at least. 2001:16B8:5C64:FA00:C0BE:7F14:48C8:A226 (talk) 11:05, 25 September 2019 (UTC)Reply

No updates since June?


Come on already.  Every single app out there supports PDF.  How difficult is this? The whole paper usage while printing is a total red herring. 74.101.127.239 (talk) 17:32, 11 October 2019 (UTC)Reply

Looking at the last paragraph of the update of 15 July, it is very clear that there is currently no development going on in rendering books in any downloadable format, in particular including PDF. Furthermore it is also stated there that no such development is currently planned.
If you refer to mediawiki2latex (one hour ≈ 200 pages in PDF) with your comment on paper usage, I can assure you that the limit is necessary since the hardware resources allocated by WMF together with the current implementation of mediawiki2latex don't reasonably allow for converting significantly bigger documents.
Still, the downloadable command line version of mediawiki2latex does not include any such limits, and should be able to fulfil everyone's needs in terms of PDF size on a reasonably priced personal computer made for consumer use. Dirk Hünniger (talk) 14:21, 19 October 2019 (UTC)Reply
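For those who want to drive the command line version from a script, a minimal sketch follows. It uses only the `-u` (source URL) and `-o` (output file) options shown in the example invocation earlier on this page; the wrapper names `build_command` and `render` are made up for illustration, and any further options of the tool are deliberately left out because they are not described here.

```python
import shutil
import subprocess


def build_command(source_url: str, output_pdf: str) -> list[str]:
    """Assemble the argument list for a local mediawiki2latex run (hypothetical helper)."""
    # Only -u and -o are used, as in the invocation quoted earlier on this page.
    return ["mediawiki2latex", "-u", source_url, "-o", output_pdf]


def render(source_url: str, output_pdf: str) -> None:
    """Run mediawiki2latex if it is installed; raise if the tool is missing or fails."""
    if shutil.which("mediawiki2latex") is None:
        raise FileNotFoundError("mediawiki2latex is not installed or not on PATH")
    subprocess.run(build_command(source_url, output_pdf), check=True)


if __name__ == "__main__":
    # Only prints the command; call render(...) to actually start a conversion.
    print(" ".join(build_command("https://en.wikipedia.org/wiki/Book:River_martin", "out.pdf")))
```

Passing the arguments as a list avoids shell quoting problems with titles that contain spaces or parentheses. Run time depends on the size of the book and on the machine.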

When will the option to download books be available? It has not worked for a long time.


The ability to download created books has been unavailable for a long time, but the notice says the process will be brief. What do you mean by "brief"? When will that option be available again? Thanks. 189.148.96.57 (talk) 19:23, 15 October 2019 (UTC)Reply

Work on this feature has been discontinued. It is not clear whether there will ever be such a feature again. There is an open-source project that tries to solve the problem independently. http://mediawiki2latex-large.wmflabs.org/ Dirk Hünniger (talk) 16:26, 16 October 2019 (UTC)Reply

Including Capability for right-to-left Languages (Arabic)


Hello team,

I tried to create a book by exporting Arabic wiki pages using the PediaPress preview, as it was not possible to download directly as PDF for the moment due to a bug. However, I noticed that the whole content is aligned left-to-right, although it comes from a right-to-left language (Arabic). Would it be possible to consider that and use the "open-right" function as well (if not done yet)?


Best thanks Sky xe (talk) 21:12, 27 October 2019 (UTC)Reply

PDF download is greyed out


I did read the warning at the top, but the update said it was working and deployed, so I went ahead and made a book.

I can't choose a download format (greyed out), so I can't d/l a pdf (also greyed out).

And I can't save all my work to my user location of User:Gemlog/Books/ nor to https://en.wikipedia.org/w/index.php?title=Special:PrefixIndex&prefix=Book:

Both produce an API error.

[Xb@-swpAIC4AALGGnJMAAAAS] 2019-11-04 06:05:39: Fatal exception of type "ApiUsageException"


I can, of course, give money to PediaPress. That link works perfectly and the books look amazing.


It would be a wonderful thing if the pdf worked like the July 2019 note says though... Gemlog (talk) 06:09, 4 November 2019 (UTC)Reply

Any rendering functionality of books or collections to any downloadable format has been decommissioned. Any funds for development of a replacement or repair of such functionality have been withdrawn. To say it in the German used by the miners in the area I live in: "Et is im Aasch". I am trying to develop a free alternative in my free time without any funding. https://mediawiki2latex-large.wmflabs.org/ Good luck. Dirk Hünniger (talk) 06:59, 4 November 2019 (UTC)Reply
Thank you very much for replying to me!
The note to the right of this page is extremely misleading to say the least. Well. Now I know not to bother.
However, I may have just learned of a new tool! So there's that :-)
KDE Neon can't find wb2pdf with apt, but I'll find it.
Thanks again!

Gemlog (talk) 07:43, 4 November 2019 (UTC)Reply
The page pdf renderer has been updated and deployed, the Book pdf renderer has been decommissioned. On a Book page this can be misleading, as the "Download as PDF" link only downloads the page and not the whole book. On the other hand, it should not be greyed out and you should also be able to save your new page to your user pages or the Book: namespace as desired.
If your experience differs from this, can you give more precise details?
Another volunteer is writing a new Book pdf renderer and says they will release it as open source for us, but we have been waiting a long time. Steelpillow (talk) 08:52, 4 November 2019 (UTC)Reply
Hi,
I pasted the errors I received into the first post I made ;-) Gemlog (talk) 01:35, 8 November 2019 (UTC)Reply
Also, I see that the misleading box on the right of this page that I was referring to is now gone, so... yay :-) Gemlog (talk) 01:36, 8 November 2019 (UTC)Reply
Still we need more precise information. I cannot find a book "PDF Download" option you say is greyed out. Can you give the url of the page you see it on? Or, is it the "Download as PDF" Print/export option in the lefthand menu (which is for article download, not whole books)? Was it perhaps in the strange misleading dialog that vanished? If you do not tell us accurately where it is, we cannot diagnose it for you!
Again, when you received the error message you pasted, was this in the Book Creator when you tried to save the book? I just created and saved a new book and it all worked fine. Did you add any extra code to your book, such as chapter headings or meta-information? If you post a list of the articles in your book, I can try to see if it will work for me. Steelpillow (talk) 11:30, 8 November 2019 (UTC)Reply
In Book Creator, there is a "PDF Download" option in a box to the lower right that is greyed out and cannot be used. There is really no simpler way to explain it. Guentheralex (talk) 03:34, 13 November 2019 (UTC)Reply
Do you mean the "Download" box which offers several formats besides PDF? In English, quote marks indicate exact wording. Yes, as I explained above, that is meant to be greyed out.
Otherwise, please post or email me a screenshot to show the option I am not seeing on my PC. Steelpillow (talk) 10:20, 13 November 2019 (UTC)Reply
In English, superfluous pedantry is insulting. Please insert that in your "Download" box. Thank you. Guentheralex (talk) 01:26, 14 November 2019 (UTC)Reply
My apologies, no insult is intended. I suppose that my approach to problem diagnosis is highly pedantic, but I get better results that way. May I take it that you have no problem with this software which remains to be diagnosed? Steelpillow (talk) 10:55, 14 November 2019 (UTC)Reply

mediawiki2latex server now parallel

Hi,

the two mediawiki2latex servers are now able to serve requests in parallel. Furthermore, mediawiki2latex uses significantly fewer resources, so even very large books can now be compiled successfully.

https://mediawiki2latex.wmflabs.org/

https://mediawiki2latex-large.wmflabs.org/

Yours Dirk Dirk Hünniger (talk) 21:21, 7 November 2019 (UTC)Reply

Can you clarify, do you mean that the two servers can run in parallel or that each server can run multiple conversion requests in parallel? Steelpillow (talk) 11:03, 8 November 2019 (UTC)Reply
The large server can run up to two requests in parallel. The normal one can run up to four requests in parallel. Dirk Hünniger (talk) 17:45, 8 November 2019 (UTC)Reply

mwlib.pdf renderer available on Github

Hi,

after some cleanup, we just put the `mwlib.pdf` MediaWiki to PDF renderer on our Github account: https://github.com/pediapress/mwlib.pdf

Unfortunately, the renderer still requires Python 2.7 because it depends on mwlib. Once mwlib has been upgraded to Python 3, it won't be difficult to upgrade the renderer as well. But as you will see, the renderer still requires substantial work.

It's been a very long time since PediaPress released a new renderer so it's very possible that some elements might not work. Please file a bug or share your feedback here if you run into problems.

Cheers

Christoph Ckepper (talk) 15:36, 29 November 2019 (UTC)Reply

mediawiki2latex now with sidebar integration

Hi,

mediawiki2latex can now easily integrate into the sidebar on any MediaWiki installation. Just copy the code in https://en.wikibooks.org/wiki/User:Dirk_H%C3%BCnniger/common.js to the common.js in your user namespace on your wiki. Of course you may also integrate it globally by modifying MediaWiki:Common.js on your site.

Yours Dirk Dirk Hünniger (talk) 15:58, 30 November 2019 (UTC)Reply

mediawiki2latex mass production / testing

Hi,

I am currently running a test on all community maintained books on the English Wikipedia, approximately 5000 in total. So far I have got 283 pdf files. In 20 cases no pdf was produced. Some pdfs are more than 4000 pages long. If anybody can provide webspace I will happily upload them. We could later link to them from the Book namespace in Wikipedia.

Yours Dirk Dirk Hünniger (talk) 16:07, 30 November 2019 (UTC)Reply
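A batch run like the one described above could be driven by a small script. The following is only a minimal sketch, not the script actually used for this test: the helper names `build_command` and `run_batch` are hypothetical, the flags `-u`, `-o` and `-k` are copied unchanged from the command line quoted earlier on this page, and a run is counted as failed simply when no output file appears, since the way mediawiki2latex reports errors is not described here.

```python
import os
import subprocess
from urllib.parse import quote

# Assumption: books live under the Book: namespace of the English Wikipedia.
BOOK_URL_PREFIX = "https://en.wikipedia.org/wiki/Book:"


def build_command(book_title, out_path):
    """Build the mediawiki2latex argument list for one book (hypothetical helper)."""
    url = BOOK_URL_PREFIX + quote(book_title.replace(" ", "_"))
    # Flags copied from the command quoted earlier on this page.
    return ["mediawiki2latex", "-u", url, "-o", out_path, "-k"]


def run_batch(book_titles, out_dir):
    """Run mediawiki2latex once per book and return the titles that produced no pdf."""
    failed = []
    for index, title in enumerate(book_titles):
        out_path = os.path.join(out_dir, "book_%05d.pdf" % index)
        subprocess.run(build_command(title, out_path), check=False)
        # Assumption: a missing output file is treated as a failed conversion.
        if not os.path.exists(out_path):
            failed.append(title)
    return failed
```

Calling `run_batch(["River martin"], "pdfs")` would attempt a single book and return an empty list if `pdfs/book_00000.pdf` was written.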

I uploaded the first 100 resulting pdfs.
https://drive.google.com/drive/folders/17g5Ey6jauKd3CLMDNBOnV3RYKcJN0QZu?usp=sharing
More will not fit due to a lack of webspace. Dirk Hünniger (talk) 21:03, 1 December 2019 (UTC)Reply
I uploaded more than 500 pdfs with images stripped to work around the limited webspace. See here:
https://drive.google.com/drive/folders/16bB74paEczU_NEpaNfEpU2gjjhBlKBtJ?usp=sharing
I currently have more than 1000 pdfs on my local disk. Those also contain images. Currently the chance that mediawiki2latex fails on a community maintained book is about 6.5%. Merry Xmas Dirk Hünniger (talk) 21:29, 24 December 2019 (UTC)Reply
From the statistics gained from this experiment I deduced that a full rebuild of all community maintained books will take less than a month and cause server costs of 320 EUR. https://de.wikibooks.org/wiki/Benutzer:Dirk_Huenniger/wb2pdf/manual#Wikipedia_Books Dirk Hünniger (talk) 19:39, 29 December 2019 (UTC)Reply
A request for storage to upload the pdfs has been filed:
https://phabricator.wikimedia.org/T241584 Dirk Hünniger (talk) 22:02, 30 December 2019 (UTC)Reply
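For readers who want to relate the figures in this thread to each other, here is a small back-of-the-envelope calculation. It is only a sketch under stated assumptions: the early snapshot is read as 283 successful and 20 failed conversions (303 attempts), the book count is the approximate 5000 mentioned above, and the 320 EUR estimate is spread evenly over all books. The function and constant names are hypothetical.

```python
# Figures quoted in this thread.
TOTAL_BOOKS = 5000          # approximate number of community maintained books
FAILURE_RATE = 0.065        # failure chance reported on 24 December 2019
REBUILD_COST_EUR = 320.0    # estimated server cost of a full rebuild


def failure_rate(successes, failures):
    """Fraction of attempted conversions that produced no pdf."""
    return failures / (successes + failures)


# Assumption: the 30 November snapshot means 283 pdfs plus 20 failed attempts.
early_rate = failure_rate(283, 20)

# Expected number of books without a pdf after a full rebuild.
expected_failures = round(TOTAL_BOOKS * FAILURE_RATE)

# Assumption: the cost is distributed evenly over all books.
cost_per_book = REBUILD_COST_EUR / TOTAL_BOOKS

print("early failure rate: %.1f%%" % (100 * early_rate))
print("expected failures:  %d" % expected_failures)
print("cost per book:      %.3f EUR" % cost_per_book)
```

Under these assumptions the early snapshot already agrees with the later 6.5% figure to within rounding, and a full rebuild would leave a few hundred books without a pdf.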

tachyon continues

Hi,

I followed the idea to look into possibilities for speeding up mediawiki2latex. I came up with C code based on html tidy, which is more than 14 times faster in wall clock time than mediawiki2latex.

See here https://de.wikibooks.org/wiki/Benutzer:Dirk_Huenniger/wb2pdf/manual#Performance_Considerations

The resulting file is here:

b:de:File:TachyonTest.pdf


Yours Dirk Dirk Hünniger (talk) 21:00, 22 December 2019 (UTC)Reply

Thank you very much for your work on mediawiki2latex. I really like it, and the output looks amazing. MavropaliasG (talk) 03:39, 23 December 2019 (UTC)Reply
This looks promising. Could you automate the whole algorithm: build the book in C/Tachyon, then generate the contributions list and append it with Haskell/mw2latex? That is, use modules in C as separate accelerators for the overall Haskell process? Steelpillow (talk) 12:04, 24 December 2019 (UTC)Reply
No, that's not the route to go. 50% of the runtime of mediawiki2latex is used for the generation of the lists of contributors and figures. This is because the information is extracted from the page histories of all pages and all images in the book.
The way to do this right is to directly query the SQL database. As far as I understand WMF allows for that. But there is a problem: my second name is Hünniger, which contains the letter ü, which is impossible for the horizon web interface of WMF. So I am sorry to say that this is not possible until new software gets installed on horizon, which might take years, since the problem has already existed for years.
Another 30% of the runtime of mediawiki2latex is needed for the images. In mediawiki2latex I download the images in the maximum possible resolution, scale them down to 300 dpi and include these rather large images in LaTeX. This causes the LaTeX compiler to spend more than half of its total runtime on images, since it does a time-consuming recompression when embedding them, which cannot be changed according to the German LaTeX mailing list.
In tachyon I download images in a much lower resolution and don't do any image processing, which speeds things up significantly. So a lot of the speedup in tachyon actually comes from leaving out contributor information and using lower resolution images.
The actual runtime of the tachyon C code accounts for less than one percent of the total runtime of a single tachyon run. The rest of the time is needed by wget to download the images and html pages and by xelatex to create the pdf. So tachyon is more an experiment to measure a lower bound to the runtime of such a conversion than the future route of mediawiki2latex.
So to wrap it up: moving from Haskell to C can make that part of the program significantly faster, but even if we did that we would only affect 20% of the total runtime, since the rest of the runtime is spent by auxiliary programs not under my control. So what we can get at most is a 20% speedup when porting to C. If we did direct database queries we would get a 50% speedup for free. Another way is using lower resolution images, but I doubt many people will be happy with that. But yes, if all these issues are solved, we could still go to C. Dirk Hünniger (talk) 12:35, 24 December 2019 (UTC)Reply
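The reasoning above is an instance of Amdahl's law. As a rough sketch, using the approximate runtime shares quoted in this comment (50% contributor and figure lists, 30% images, 20% Haskell code) and the optimistic assumption that the optimised stage costs nothing at all afterwards, the upper bounds can be worked out as follows. The function names are hypothetical. Note that a "20% speedup" in the sense used above means 20% less runtime, which corresponds to an overall factor of 1.25.

```python
def overall_speedup(fraction, factor):
    """Amdahl's law: overall speedup when `fraction` of the runtime becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def runtime_saved(fraction, factor):
    """Share of the original runtime that disappears after the optimisation."""
    return fraction * (1.0 - 1.0 / factor)


# Approximate shares of the total mediawiki2latex runtime, as quoted above.
SHARES = {
    "contributor and figure lists": 0.50,
    "image download and embedding": 0.30,
    "Haskell code": 0.20,
}

# Upper bound: pretend the optimised stage becomes infinitely fast.
INFINITELY_FAST = float("inf")

for stage, share in SHARES.items():
    print("%-30s saves at most %3.0f%% of the runtime, overall factor %.2f" % (
        stage,
        100 * runtime_saved(share, INFINITELY_FAST),
        overall_speedup(share, INFINITELY_FAST),
    ))
```

Even a perfect port of the Haskell part to C is therefore bounded by a factor of 1.25 overall, while removing the contributor-list step entirely would at best double the speed.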
Thank you for the explanation.
The umlaut issue is just the kind of reason why alias accounts are sometimes allowed. Would it help if you created a second account using the "Huenniger" spelling? Steelpillow (talk) 13:10, 24 December 2019 (UTC)Reply
yeah this is likely to help. But it would require some administrative artistry to get the right privileges for that new account. I have not yet decided if I really want to do that. But basically that is a possible solution Dirk Hünniger (talk) 13:18, 24 December 2019 (UTC)Reply

book generator removed on wikipedia

The book generator has been removed from the English Wikipedia

https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)/Archive_176#Suppress_rendering_of_Template:Wikipedia_books Dirk Hünniger (talk) 13:50, 31 December 2019 (UTC)Reply

It has not been removed, but user interface links to it have been removed and notices with incorrect statements have been added to many pages. Steelpillow (talk) 16:02, 15 January 2020 (UTC)Reply