Topic on Talk:Reading/Web/PDF Functionality

Steelpillow (talkcontribs)

This is getting truly ridiculous. I am embarrassed for this Foundation from Jimbo down. Once again we have heard absolutely nothing - nothing - save a few false promises of announcements that never get made, of deadlines that are as believable as the flying spaghetti monster and pass into history quicker than a mayfly in summer. What is going on with this PDF book renderer, then? Where are the publicly accessible repos of this supposedly open-source code? Where is the opportunity for other developers to do the open-source thing and help the code along? Will somebody please, please, please tell us WTFlip!

Jdforrester (WMF) (talkcontribs)
Steelpillow (talkcontribs)

Thanks for the kind thought but the Chromium solution has been abandoned for books (being kept only for articles) in favour of the PediaPress solution. If you can find a link for that, you will indeed be my hero.

Jdforrester (WMF) (talkcontribs)

That's interesting, and the first I've heard of this. Can you provide a link?

TheDJ (talkcontribs)
Jdforrester (WMF) (talkcontribs)

Oh, curious.

Dirk Hünniger (talkcontribs)

As detailed on the above-linked page, a new open source PDF renderer is going to be provided by PediaPress. I would like to look at the source code to see if there is a dependency on the mwlib library (which is also developed by PediaPress). Such a dependency would cause a stoppage of security updates due to the decommissioning of Python 2 on 1st January 2020, rendering the system undeployable.

Johan (WMF) (talkcontribs)

Last we spoke to PediaPress, they were aiming for end of the calendar year. I've pinged them in email to let them know questions are being asked about the books-to-PDF functionality.

Steelpillow (talkcontribs)

Thank you Johan. However, this is not my first request, or your first acknowledgement, since the end of that calendar year. Could you also ask them to give details of the repos for their new renderer, so that we can confirm it is open source and can see what is going on without pestering the developer unduly? Dirk has already asked in another thread below here, but has not been answered.

Johan (WMF) (talkcontribs)

Yes, I mentioned that too.

Johan (WMF) (talkcontribs)

But I'd like to remind everyone that if you've got a beef with how long this is taking, the only sensible target of that is our (the WMF) priorities in not assigning resources to this particular functionality, not volunteer developers.

(We think it makes sense, of course, looking at what we'd have had to defund otherwise, or we wouldn't have prioritised that way. But still.)

Steelpillow (talkcontribs)

My beef with the current developer is not the length of time, which they give freely, but the lack of visibility of what is supposed to be an open-source initiative. They really ought to be giving that visibility freely, too. Both periodic communications, even if only "Sorry, I've been busy elsewhere this quarter", and sight of the repository would help a lot. For example a developer going quiet usually means a developer not making progress. And sight of the architecture is pretty darn important if the WMF want to de-risk yet a third fiasco in a row. Do you?

Ckepper (talkcontribs)

I am not sure if I missed a previous post but I don't have any intention to be secretive on purpose. The past quarter was indeed really busy and the next two will most likely be as well. Nevertheless, I intend to continue and finish the project. Part of the project relies on 10+ year old PediaPress infrastructure that needs to be upgraded before doing the next steps. One of the servers was already upgraded two weeks ago but we need to set up an additional new render server for the project and stabilize the whole render process. This needs time for investigation and fixing and I don't know when I will find this time. I am sorry about the delays.

Steelpillow (talkcontribs)

@ckepper Thank you for the update. Is the code that you have written open-source? Is the repository accessible to third parties? If not, are there any plans for that? Steelpillow (talk) 07:42, 12 April 2019 (UTC)

Ckepper (talkcontribs)

No, the code is not open source. Since we are using a commercial rendering component and plan to run the service on our own infrastructure, we didn't really see a need to open source it. But this might change of course - if people have a long-term interest in contributing to the project or if PediaPress could no longer operate it...

Also, PediaPress potentially might offer customized PDF rendering for non-WMF projects (think enterprise wikis) as a commercial product. Open sourcing this project would expose many of our ideas and eventually let us lose our edge in this field. However, open source projects in particular often work very successfully with such a model. As you can hear, this is not an easy decision for me and I have to think about it.

What are your reasons for asking to open source the project?

Steelpillow (talkcontribs)

This on the home page of the Wikimedia Foundation: "From site reliability to machine learning, our open-source technology makes Wikipedia faster, more reliable, and more accessible worldwide." and on the linked Technology page: "We keep Wikimedia projects fast, reliable, and available to all." I do not understand how this vow can be honoured if the book creation function is closed source.

Also of course, if you guys shut up shop for any reason (these things happen), then who is to maintain our book renderer, and how?

Perhaps one of our WMF participants such as Jdforrester (WMF) or Johan (WMF) could answer these points? Steelpillow (talk) 12:46, 12 April 2019 (UTC)

Jdforrester (WMF) (talkcontribs)

I'm entirely uninvolved in this project. You showed up spreading profanity and I tried to help you, but I failed because you hadn't explained your concern and I was answering the wrong question. Sorry.

Steelpillow (talkcontribs)

It seems that you have no intention of clarifying the WMF's position. Let us hope that somebody a little more civil will be able to.

Johan (WMF) (talkcontribs)

@Steelpillow, as @Jdforrester (WMF) says, he doesn't work on this project, and that means he's got about the same level of access to information as you have. He's not obfuscating. We're not a monolith. There's no internal collection of decisions.

I've pinged the person best suited to reply to you and pointed them to this thread.

Johan (WMF) (talkcontribs)
Dirk Hünniger (talkcontribs)

Hi,

To my understanding, the task of creating PDF versions (as well as a few other file formats) of Wikipedia Books is currently solved by mediawiki2latex. It is currently functional and open source, and it depends only on open source components, as can be seen from its being part of Debian. It may be used online, but may also be installed locally on the most common operating systems, and anyone can easily confirm that it does not depend on Python 2 by looking at the source code.

Furthermore, it is not clear whether a functional alternative will be developed, whether it is going to be open sourced, whether it will be possible to install it locally, whether it will depend on any non-open-source components, or whether it will depend on Python 2, since the source code is currently not available.

So to me it is clear that it is necessary for me to keep on developing mediawiki2latex, although I know about other things I could do in my free time.

Yours Dirk

Steelpillow (talkcontribs)

Thank you Dirk. When we consider that your Haskell implementation was refused due to the suggested problem of finding programmers able to provide alternative support, the adoption by WMF of a closed-source core for which alternative support is genuinely impossible, becomes less easy to understand. Could the WMF please explain their thinking on this a little more fully than they have done in the past? Steelpillow (talk) 17:38, 12 April 2019 (UTC)

TheDJ (talkcontribs)

Causation and correlation. Both are expensive for the foundation. One just a little more so.

I think that is pretty much the point here. Rendering HTML to books at the level of quality our community would expect is just not something the foundation wishes to spend its money on. That's a valid POV, even if a few people disagree about it. Considering PediaPress also doesn't seem too keen to throw lots of money at it again, that isn't totally crazy.

Steelpillow (talkcontribs)

OK, so we now know that the books project is on a very slow train to a proprietary solution that cannot be maintained by the community, and the WMF are happier with that than any other option open to them.

I am surprised to find that the maintainability issue, which killed both the old OCG and adoption of Dirk's Haskell solution, is no longer seen as an issue worth addressing. Thank you for making that plain.

My thanks too to all those who have given constructive replies to the questions I raised.

TheDJ (talkcontribs)

"the WMF are happier with that than any other option open to them" well happy is overstating I think. The foundation has limitations and needs to make choices. See also why they halted much of the work going into Maps and Graphs at some point. There is a considerable over ask and an eternal backlog of problems and wishes that could be worked on. To handle all of it, you could probably higher a 2500 people and still only barely keep up. Contrast this with the fact that people already complain that WMF spends too much money on developers and programs and you have identified the primary problem in my point of view.

MJL (talkcontribs)

@Ckepper, hey listen. It's important you try your best. Thank you for the good work you do, and please don't be discouraged if people get frustrated. We're all only human.

I appreciate this response! :)
