Reading/Web/PDF Functionality/fr

Update on books, August 17 2018


Here is an updated and more comprehensive sample of the new book renderer. The layout has changed quite a bit from the first version presented at Wikimania. Thanks for all the feedback. The export still has a number of significant issues: page breaks, infoboxes, tables, and math formulas need to be improved substantially. This sample file, which focuses on international scripts and math formulas, reveals some of the problems that still need to be solved. Math formulas are currently rendered using MathML; switching to LaTeX should lead to significant improvements.

Update on books, August 8 2018
We have been working with PediaPress on generating and styling the new books. They have provided us with a sample of the current output, which will be very similar to the final version. We discussed points of improvement with the PediaPress team, which they are addressing currently. If you have any feedback or other comments on these samples, please let us know on the talk page.

Update on books, April 2018
Books functionality will be returning via PediaPress. After investigating the new renderer in depth, we realized that core features of the original book creator (such as page numbers and a table of contents) would be very difficult to implement using the new renderer. In addition, we had significant issues with our concatenation code. Thus, we had to look for alternatives for bringing back the PDF books functionality on Wikimedia projects. We reached out to PediaPress, who were the original patrons of books on Wikipedia, to see if they would be interested in taking up PDF rendering for books once again. They have agreed, and we are currently working on the details and schedule. They will start by working on a temporary solution based on an older technology that has previously been used to create PDFs. This might have some drawbacks when it comes to graphical elements, such as maps, but will mean a faster working solution. They then plan to work on a new HTML-to-PDF renderer, based on feedback on the first implementation.

Update January 2018
We're currently preparing performance tests of the book-to-PDF function. We should know more in early February.

Update September 2017
Our current PDF rendering service, the Offline Content Generator (OCG), will no longer be maintained and will stop working. The Wikimedia Foundation Reading team has worked for several months to replace it. OCG, originally created as a third-party implementation, ran on obsolete code that could introduce vulnerabilities and other major problems in the future. Over the past three months, we placed banners on the PDF creation page to gather feedback on the prototype of our new renderer. The new renderer will have improved functionality: it will be able to render tables and infoboxes, and it will use styles aimed at better readability. We have gathered a good amount of positive feedback on the prototype and are working to incorporate the necessary updates into our new PDFs.

Later addendum: Turning PDF book rendering OFF for the short term
Unfortunately, major problems with our old renderer (OCG) force us to remove it before we can finish the book creation feature. This is happening sooner than we expected. By the time we remove OCG, the work needed to create a file from a single article will be complete. However, book rendering will be missing some features, such as style sheets and the table of contents. We will work to make these available in the coming months and hope to offer full functionality in November-December 2017.

Timeline:


 * Release of full-featured renderer for single articles (print to pdf) – Oct 1, 2017
 * Pausing book PDF rendering – Oct 1, 2017
 * Sunsetting of OCG renderer – Oct 1, 2017
 * Release of new PDF renderer – Jan, 2018 (tentative based on research results into alternative rendering systems)

Functionality:

For a full list of current and upcoming functionality, see below.

In addition to updates on this page, this will be communicated in a banner on the PDF creation page, in Tech News, and on some Wikimedia mailing lists.

Introduction
Our current PDF renderer, the Offline Content Generator, is no longer maintainable. Simply put, it is broken. Originally created by a third party, it currently runs on obsolete code that could introduce security vulnerabilities and other major problems in the future. If we want to keep the PDF creation functionality, we must replace it, or we could suddenly find ourselves in a situation where we have to turn it off without having an alternative.

Additionally, it does not support a number of rendering requests from the community, the main one being the ability to render tables. We have selected a new service, the Electron rendering service, as a suitable replacement. Our next step is to duplicate the functionality provided by OCG using the Electron rendering service. Below, we describe the main portions of the functionality we have identified as necessary. We would like to invite conversation around what is missing or what is superfluous in the provided list. We would also like to outline our future plans for PDF rendering to gather initial feedback.

Userbase
The following table shows a sample of traffic to the Electron "Download as PDF" service over a 6-hour period. The traffic is broken down by operating system (OS), browser, and browser major version (e.g. Windows 7, Chrome v61.*).

Note that the majority of our traffic appears to come from Windows-based machines.
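
The kind of breakdown described above can be sketched in a few lines of Python. This is only an illustration, not the script used to produce the table: the record layout (OS, browser, full version string) and the function names `major_version`, `breakdown`, and `share_by_os` are assumptions made for this example.

```python
from collections import Counter


def major_version(version: str) -> str:
    """Return the part of a version string before the first dot."""
    return version.split(".", 1)[0]


def breakdown(records):
    """Count requests per (OS, browser, browser major version).

    `records` is an iterable of (os_name, browser, version) tuples;
    this layout is a hypothetical one chosen for the sketch.
    """
    counts = Counter()
    for os_name, browser, version in records:
        counts[(os_name, browser, major_version(version))] += 1
    return counts


def share_by_os(counts):
    """Return each OS's fraction of the total request count."""
    total = sum(counts.values())
    per_os = Counter()
    for (os_name, _browser, _version), n in counts.items():
        per_os[os_name] += n
    return {os_name: n / total for os_name, n in per_os.items()}
```

Grouping on the major version only keeps the table compact, since minor browser releases rarely change rendering or download behaviour.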

Current functionality requirements
Below is a list of the current requirements for PDF rendering of single articles and of books. Requirements that differ from the current implementation are shown in bold.

Background

 * PDF rendering of articles and books from Wikipedia pages is handled by a service called OCG. When you create "books" through the book creator, it uses OCG, integrated into the Collection extension. OCG has several problems, in particular with tables.


 * Multiple issues with OCG are identified, including complaints from the community around OCG's inability to render tables.
 * Rendering of tables ranks as number 9 on the German-speaking Community Technical Wishlist.
 * Wikimedia Deutschland begins working on a solution for rendering tables in PDFs, and introduces Electron. They plan to run it alongside OCG, not to replace it.
 * While Wikimedia Deutschland is working on the Electron service, the maintainers responsible for the OCG service at the Wikimedia Foundation come to the conclusion that OCG has to be replaced.
 * The WMF Reading Team takes over responsibility for the long-term maintenance of PDF rendering and begins planning to implement table rendering across all projects.
 * The Reading team launches a community consultation to gather feedback on Electron.
 * The Reading Infrastructure and Web teams begin scoping the work necessary to port OCG functionality over to the Electron service.

Update After Consultation
We launched a consultation on the current implementation of the PDF renderer in early June, 2017. After reviewing the consultation responses, we have made the following observations:


 * A larger number of users preferred the single-column format over the double-column format
 * Users who preferred the double-column format highlighted that their preference was based on the styling and look and feel of double columns. Some users also expressed concerns about font size and wasted paper when printing PDFs in the single-column option
 * The following feature requests were made:
   * Functional hyperlinks
   * Date and URL, 'this page downloaded [date] from [URL]'
   * Customizable CSS for layout, title, TOC
   * Option for 2-column format
   * Versions with images included or excluded
   * Modifiable margins
   * Print by section, which allows you to remove references, paragraphs you don't want, the index, etc.
   * Configurable text size

Based on the feedback, we have incorporated the following into our new print styles:


 * hyperlinks
 * article information
 * smaller font and book-like styling

The remainder of the requests above will be postponed until the second iteration of the PDF renderer, in which we plan to build a settings mode that will allow for customization of the available options.

Proposal
Below is a proposal for the scope of functionality needed for PDF rendering:


 * Individual articles will be rendered to PDF using the "Download as PDF" link in the sidebar
 * Multiple articles will be rendered to PDF using the Book Creator tool
 * All articles will contain attribution for text and images
 * All PDFs rendered will be able to print tables
 * Users will be able to customize the layout of their PDF (optional)

Design
The new PDF styles will be designed for increased readability. Based on community feedback and qualitative or quantitative testing, support for a 2-column layout may be built for the book creator and/or for individual PDFs.

Development and deployment roadmap
Below is a detailed description of the development and deployment roadmap. It is subject to change.


 * April – May 2017:
   * The Reading team builds back-end support for functionality identified above
   * Communities are consulted on expanding or shrinking proposed functionality
   * Qualitative test performed for styling
 * June – July 2017:
   * New styles implemented
   * First iteration is launched along with OCG on all projects and performance is compared
   * Iterations based on consultations and identified edge cases
 * August 2017 – September 2017:
   * Additional changes made if necessary
 * October 2017:
   * Second iteration launched without OCG on all projects

Individual articles

 * A PDF for a single article will be created by selecting the "Download as PDF" link
 * Upon selecting "Download as PDF", the PDF file will be generated. To download the file, users will select the "Download the file" link
 * Each PDF file will contain the following:
   * Article title and text
   * Infobox (if any)
   * Tables (if any)
   * Single-column layout
   * Page number
   * All article images and captions
   * Links to pages linked from the article (blue links and external links)
   * Text and image sources, contributors, and licenses

Phabricator Tracking
All PDF-related changes, including sunsetting OCG, replacing the Electron PDF renderer, and any updates to books or the Collection extension, are tracked under the Phabricator project Proton. The project page displays recent updates for all tasks related to PDFs.

Books
Note: no changes will be made to the current book creator workflow at this time


 * Users will launch the book creator by selecting "Create a book"
 * This will navigate to the current book creation page
 * To download a book, users will select the "download" link from the books page
 * Users may only download books in PDF format
 * Books will contain all elements from the single-article format as well as:
   * Book title page
   * The references for each article in the book, which will appear at the end of that article
   * Each article beginning on a new page
   * A single section for text and image sources, contributors, and licenses, containing the collected contributions from all articles

Functionality available in November - December, 2017
 * Styles for books will be updated for improved readability
 * Books will contain a table of contents with page numbers
 * Selecting a section from the table of contents will navigate the user to the corresponding section within the book

Alternative
There is an alternative way of exporting MediaWiki to LaTeX, PDF, ODT and EPUB:

http://mediawiki2latex.wmflabs.org/

The computational resources on the server are limited.

If you run Ubuntu Linux and want results faster, you can install the m2l-pyqt or mediawiki2latex packages.

__INDEX__