Topic on Talk:Reading/Web/PDF Functionality/Flow

out of the frying pan and into the fire

BerlinSight (talkcontribs)

While I appreciate that WP got rid of the two-column layout and the lack of tables in PDF output, I miss the quality of the LaTeX-generated PDFs. The typographical problems have already been discussed, but there is another issue: even vector images (such as the one used on the page "Spiral model"; inserting the link doesn't work here, "page does not exist") are included in rasterized form. The resolution is adequate for screen display but generally too low for print, e.g. for reading text contained in the image. This often makes them useless in a printout.

TheDJ (talkcontribs)

This is a known problem, tracked as T178664.

BerlinSight (talkcontribs)

OK, I see. Yet I think WP is wasting time reinventing the wheel. TeX/LaTeX is a professional-quality typesetting system that is Free / Open Source Software and embodies decades of work. My prediction is that Electron will never reach the point where TeX already is with respect to typographic quality. IMHO, leaving the user with two options for generating PDFs (one with high-quality typesetting and one with tables) would be a better choice than the current situation.

TheDJ (talkcontribs)

I think the problem here is that the systems are fundamentally different. HTML is made for flexible and dynamic layout, adapting to whatever situation it is asked to render in, and (these days) for a lot of interactivity.

LaTeX is fundamentally designed for very reproducible and specific layout in controlled circumstances, mostly for non-interactive situations. You can't make websites with LaTeX (it's hard to put jello into a straitjacket), and therefore you cannot print them with it either. And HTML cannot do what LaTeX can do.

BUT HTML is catching up. For instance, there are specs for adding print-specific context (page size, page-break info, etc.) to HTML, but they are not yet supported. It's also a technology that is closer to what we are used to within our own ecosystem, which makes it easier for the engineers who have to do the incidental work of supporting it, and we have to duplicate less work across both stacks, since most of the time the easy stuff will just work.

Neither is perfect, neither will be perfect, but one is sustainable for us, and the other is not.

BerlinSight (talkcontribs)

Sorry, but it looks like you are missing the point completely. I did not ask to rewrite WP in LaTeX. The former PDF engine used LaTeX as a backend to create high-quality PDFs, alas without tables. Since the new engine has tables but awful typographic and image quality, and quite likely will never match the output quality of the old PDF engine, I would prefer to have a choice of which one to use (or, better, the old one with tables and a single-column layout, but that does not seem possible).

TheDJ (talkcontribs)

I was just talking about the technology stack:

  • normal: wikicode -> html
  • old engine: wikicode -> LaTeX -> PDF
  • new engine: wikicode -> html -> PDF

We removed one very expensive translation step from the system, one that had no maintainers and no available experts who were able to keep it online.

THAT is the only thing that matters. It's a resourcing decision. If you want to quit your existing job and improve the old system for free, then that's fine.

Dirk Hünniger (talkcontribs)
Debenben (talkcontribs)

@Dirk Hünniger great work!

I am also disappointed by the typographic quality of the Chromium rendering engine. Mathematical formulas in particular look horrible. I did not know about mediawiki2latex. Why don't we mention it as an alternative and let users decide what they prefer?

Debenben (talkcontribs)

I tested the claim that it can handle tables on the article Schwarzschild-Metrik, which was mentioned somewhere below:

mediawiki2latex -m -g -u https://de.wikipedia.org/wiki/Schwarzschild-Metrik -o "schwarzschild.pdf"

Result: all tables are rendered perfectly, and the mathematical formulas look perfect. The only drawback: some URLs don't get any line breaks, so they sometimes extend beyond the page margins.
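
For anyone who wants to repeat this on other articles, the call above can be wrapped in a small shell helper. This is only a sketch: the flags -m -g -u -o are copied unchanged from the command above, and the function names pdf_name_for_url and render_article are made up for illustration, as is the choice to name the PDF after the last part of the URL.

```shell
#!/bin/sh
# Sketch only: wraps the mediawiki2latex call quoted above.
# The flags -m -g -u -o are copied from that command as-is;
# the helper names below are hypothetical, not part of mediawiki2latex.

# Derive an output file name from the last path segment of the article URL.
pdf_name_for_url() {
    printf '%s.pdf\n' "${1##*/}"
}

# Render one article to a PDF in the current directory.
render_article() {
    mediawiki2latex -m -g -u "$1" -o "$(pdf_name_for_url "$1")"
}

# Usage (requires mediawiki2latex to be installed):
#   render_article https://de.wikipedia.org/wiki/Schwarzschild-Metrik
```

Nothing runs until render_article is called, so the file can be sourced first and the generated file name checked before starting a long conversion.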

Quiddity (WMF) (talkcontribs)

Posting to bump cache, and hopefully fix missing comments.
