Topic on Talk:Reading/Web/PDF Functionality/Flow

Steelpillow (talkcontribs)

It is now nearly three months since the last update and test document. Would it be possible for PediaPress to give us a progress report?

Johan (WMF) (talkcontribs)

PediaPress has been working on some LaTeX stuff lately. @Ckepper, is there anything else to report?

Ckepper (talkcontribs)

As Johan reported, I have been quite busy with other stuff (like paid work). The last thing I worked on was LaTeX support (see Maxwell's Equations and Schrödinger equation). This had some detrimental effects on image scaling (see pages 11 and 15 in the Schrödinger article) that I have not yet been able to fix. Moreover, it seems that some formulas are not rendered correctly at all: check out the Formulation in SI units convention for an example. The scaling of the formula seems completely broken and I lack the LaTeX skills to fix this. This could be a problem with my local LaTeX distribution, my local settings, or the LaTeX formula itself. Any insight into how this could be fixed would be highly appreciated.

The last big piece that I haven't touched at all is tables. With the original PediaPress renderer, this was the most complex part, but maybe we can start small and expand the feature set. I would rather focus on setting up a basic render server so people can start using it.

125.253.56.226 (talkcontribs)
Ckepper (talkcontribs)

PediaPress created a LaTeX renderer 10 years ago that is still in use today. The biggest challenge we faced with rendering Wikipedia was the heterogeneity of the content. Most editors stop working on an article when it looks as intended in their browser. Semantic or "clean" markup is not of particular interest to them. You can fix this for a small number of pages (or extensions), but it becomes insurmountable when you want to address all of Wikipedia.

We decided to use an HTML / CSS Paged Media based renderer because this approach creates the least amount of friction between on-screen and print content for the majority of the pages. Other projects and more qualified engineers might find better solutions, but I will continue on the current path.

Steelpillow (talkcontribs)

Another problem is that mediawiki2latex is written in a language called Haskell, which not many programmers know. It would be difficult to ensure flexible support options in the future, which is the very problem that brought us to today's sorry state.

Dirk Hünniger (talkcontribs)

Hi,

I am the developer of mediawiki2latex. I currently work on a paid project for the German government, so my time is quite limited. It is even hard for me to take part in my Aikido classes.

I will try to keep the mediawiki2latex website up and running, as well as maintain the mediawiki2latex Debian package, for as long as I can. But I currently don't have any time to add new features, and it is unlikely that I ever will. A similar project, also written in Haskell, is called pandoc. It's very actively developed. But still, I think you currently get better results with mediawiki2latex.

I also recommend learning Haskell. You will probably not use it at work, but the skills you learn by doing Haskell are very helpful in day-to-day work as a software developer or scientist. It is a bit like studying maths at university. You will never need to work with these abstract objects as such, but the skills to solve problems in an abstract manner are really handy.

Yours Dirk

Ckepper (talkcontribs)
Steelpillow (talkcontribs)

Thank you, this is showing real progress.
