# Requests for comment/Reduce math rendering preferences

Request for comment: Reduce math rendering preferences
Component: General
Creation date: 2011-07-21
Author(s): Brion Vibber
Document status: implemented

## Proposal

This is a request for comment about reducing the math rendering preferences.

RFC closed and accepted. The options have been eliminated, but a baseline shift still needs to be added per https://bugzilla.wikimedia.org/show_bug.cgi?id=32694.

### Background

Bug 24207 requests switching the math rendering preference default from its current setting (which usually produces a nice PNG and occasionally produces some kinda ugly HTML) to the "always render PNG" setting.

I'd actually propose dropping the rendering options entirely...

• "HTML if simple" and "if possible" often produce horrible ugly output that nobody likes, so people use hacks to force PNG rendering. Why not just render to PNG?
• "MathML" mode is even MORE limited than "HTML if simple", making it entirely useless.
• nobody even knows what "Recommended for modern browsers" means, but it seems to be somewhere in that "occasionally crappy HTML, usually PNG" continuum.

So we're left with only two sane choices:

• Always render PNG
• Leave it as TeX (for text browsers)

Text browsers will show the alt text on the images, which is... the TeX code. So even this isn't actually needed for its stated purpose. (Hi Jidanni! :) lynx should show the tex source when using the PNG mode.)

The feedback I've received so far indicates that the 'leave as tex' option is mostly used in concert with gadgets or user scripts to run MathJax rendering.

But the immediate fix of removing those extra unwanted options seems like it can be an easy win to reduce complexity and inconsistency in the math rendering behavior.

### Supplementary possibilities

Integrating MathJax-style rendering automatically in supported browsers could be useful, and might eliminate the need to keep the 'leave it as tex' option.

### Not covered

Full core integration of alternate rendering technologies (e.g. replacing Math + texvc with Wikitex etc.) is not considered at this time.

But collecting specific information to make a future change decision is useful!

• texvc currently cannot send baseline information which is necessary to properly position the image relative to text
• -> future image rendering should either improve texvc or replace it with a tool that already does this
• blahtex PNG rendering apparently has this
• blahtex MathML rendering should also be nicer
• -> reconsider blahtex in more detail at some point!
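To illustrate why baseline information matters: if a renderer reported, alongside each PNG, how many pixels of the image sit below the text baseline (its depth), the HTML output could shift the image down by that amount. The following is a minimal sketch under that assumption. The function `img_tag_with_baseline` and the `depth_px` value are hypothetical names for illustration; neither is part of texvc or blahtex.

```python
from html import escape


def img_tag_with_baseline(src: str, tex: str, depth_px: int) -> str:
    """Build an <img> tag shifted so the formula's baseline meets the text baseline.

    depth_px is assumed to be reported by the renderer (hypothetical);
    texvc does not currently provide it.
    """
    # By default the bottom edge of an inline image sits on the text baseline.
    # A negative vertical-align lowers the image by the formula's depth.
    style = f"vertical-align: {-depth_px}px;"
    return (
        f'<img src="{escape(src, quote=True)}" '
        f'alt="{escape(tex, quote=True)}" style="{style}">'
    )
```

The TeX source stays in the `alt` attribute, so text browsers and user scripts still see it.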

### Implementation

Key:

1. remove the following options from math preferences (done in r104498):
• 'HTML if very simple or else PNG'
• 'HTML if possible or else PNG'
• 'Recommended for modern browsers'
• 'MathML if possible (experimental)'
2. Have all of those options, where already present, fall back to the 'Always render PNG' option.
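The fallback in step 2 could be sketched roughly as follows. This is an illustration only, not the code committed in r104498: the option labels are taken from this page, while the names `REMOVED_OPTIONS` and `effective_math_mode`, and the choice to treat unknown values like the PNG default, are assumptions.

```python
ALWAYS_PNG = "Always render PNG"
LEAVE_AS_TEX = "Leave it as TeX (for text browsers)"

# Options slated for removal; labels as listed in the preferences dialog.
REMOVED_OPTIONS = {
    "HTML if very simple or else PNG",
    "HTML if possible or else PNG",
    "Recommended for modern browsers",
    "MathML if possible (experimental)",
}


def effective_math_mode(stored: str) -> str:
    """Map a stored user preference to the mode that is actually used."""
    if stored in REMOVED_OPTIONS:
        # Users who had picked a removed option silently get PNG.
        return ALWAYS_PNG
    if stored == LEAVE_AS_TEX:
        return LEAVE_AS_TEX
    # Assumption: anything unrecognised behaves like the PNG mode.
    return ALWAYS_PNG
```

Resolving the mode at read time, rather than rewriting stored preferences, would mean no per-user migration is needed.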

Possible secondary:

1. remove the math rendering preferences entirely, and always send the PNG image
2. create a common gadget, extension (Extension:MathJax?), or built-in feature to enable MathJax or similar rendering to replace the image alt text, letting people transition to that mode (started on r104521)

## say what is 'Recommended for modern browsers'

I don't care what you do, but remember to say

BLA_BLA (Recommended for modern browsers)

not just

Recommended for modern browsers

else nobody can figure out what kind of medicine you propose to slip in their drink. Yes, one line should be suffixed "(default)" or "(recommended)". OK, you can say

let mother decide

or

site default

or something, but just don't say

Recommended for modern browsers

intermixed with nouns on the other lines.

This post was posted by He7d3r, but signed as Jidanni.

Yeah we're just going to kill that option... nobody knows what it means. ;)

This post was posted by He7d3r, but signed as Brion VIBBER.

## iTex

In the current discussion on the Math 2.0 forum, the favored tool for safely converting LaTeX to MathML is iTex.

This post was posted by Lee Worden, but signed as Wonder.

## 'HTML if simple' should stay

And it should stay as default, if only for the sake of inline formulae. The HTML is not ugly; in fact, it is pretty straightforward. With proper CSS, as on enwiki, it is even rendered pretty decently. It is only the 'HTML if possible' that is ugly (using tables without a class).

## SVG rendering

I can't tell if the other discussion section is trying to say the same thing or not, but simply replacing the PNG renderer with an SVG renderer would be a good thing. As noted elsewhere, PNG is rather ugly in some (many?) cases.

## Short-term best fix?

Based on feedback I've seen so far, it seems like the best short-term fixes will be along these lines:

• modify texvc to provide baseline information to fix PNG positioning
• this should reduce the worst problems with PNG math fitting poorly with inline text
• disable the old partial HTML & MathML modes
• keep the 'Leave as TeX' mode to be used with existing user scripts for MathJax

with medium- and long-term:

• give blahtex another look (PNG and proper MathML output to replace texvc?)
• see about cleaner integration of MathJax, with automatic fallback to PNG
• track down a few more bugs with MathJax rendering before making it default

Anything else that should modify those priorities?

Support. Once we have a 95% solution on the PNG baselines, I think we can scrap the partial HTML and MathML modes.

Finally got back to working on this topic... I've removed the extra options (won't be live until 1.19 deploys, current target late January or February). Baseline adjustment isn't in yet, but is in the works. Also started playing with MathJax, will write up some more notes as its own RFC in a bit.

Sounds good to me.

OT: Would it be possible to upload the MathJax web-fonts to the Wikimedia servers somewhere? Currently I am requesting users to point their web-font configuration to the MathJax CDN, but that is not intended to deal with high volumes (currently no problem, but possibly in the future... not that web-fonts would result in high traffic anyway.)

We should be able to get the whole library running from our optimized static-file server bits.wikimedia.org; once enabled by default it may well be highish traffic. :)

I'd rather not mess with it until we've got MathJax integrated in with Extension:Math as a progressive enhancement though; it'll be easier and more reliable to get it deployed if it's bundled together there.

Support.

As a short-term aim I'd also see if anything could be done about font-size matching with the texvc PNG.

MathJax can also produce MathML, not sure how the quality of this compares to the BlahTex output.

MathJax essentially uses MathML for its internal representation, so output is pretty much only dependent on the support of MathML by the browser. Even though Firefox has the most complete support of MathML, it is still lacking in some regards.

## LaTeXML

LaTeXML may also be a possibility: it produces good PNG and MathML, is being actively developed, and can run as a daemon to convert LaTeX to rendered math efficiently. According to the author, it does not currently produce baseline information but could be modified to do so.

This post was posted by Lee Worden, but signed as Wonder.

## BlahTeX with SVG rendering or MathJax

Both BlahTeX with SVG rendering and MathJax are sensible options. The former would imply server-side rendering, and would/could resolve many current issues such as pixelated fonts, missing baseline settings, and font scaling with user/browser preferences. It is supposed to have better HTML/text and MathML output capability than texvc. However, additional intelligence in the server-side processing would be needed to tell apart display math from inline math tags. Moreover, automatic adaptation to the surrounding text size in a HTML document (e.g., in References, which are often in smaller font) may not be feasible. MathJax relies on JavaScript-based client-side rendering, would also resolve these issues, and additionally supports automatic text size adaptation and copy&paste. It has wide browser support and is actively developed.

Neither option would be fully compatible with the current texvc rendering, because texvc is quite loose in its parsing of TeX and accepts constructs that are invalid in TeX or (most) (AMS-)(La)TeX document classes. Furthermore, texvc implements a set of non-TeX symbols (including typos). However, with some additional coding effort for backwards compatibility with texvc and some user effort it should be doable to get it right in the end (as done in part in my mathJax user script front-end).

When we first started thinking about BlahTex many years ago, we did have a major effort by wiki-project mathematics to clean up the tex input and fix inconsistencies which didn't render correctly in BlahTex. Things like having a raw % sign in math tags, which is illegal LaTeX but which texvc allows. w:User:Pfafrich/Blahtex en.wikipedia fixup documents some of this effort. Members of the project were very good at helping with this cleanup.
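A cleanup pass of the kind described could start by flagging formulas that contain an unescaped % sign. The sketch below is only an illustration of that one check, not the tooling the project actually used. It assumes plain `<math>` tags without attributes and does not handle a `%` that follows a double backslash; the function name is made up for this example.

```python
import re

# Contents of plain <math>...</math> tags (attributes are not handled).
MATH_TAG = re.compile(r"<math>(.*?)</math>", re.DOTALL)
# A percent sign that is not immediately preceded by a backslash.
RAW_PERCENT = re.compile(r"(?<!\\)%")


def formulas_with_raw_percent(wikitext: str) -> list[str]:
    """Return the bodies of <math> tags that contain an unescaped % sign."""
    return [
        body
        for body in MATH_TAG.findall(wikitext)
        if RAW_PERCENT.search(body)
    ]
```

Run over a page's wikitext, this yields the formulas an editor would need to look at by hand.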

Thanks, that link is quite useful for tracking down some incompatibilities between MathJax and texvc.

## Leave as TeX

Are there other legit uses for the 'Leave as TeX' option? The TeX source is still available as the alt text on the <img> tag, so custom scripts manipulating it can use that.

This post was posted by He7d3r, but signed as Brion VIBBER.
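As a rough illustration of how a script could recover the TeX source from the PNG mode's output, the sketch below collects the alt text of math images from rendered HTML. It assumes the images carry a CSS class named `tex`; that class name, `MathAltCollector`, and `tex_sources` are assumptions made for this example.

```python
from html.parser import HTMLParser


class MathAltCollector(HTMLParser):
    """Collect the alt text of <img> tags that carry a given CSS class."""

    def __init__(self, css_class: str = "tex") -> None:
        super().__init__()
        self.css_class = css_class  # assumed class name for math images
        self.sources: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if self.css_class in classes and attributes.get("alt"):
            # The parser has already decoded entities such as &lt;.
            self.sources.append(attributes["alt"])


def tex_sources(html_text: str) -> list[str]:
    """Return the TeX source of every math image found in html_text."""
    collector = MathAltCollector()
    collector.feed(html_text)
    collector.close()
    return collector.sources
```

A client-side script would do the equivalent in the browser; the point is only that the source is recoverable from the image markup.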

This assumes that (1) HTML rendering is dropped (e.g., no more "HTML if possible"), and (2) it is considered acceptable to download images just to replace them shortly after, e.g., by MathJax output.

This option shouldn't be removed, since it has a significant and well-defined functionality, which doesn't hurt anyone and is useful to a few people.

When I replied about it on wikitech-l, I knew one university professor who used it. Since then I found one more such person.

In fact, you can just check how many people actually use it (in each of the 800+ projects). If it's more than zero, keep it.

## Default setting

IMHO, the default setting should encourage authors to use the <math>-tags whenever they typeset math: <math>2^x</math> is a standardized way to typeset math (as opposed to HTML where you could write the same as 2<sup>x</sup> or using a styled <span> tag etc.) and thus easily machine-readable, i.e. can be automatically adapted to a new rendering method, if ever implemented. Even now, many authors avoid using the <math> tag even for simple formulas due to the previously stated $\text{ obvious problems with inline PNG rendering}\,$. Making PNG rendering the default setting would make things worse. I'd suggest you either resolve the baseline/size problems prior to making this change or boost a promising alternative such as mathJax.

This post was posted by He7d3r, but signed as Pberndt.

Making PNG rendering the default setting wouldn't make things worse as HTML formatted math would stay as such. A main problem with the "HTML if possible" options is that math tag HTML rendering is horribly broken.

The change would encourage authors to favor manual HTML-formatting over <math> tags for inline formulas. From a semantics point of view, that's worse.

Indeed.

From a semantic point of view, the current part-HTML-part-PNG solutions are all "worse". ;) In fact, the MOS for math articles on the English Wikipedia does encourage HTML for inline math because of the issues with PNG rendering. Nageh 19:02, 22 July 2011 (UTC)

IMO semantics at source code level is more important than semantics of the HTML code after rendering. It is easy to fix flawed rendering of TeX-Code by replacing the current renderer with a better one, once one is available. But it might turn out to be pretty hard to automatically recover the semantics (in mathematical means) from HTML code - which could become necessary if someone wants to port Wikipedia to a different platform (i.e. non-browser), improve the PDF renderer, etc.

I understand what you mean, and I agree. Obviously, there is a trade-off between nicer HTML rendering and future-proof TeX code. But note that we've got the {{math}} template, which can retain (at least) part of the semantics you desire. Of course, this would assume its consistent use... which is not the case. :/

2<sup>x</sup> is not how you'd write it in html. You would write 2<sup>''x''</sup>, with the x italicized. See http://en.wikipedia.org/wiki/Wikipedia:MOSMATH. Michael Hardy 18:48, 24 July 2011 (UTC)

## mathJax

I have now notified the mathJax software developers of the existence of this present page and the one at this URL. I'd really like to see mathJax brought up to the state in which it can make sense to just force everyone who reads Wikipedia math articles to use mathJax. This page elicited some support for that. But it's a page for people discussing how to edit, develop, and maintain Wikipedia's math articles, not for software developers who can actually do something about the software. The software developers (both those who read pages like the present one and those who work on mathJax) need to know what the needs are from the point of view of the people who post to those pages. Robert Miner at mathJax.org, who is interested in supporting Wikipedia, now knows about the following pages, and so should everyone here:

This post was posted by He7d3r, but signed as Michael Hardy.

On 25 June 2011, David Eppstein wrote in this discussion:

I would very much like to see mathjax become standard for Wikipedia math formatting, so that no special user-preference tweaking is required; it works well on the other sites I've used that use it (e.g. mathoverflow and mathscinet) and looks a lot better a lot more consistently than the alternatives.

It appears to me that would solve the problems everyone's been griping and arguing about on that discussion page since 2003, provided some bugs can be fixed. Bugs are listed at w:en:User_talk:Nageh/mathJax. Those need to get fixed by software people at mathJax.org, who are now aware of that page and are interested in Wikipedia's adopting David Eppstein's suggestion.

The other thing that would need to get done would be done by those who edit the software that Wikipedia uses.

This post was posted by He7d3r, but signed as Michael Hardy.

Just to make it clear: The mathJax user script currently relies on the "Leave it as TeX" rendering option. It would be easy to change this so the alt tag of the PNG output (setting "Always render PNG") would be processed... but it seemed a bit pointless to show images for fractions of a second just before they are replaced by MathJax output. This could be used as a fall-back for users that don't have JavaScript activated though I would suggest a more intelligent solution.

Now I remember: I had to choose "leave it as TeX" and also install a file called vector.js or something like that, which I copied from somewhere. So apparently if the "leave it as TeX" option isn't there, then mathJax won't work for me for now. Michael Hardy 17:08, 22 July 2011 (UTC)

Re "I'd really like to see mathJax brought up to the state in which it can make sense to just force everyone who reads Wikipedia math articles to use mathJax": please don't force people to use a particular technology: that goes against many principles underlying the web. By all means let's have it as a default once the bugs are ironed out, but I don't see a strong reason to remove the PNG and HTML options entirely; there will always be a few oddballs who want those things for one reason or another.

Everyone is "forced" to use certain technologies when they read any Wikipedia page or any web page. Even when there are several options and they can choose one as their preference from a menu, they're "forced" to use only those available in the menu. (But maybe making it the default is the right way to go.) Michael Hardy 23:48, 25 July 2011 (UTC)

Can't we enable MathJax as a "beta" option, disabled by default? Is that a possibility?

This post was posted by Markovnikov~mediawikiwiki, but signed as Markovnikov.

So folks know: while one option is to have MathJax parse tex expressions and display them, it's also possible to produce mathml on the server side and have MathJax display it in the browser, which addresses the problem of browsers that can't display mathml themselves, while avoiding any quirks of MathJax's tex parsing.

This post was posted by Lee Worden, but signed as Wonder.

I would think the bigger quirks are with the MathML rendering as implemented in browsers. Even Firefox's implementation, which is supposedly the most complete implementation among browsers, is missing essential functionality like negative spaces. As such, using MathJax to directly parse from the TeX source is still the better option.

Nageh - parsing TeX to MathML on the server and using MathJax to display the MathML, instead of the browser's MathML implementation, would also address the quirks issue to the same degree as your proposal, and might be more efficient because the MathML could be cached for reuse.

This post was posted by Lee Worden, but signed as Wonder.

I see what you mean. That is certainly a feasible solution as well, and would definitely render faster than typesetting from TeX.
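The caching idea in this thread could look roughly like the sketch below: convert each distinct TeX string to MathML once on the server and reuse the result afterwards. The converter is passed in as a plain function because no specific tool was settled on here; `MathMLCache` and its in-memory dictionary are illustrative assumptions, not an existing MediaWiki component.

```python
import hashlib
from typing import Callable, Dict


class MathMLCache:
    """Cache server-side TeX-to-MathML conversions, keyed by a hash of the TeX."""

    def __init__(self, convert: Callable[[str], str]) -> None:
        self._convert = convert  # hypothetical converter, supplied by the caller
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times the converter actually ran

    def get(self, tex: str) -> str:
        key = hashlib.sha256(tex.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._convert(tex)
        return self._store[key]
```

With such a cache, a formula that appears on many pages is parsed once, and the client only has to display the stored MathML.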