Requests for comment/Graph

From mediawiki.org
Request for comment (RFC)
Graph
Component General
Creation date
Author(s) Yurik (WMF)
Document status implemented
See Phabricator.

Implemented, needs update

This proposal should be moved to the Graph/graphoid documentation page.

Background

The Graph extension is already live, but it needs some improvements before it can be deployed at scale on Wikipedia. The current implementation encodes the graph definition (JSON) in an HTML data attribute, <div class="mw-wiki-graph" data-spec="{...}" />, and a client-side JavaScript library parses that large JSON blob and renders the image on a canvas element inside the div using the Vega library. A graph can be defined with an embedded <graph> tag, or it can take up an entire page in the Graph namespace. The graph definition can contain its data inline, or it can reference external data by URL (WMF domains only).

Problems & Proposals

Older browser support

We need to show a PNG image instead of doing the complex client-side rendering whenever the user's browser is not supported or the network is slow. Vega supports rendering PNGs server-side in headless mode. The question is how to get the graph definition to the node.js rendering service: the definition may be too large to include in the image URL as a parameter.

Proposal:

The graph tag extension / content handler would output a <noscript> block containing an <img> tag in addition to the <div> tag, with the PNG URL pointing to the API, a special page, or directly to the node.js service, e.g.

<div class="mw-wiki-graph" data-spec="{...}" id="<graphDataHashId>" />
<noscript>
  <img src="//api.wikimedia.org/v1/<site>/pages/<title>/graph/<revisionId>/<graphDataHashId>.png" />
</noscript>

graphDataHashId is a SHA-1 hash of the graph definition data after template expansion. If the client falls back to the <noscript> content, either because there is no JS support or because the browser is too old, the <canvas> tag will not be added to the mw-wiki-graph <div>, and the <noscript> element will be replaced by its contents so that it becomes a regular <img> tag.

If multiple identical graphs are used on a page, the id attribute will be added only to the first <div> element.
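The markup rules above can be sketched as follows. This is only an illustration in Python, not the extension's actual code: the function names, the choice to hash the UTF-8 bytes of the expanded definition text, and the percent-encoding of the title are all assumptions made for the example.

```python
import hashlib
import html
from urllib.parse import quote


def graph_hash_id(expanded_spec: str) -> str:
    """SHA-1 hex digest of the graph definition after template expansion.

    Assumption: the hash is taken over the UTF-8 bytes of the expanded text.
    """
    return hashlib.sha1(expanded_spec.encode("utf-8")).hexdigest()


def render_graph_html(expanded_spec, site, title, revision_id, seen_ids):
    """Build the <div> plus <noscript> fallback markup for one graph.

    `seen_ids` is a set shared across all graphs on the page, so that only
    the first of several identical graphs receives the id attribute.
    """
    hash_id = graph_hash_id(expanded_spec)
    id_attr = "" if hash_id in seen_ids else f' id="{hash_id}"'
    seen_ids.add(hash_id)

    # Escape the JSON so it is safe inside a double-quoted HTML attribute.
    data_spec = html.escape(expanded_spec, quote=True)

    # Hypothetical URL layout, copied from the example in this proposal.
    png_url = (
        f"//api.wikimedia.org/v1/{site}/pages/{quote(title, safe='')}"
        f"/graph/{revision_id}/{hash_id}.png"
    )
    return (
        f'<div class="mw-wiki-graph" data-spec="{data_spec}"{id_attr} />\n'
        f"<noscript>\n"
        f'  <img src="{png_url}" />\n'
        f"</noscript>"
    )
```

Calling render_graph_html twice with the same definition and the same seen_ids set yields an id attribute on the first <div> only, while both <noscript> blocks point at the same PNG URL.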

The backend image rendering service will need to retrieve the same HTML page again and parse its DOM to get the data-spec attribute from the <div> element with the matching ID.

At first, the page will be retrieved via http://<site>/w/index.php?title=<title>&oldid=<revisionId>, but later we may use Gabriel's data storage.
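A rough sketch of the lookup the rendering service would perform is shown below. The real service is expected to run on node.js; Python is used here only to illustrate the logic, and the names page_url, GraphSpecFinder, and extract_graph_spec are hypothetical. The sketch takes the page HTML as a string and leaves the HTTP fetch out.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def page_url(site: str, title: str, revision_id: int) -> str:
    """Build the index.php URL for a specific saved revision."""
    query = urlencode({"title": title, "oldid": revision_id})
    return f"http://{site}/w/index.php?{query}"


class GraphSpecFinder(HTMLParser):
    """Collect data-spec from the mw-wiki-graph <div> with a given id."""

    def __init__(self, graph_id: str):
        super().__init__(convert_charrefs=True)
        self.graph_id = graph_id
        self.spec = None

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.spec is not None:
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if attributes.get("id") == self.graph_id and "mw-wiki-graph" in classes:
            # HTMLParser has already unescaped entities such as &quot;.
            self.spec = attributes.get("data-spec")


def extract_graph_spec(page_html: str, graph_id: str):
    """Return the graph definition for graph_id, or None if it is absent."""
    finder = GraphSpecFinder(graph_id)
    finder.feed(page_html)
    finder.close()
    return finder.spec
```

If the function returns None, the requested hash does not match any graph in that revision, and the service would presumably respond with an error rather than render anything.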

Limitations:

  • This approach will only work on saved pages, not during "preview". Preview mode will only work with client-side JavaScript enabled.
  • We will always be using the &oldid=<revisionId> parameter, which fragments the cache.

Template Expansion

To be truly useful, Graph must support template expansion, e.g. a world map template could take a list of country codes to highlight. On the other hand, JSON itself can contain wiki-like elements, e.g. [[123]] (a numeric element in a list of lists). Those elements are perfectly legal in JSON and should not be interpreted as wiki markup.

The simplest way around this would be to skip wiki markup expansion and instead perform our own expansion with a limited syntax. I propose that the graph extension manually expand anything enclosed in triple curly braces, e.g. {{{1}}} or {{{string}}}, since that is clearly invalid JSON, and ignore all other valid wiki markup. If we can get the parser to support this mode, we rely on the parser; otherwise we do a regex search/replace, without any expression or other support.

Note that in many cases JSON will be invalid when saving, and will become valid only after template expansion.
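The regex fallback could look roughly like the sketch below. It is a minimal illustration, not the extension's code: the name expand_params, the decision to leave unknown parameters untouched, and the plain string substitution (no quoting or escaping of values) are assumptions.

```python
import json
import re

# Matches {{{name}}} where name contains no curly braces.
PARAM_PATTERN = re.compile(r"\{\{\{([^{}]+)\}\}\}")


def expand_params(text: str, params: dict) -> str:
    """Replace {{{name}}} placeholders; leave all other markup alone."""

    def substitute(match):
        name = match.group(1).strip()
        if name in params:
            return str(params[name])
        # Assumption: unknown parameters are left as-is.
        return match.group(0)

    return PARAM_PATTERN.sub(substitute, text)


template = '{"width": {{{1}}}, "data": [[123]], "name": "{{{title}}}"}'
expanded = expand_params(template, {"1": 400, "title": "Population"})
spec = json.loads(expanded)  # valid JSON only after expansion
```

Here the [[123]] list of lists passes through unchanged, and the template text itself is not valid JSON until the placeholders have been substituted.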