Requests for comment/HTML templating library/Knockoff - Tassembly

From mediawiki.org

During the discussion at the Architecture summit, DOM-based templating received a good amount of support for its security features, including automatic sanitization of attribute values (href, src, style) to avoid XSS issues. As a simple example, a user-supplied string javascript:alert(cookie) is safe in text content, but unsafe in an href attribute. Purely string-based templating libraries like Mustache, Handlebars or Twig can't support systematic attribute sanitization, as they have no understanding of the HTML structure they are constructing.
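To make the difference between the two contexts concrete, here is a minimal sketch of context-sensitive escaping. The helper names (escapeText, escapeAttribute, sanitizeHref) and the scheme whitelist are assumptions made for illustration; they are not taken from KnockoutJS or any other library mentioned here, and a production sanitizer would need to cover more cases.

```javascript
// Hypothetical helpers for illustration only.

// Escape a value for use as HTML text content.
function escapeText(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Escape a value for use inside a double-quoted attribute.
function escapeAttribute(value) {
    return escapeText(value).replace(/"/g, "&quot;");
}

// Assumed whitelist of URL schemes for this sketch.
const SAFE_SCHEMES = ["http:", "https:", "mailto:"];

// Sanitize a value for an href attribute: relative URLs and whitelisted
// schemes pass through (escaped), everything else becomes an inert "#".
function sanitizeHref(value) {
    const url = String(value).trim();
    // Browsers ignore tabs and newlines inside URLs, so strip control
    // characters and spaces before looking for a scheme.
    const compact = url.replace(/[\u0000-\u0020]/g, "");
    const match = /^([a-z][a-z0-9+.\-]*:)/i.exec(compact);
    if (match && !SAFE_SCHEMES.includes(match[1].toLowerCase())) {
        return "#";
    }
    return escapeAttribute(url);
}

const userInput = "javascript:alert(cookie)";
// Harmless as text content: it is just displayed.
const textHtml = "<p>" + escapeText(userInput) + "</p>";
// Dangerous as a link target: the sanitizer replaces it.
const linkHtml = '<a href="' + sanitizeHref(userInput) + '">link</a>';
```

A string-based library only sees the final concatenation, so it cannot know which of the two helpers applies at a given position; a DOM-based compiler knows whether it is filling text content or an href.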

KnockoutJS was mentioned positively as a client-side solution by several participants who had used it in previous projects. A special feature of KnockoutJS is its support for reactive updates of the view after changes in the model. It is relatively lightweight (16k gzipped) and fairly extensible.[1]

After the summit, we tried to figure out what it would take to provide an efficient server-side implementation of KnockoutJS in both JS and PHP. To establish a baseline, we ran KnockoutJS natively on Node.js using a pure-JS DOM implementation (JSDOM). As expected, performance was not great: heavy use of the pure-JS DOM and unnecessary reactivity resulted in performance close to that of PHP templating libraries.

As a next step, we designed a very simple JSON-based intermediate representation (TAssembly) that captures the basics of templating while still supporting security properties like balancing of tags and context-sensitive attribute escaping:

["<div",["attr",{"id":"id"}],">",
    ["foreach",{
        "data":"links",
        "tpl":["<a",["attr",{"href":"url"}],">",["text","description"],"</a>"]
    }],
"</div>"]

The JavaScript code to evaluate TAssembly to an HTML string is very compact and weighs in at 3.2k gzipped. It is also the fastest templating library in our tests. We also implemented a compiler for KnockoutJS templates to this JSON intermediate format. Proper nesting and attribute sanitization are statically guaranteed by the DOM-based front-end compiler and the runtime library (attribute escaping). Much of the performance comes from doing the heavy lifting of DOM parsing only once, and persisting the result as a simple JSON IR.
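As a rough illustration of why evaluating such an IR can be compact, the following sketch renders the example above to an HTML string. It is not the actual TAssembly runtime: the function names are hypothetical, only the three controls shown in the example (attr, text, foreach) are handled, and attribute values receive generic escaping only, whereas a real runtime would also sanitize href, src and style values.

```javascript
// Minimal sketch of an evaluator for the JSON IR; names are hypothetical.

function escapeHtml(value) {
    return (value == null ? "" : String(value))
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

function render(template, model) {
    let out = "";
    for (const part of template) {
        if (typeof part === "string") {
            // Static markup produced by the compiler is copied verbatim.
            out += part;
            continue;
        }
        const [op, arg] = part;
        if (op === "text") {
            out += escapeHtml(model[arg]);
        } else if (op === "attr") {
            // arg maps attribute names to model keys.
            for (const [name, key] of Object.entries(arg)) {
                out += " " + name + '="' + escapeHtml(model[key]) + '"';
            }
        } else if (op === "foreach") {
            // Render the nested template once per item, with the item as model.
            for (const item of model[arg.data] || []) {
                out += render(arg.tpl, item);
            }
        } else {
            throw new Error("Unknown control: " + op);
        }
    }
    return out;
}

const template = ["<div", ["attr", {"id": "id"}], ">",
    ["foreach", {
        "data": "links",
        "tpl": ["<a", ["attr", {"href": "url"}], ">", ["text", "description"], "</a>"]
    }],
"</div>"];

const html = render(template, {
    id: "nav",
    links: [{ url: "/wiki/Help", description: "Help & FAQ" }]
});
// html is '<div id="nav"><a href="/wiki/Help">Help &amp; FAQ</a></div>'
```

Because the static strings already contain balanced tags, the evaluator never needs to parse HTML at render time; it only concatenates strings and escapes the dynamic values.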

This exercise answered several questions:

  1. we can have a very efficient yet secure template library,
  2. the implementation does not require a lot of code and is easy to port to PHP, and
  3. we can share templates with an established client-side templating system (KnockoutJS) that provides rich reactive functionality where desired.

Template Syntax Examples

A short tour through the KnockOff documentation, largely based on the KnockoutJS documentation:

The most basic of things

<a data-bind="attr: {href: someVariable}, text: anotherVariable"> </a>

We will support automatic escaping of variables based on the context in which they are used. In this case, someVariable would be escaped for an href context and anotherVariable for an HTML text context.

Template inclusions

<div data-bind="template: { name: 'resource.loader.module.FancyTemplateName', data: someVariable }"></div>

Although not shown, you can also have anonymous templates that are declared in the template and then referenced by element id.

Using MediaWiki's i18n Framework

<div data-bind="message: { name: 'message-name', data: someVariable }"></div>

Conditionals

<div data-bind="visible: shouldShowMessage">
    You will see this message only when "shouldShowMessage" holds a true value.
</div>

There is also an if binding, but for server-side rendering we should prefer visible, so that the template content is not destroyed if the client side needs to do additional dynamic work.

<div data-bind="if: displayMessage">Here is a message. Astonishing.</div>
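The difference between the two bindings on the server can be sketched as follows. This assumes that the server keeps the data-bind attribute in its output and that a false visible condition is expressed as an inline display: none style, as KnockoutJS does on the client; the function names are hypothetical.

```javascript
// Hypothetical server-side handling of the two bindings.

// visible: the content is always emitted; a false condition only hides it.
function renderVisible(expression, condition, innerHtml) {
    const style = condition ? "" : ' style="display: none"';
    return '<div data-bind="visible: ' + expression + '"' + style + ">" +
        innerHtml + "</div>";
}

// if: a false condition drops the content from the output entirely.
function renderIf(expression, condition, innerHtml) {
    return '<div data-bind="if: ' + expression + '">' +
        (condition ? innerHtml : "") + "</div>";
}

const hidden = renderVisible("shouldShowMessage", false, "You will see this message.");
const removed = renderIf("displayMessage", false, "Here is a message.");
```

With visible, the client can later reveal the element because its content is still in the page; with if, the content would have to be recovered from somewhere else.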

Note: We're thinking about allowing simple expressions involving < > + - || &&, but for now we've explicitly disabled them in our prototypes. Future work :)

Looping

<ul data-bind="foreach: itemArray">
    <li data-bind="text: $data"> </li>
</ul>

This renders the outer <ul> even when there are no items to show. To hide it in that case, we can add the visible binding:

<ul data-bind="visible: people, foreach: people">
    <li>
        <span data-bind="text: $index"></span>:
        <a data-bind="attr: { href: url }, text: name">Linked name</a>
    </li>
</ul>
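The foreach examples above can be read as a plain loop in which each item is exposed as $data and its position as $index. The following sketch, with hypothetical function names, shows this and also why the outer <ul> appears even for an empty list.

```javascript
// Illustration only; not the Knockoff or TAssembly implementation.

function escapeHtml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

function renderItems(itemArray) {
    // Each item is available as $data, its position as $index.
    const items = (itemArray || []).map(function ($data, $index) {
        return "<li>" + $index + ": " + escapeHtml($data) + "</li>";
    });
    // The outer <ul> is emitted even when there are no items.
    return "<ul>" + items.join("") + "</ul>";
}

const listHtml = renderItems(["a < b", "c"]);
const emptyHtml = renderItems([]);
```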

Context renaming

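The with binding narrows the scope to a member of the current model, so that the members of that object can be referenced directly.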

<p data-bind="with: coords">
    Latitude: <span data-bind="text: latitude"> </span>,
    Longitude: <span data-bind="text: longitude"> </span>
</p>
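In code, narrowing the scope amounts to evaluating the body against a member of the model instead of the model itself. The sketch below uses hypothetical names and assumes, following KnockoutJS, that nothing is rendered when the member is null or undefined.

```javascript
// Illustration only: evaluate the body against model[key] rather than model.
function renderWith(model, key, renderBody) {
    const inner = model[key];
    // Assumption (mirrors KnockoutJS): skip the body for null/undefined.
    return inner == null ? "" : renderBody(inner);
}

const model = { coords: { latitude: 51.5, longitude: -0.1 } };

// Inside the body, latitude and longitude are looked up on coords directly.
const coordsText = renderWith(model, "coords", function (scope) {
    return "Latitude: " + scope.latitude + ", Longitude: " + scope.longitude;
});
```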

Functions

The global object $ is available in all scopes, and can be used to register functions working on data. This includes data massaging, predicates, and dynamic data access.

<h1 data-bind="text: $.toUpper( $.someOtherFunction( { someParam: someVar } ) )"> </h1>

See the KnockOff and TAssembly documentation for the details.
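As a sketch of how functions registered on the global object might be used, the following treats $ as a plain object holding helper functions. How registration actually works is not specified here, so both function bodies are invented for illustration.

```javascript
// Illustration only: a plain object stands in for the global $ object.
const $ = {};

// Data massaging helper.
$.toUpper = function (value) {
    return String(value).toUpperCase();
};

// Hypothetical function taking a parameter object, as in the example above.
$.someOtherFunction = function (params) {
    return "hello " + params.someParam;
};

// The expression from the data-bind attribute, evaluated against a scope.
const scope = { someVar: "world" };
const headingText = $.toUpper($.someOtherFunction({ someParam: scope.someVar }));
// headingText is "HELLO WORLD"
```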

Implementations

Performance

All times in seconds (best of 50).

Engine                              Test1  Test1b  Test2                           Test3
MediaWiki [PHP 5.3.10]              1.659  1.715   17.116                          7.617
MediaWiki [HHVM]                    0.471  0.513   4.091                           1.981
Twig (string) [PHP 5.3.10]          1.577  1.459   6.200                           3.709
Twig (string) [HHVM]                0.476  0.509   1.094                           0.657
Twig (file, no cache) [PHP 5.3.10]  1.715  1.739*  6.055*                          3.716*
Twig (file, no cache) [HHVM]        0.787  0.772   1.070                           0.834
Twig (file, cached) [PHP 5.3.10]    1.664  1.759   6.029                           3.656
Twig (file, cached) [HHVM]          0.422  0.447   0.736                           0.491
Mustache [PHP 5.3.10]               1.821  1.889   11.455 / 24.120 (using lambda)  3.720
Mustache [HHVM]                     0.547  0.496   1.282 / 3.841 (using lambda)    0.664
Lightncandy Handlebars [PHP]        0.542  0.606   7.183 / 135.342 (using lambda)  3.923
Lightncandy Handlebars [HHVM]       0.184  0.200   0.847 / 1.420 (using lambda)    0.540
Mustache [Node.js]                  0.339  0.393   2.248 / 3.906 (using lambda)    0.794
Handlebars [Node.js]                0.119  0.160   0.725 / 1.245 (using lambda)    0.185
Knockoff / TAssembly [Node.js]      0.047  0.070   0.591 / 0.451 (using lambda)    0.181
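The benchmark setup itself is not described here. Assuming that "best of 50" means the minimum wall-clock time over 50 runs of the same workload, a timing helper for the Node.js rows could look roughly like this (the function name bestOf is hypothetical):

```javascript
// Return the fastest of `runs` executions of fn, in seconds.
// Assumption: "best of N" is the minimum wall-clock time over N runs.
function bestOf(runs, fn) {
    let best = Infinity;
    for (let i = 0; i < runs; i++) {
        const start = process.hrtime.bigint();
        fn();
        // hrtime.bigint() returns nanoseconds; convert to seconds.
        const elapsed = Number(process.hrtime.bigint() - start) / 1e9;
        if (elapsed < best) {
            best = elapsed;
        }
    }
    return best;
}

// Usage: time a stand-in workload that builds a string repeatedly.
const seconds = bestOf(50, function () {
    let out = "";
    for (let i = 0; i < 1000; i++) {
        out += "<li>" + i + "</li>";
    }
    return out;
});
```

Taking the minimum rather than the mean reduces the influence of garbage collection pauses and other background noise on the reported figure.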

Next Steps

KnockoutJS is immediately ready to be used as a reactive templating library on the client.

KnockOff is ready for use as a non-reactive templating library (Node.js / client, PHP).

We have identified the following areas that can be improved for integration in the MediaWiki environment:

  • MediaWiki message integration ($.i18n, $.plural, $.gender)
  • ResourceLoader hooks for loading of partial templates (template)
  • Message bundle creation using static analysis

UI templating opportunities

Server-side pre-rendering

KnockoutJS's attribute syntax combined with server- and client-side implementations makes it relatively straightforward to pre-expand templates on the server while maintaining the ability to dynamically update the rendering on the client. This can improve initial page load performance and lets us support users without JavaScript. [2]

High-level UI templating data flow

+----------+             +--------------+    +-------------+
| Knockout | Compiled to | TAssembly    |    | Data Model  |
| template | ----------> | JSON IR      |    | on Render() |
+----------+             +--------------+    +-------------+
      |                          |                  |
      |                          \/                \/
      | Deliver via            +------------------------+
      | ResourceLoader         | TAssembly render(model)|
      |                        | Server or Client Side  |
      \/                       +------------------------+
    +-------------+                  | Raw HTML   | Knockout Compatible HTML
    | Knockout.JS |                  |            | Can be injected directly to page
    +-------------+                  |            | via innerHTML
      |                              |           \/
      | Dynamically                  |        +-------------+
      | controls                     |        | Knockout.JS |
      |                              |        +-------------+
      |                              |            | Dynamically controls
      |                              \/          \/
      |                            +------------------+
      +--------------------------->| Client Side Page |
                                   +------------------+

Longer-term architecture

MediaWiki messages in TAssembly

A lot of our user interface is defined by localized messages. These use wikitext syntax, which is relatively expensive and complex to parse on both the server and the client. With Parsoid we can compile these messages to HTML+RDFa, and then turn that into the same JSON IR. With a common low-level representation we can potentially cut down the need for different client-side libraries, and speed up message rendering on both the server and the client.

Content templating

The next logical step will be to leverage KnockOff & TAssembly for content templating functionality very similar to our traditional wikitext templates. Likely advantages include better performance, a better separation of data from presentation, and a simplification of our architecture by eliminating the need for wikitext parsing in the normal rendering pipeline. See the separate design sketch for details.

See also

Notes

  1. We considered several other DOM-based libraries, including AngularJS and Distal. Of these, AngularJS is the most interesting alternative for its reactivity and feature set. Its disadvantages are its size (36k gzipped) and implementation complexity, which is why we decided to go with KnockoutJS.
  2. Issue: destructive DOM updates
    Constructs like foreach, and if with a false condition, remove or replace parts of the original template. To still give KnockoutJS access to the original template content, a preprocessing pass can extract a server-side backup of the template from an attribute and pass it to KnockoutJS as an anonymous template.