Requests for comment/HTML templating library/Knockoff - Tassembly

During the discussion at the Architecture Summit, DOM-based templating received a good amount of support for its security features, including automatic sanitization of attribute values (href, src, style) to avoid XSS issues. In particular, several participants who had used KnockoutJS in previous projects spoke positively of it as a client-side solution. KnockoutJS is a purely client-side JS templating library that reactively reflects updates in a model. It is relatively lightweight (16k gzipped) and fairly extensible.

The question that we tried to answer after the summit was what it would take to provide an efficient server-side implementation of KnockoutJS in both JS and PHP. To establish a baseline, we ran KnockoutJS natively on node.js using a pure-JS DOM implementation (JSDOM). As expected, performance was not great. Heavy use of the pure-JS DOM and unnecessary reactivity resulted in performance close to that of PHP templating libraries.

As a next step, we designed a very simple JSON-based intermediate representation that captures the basics of templating while still supporting security properties like balancing of tags and context-sensitive attribute escaping:
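As a rough illustration, here is a minimal sketch of what such an IR and its evaluator could look like. The node shape (plain strings for static markup, `[name, argument]` pairs for control nodes) and the control names `text`, `attr` and `foreach` are assumptions made for this sketch, not the exact format used by the prototype.

```javascript
// Hypothetical IR: plain strings are emitted verbatim, [name, arg] pairs are
// control nodes. The names 'text', 'attr' and 'foreach' are assumptions.
const ir = [
  '<ul', ['attr', { class: 'listClass' }], '>',
  ['foreach', { data: 'items', tpl: ['<li>', ['text', 'name'], '</li>'] }],
  '</ul>',
];

// Escape a value for an HTML element-content context.
function escapeText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Escape a value for a double-quoted attribute value.
function escapeAttr(value) {
  return escapeText(value).replace(/"/g, '&quot;');
}

// Evaluate an IR array against a model object and return an HTML string.
function render(nodes, model) {
  let out = '';
  for (const node of nodes) {
    if (typeof node === 'string') {
      out += node; // static markup, already balanced by the compiler
      continue;
    }
    const [name, arg] = node;
    if (name === 'text') {
      out += escapeText(model[arg] ?? '');
    } else if (name === 'attr') {
      for (const [attrName, key] of Object.entries(arg)) {
        if (model[key] == null) continue; // skip attributes without a value
        out += ` ${attrName}="${escapeAttr(model[key])}"`;
      }
    } else if (name === 'foreach') {
      for (const item of model[arg.data] || []) {
        out += render(arg.tpl, item);
      }
    } else {
      throw new Error(`Unknown control node: ${name}`);
    }
  }
  return out;
}

const html = render(ir, {
  listClass: 'people "vip"',
  items: [{ name: 'Ada' }, { name: '<script>' }],
});
console.log(html);
// prints: <ul class="people &quot;vip&quot;"><li>Ada</li><li>&lt;script&gt;</li></ul>
```

Because the static strings come from a DOM-based compiler, the evaluator never has to parse HTML at render time; it only concatenates strings and escapes the dynamic values.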

The basic JavaScript code that evaluates this IR to an HTML string is only about 260 lines (1.4 kB gzipped), and it is the fastest templating library in our tests. At the same time, proper nesting and attribute sanitization are statically guaranteed by the DOM-based front-end compiler and by the runtime library (attribute escaping). Much of the performance comes from doing the heavy lifting of DOM parsing only once and persisting the result as a simple JSON IR.

We also implemented a compiler from KnockoutJS templates to this JSON intermediate format in 258 lines. Additionally, C. Scott implemented an alternate compiler from Spacebars (a Handlebars-inspired DOM templating library in Meteor) to the same IR.

This exercise answered several questions:


 * we can have a very efficient yet secure template library,
 * the implementation does not require a lot of code and is easy to port to PHP, and
 * we can base this on an established client-side templating system (KnockoutJS) that is immediately usable for reactive applications on the client.

Template Syntax Examples
A short tour through the KnockoutJS documentation:

The most basic of things
We will auto-escape variables based on the context in which they are used. In this case we would escape someVariable for an href context and anotherVariable for an HTML context.
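The following sketch illustrates what context-sensitive escaping could mean for a template such as `<a data-bind="attr: { href: someVariable }, text: anotherVariable"></a>`. The helper names (`sanitizeHref`, `escapeHtml`, `renderLink`) and the list of allowed URL schemes are assumptions for illustration; this is not a complete sanitizer and not the prototype's actual code.

```javascript
// Assumed list of URL schemes that are allowed in an href context.
const SAFE_PROTOCOLS = ['http:', 'https:', 'mailto:'];

// HTML context: escape markup-significant characters.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// href context: reject URLs with an unexpected scheme, then attribute-escape.
function sanitizeHref(value) {
  const url = String(value).trim();
  // Browsers ignore whitespace and control characters inside the scheme,
  // so strip them before looking for one.
  const compact = url.replace(/[\u0000-\u0020]+/g, '');
  const scheme = /^([a-z][a-z0-9+.-]*:)/i.exec(compact);
  if (scheme && !SAFE_PROTOCOLS.includes(scheme[1].toLowerCase())) {
    return '#'; // neutral placeholder instead of the unsafe URL
  }
  return escapeHtml(url);
}

// Render the link with each variable escaped for its own context.
function renderLink(model) {
  return `<a href="${sanitizeHref(model.someVariable)}">` +
    `${escapeHtml(model.anotherVariable)}</a>`;
}

console.log(renderLink({
  someVariable: 'javascript:alert(1)',
  anotherVariable: '<b>hi</b>',
}));
// prints: <a href="#">&lt;b&gt;hi&lt;/b&gt;</a>
```

The point of the sketch is that the same value would be treated differently depending on where it lands: a `javascript:` URL is harmless as text content but must be rejected in an href.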

Template inclusions
Although not shown, you can also have anonymous templates that are declared in the template and then referenced by element id.

Conditionals
There is also an if binding, but for server-side rendering we should prefer visible so that the template is not destroyed if the client side does additional dynamic work.

Note: We're thinking about allowing simple expressions, but for now we've explicitly disabled them in our prototypes. Future work :)

Looping
This will render the outer element in any case. To hide it, we can add the visible binding:

Filters
Knockout supports dot notation on variables. We do not wish to support that because it allows arbitrary JavaScript to be executed. Instead, we propose implementing filters that can be defined in the controller. That syntax would look something like:
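The exact filter syntax is not settled by this text. Purely as a hypothetical sketch, assume a pipe-style expression such as `title | upper` and a controller that registers filters as named functions. The pipe syntax, the `filters` object and `evaluateBinding` are all invented for this illustration.

```javascript
// Filters registered by the controller: plain named functions (hypothetical).
const filters = {
  upper: (value) => String(value).toUpperCase(),
  truncate: (value) => String(value).slice(0, 10),
};

// Evaluate 'variable | filterA | filterB' against a model.
// Only a bare variable name is accepted, so dot notation and arbitrary
// JavaScript in the template are rejected rather than executed.
function evaluateBinding(expression, model, availableFilters) {
  const [variable, ...filterNames] = expression
    .split('|')
    .map((part) => part.trim());
  if (!/^[A-Za-z_$][\w$]*$/.test(variable)) {
    throw new Error(`Unsupported expression: ${variable}`);
  }
  // Read own properties only, so inherited members are never exposed.
  let value = Object.prototype.hasOwnProperty.call(model, variable)
    ? model[variable]
    : undefined;
  for (const name of filterNames) {
    if (!Object.prototype.hasOwnProperty.call(availableFilters, name)) {
      throw new Error(`Unknown filter: ${name}`);
    }
    value = availableFilters[name](value);
  }
  return value;
}

console.log(evaluateBinding('title | upper', { title: 'hello' }, filters));
// prints: HELLO
```

With this design the template can only name a filter. All executable logic stays in the controller, where it can be reviewed and ported to PHP alongside the rest of the library.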

Why Knockout.JS and not Handlebars
We have shown that a fast, relatively small, and secure template library is possible, that it allows dynamic client-side behavior going beyond basic Handlebars, and that it can be based on something that is already available and in general use.

Knockout has advantages over dynamic Handlebars derivatives because it is DOM-based:
 * Templates are already valid HTML, so they can be edited in VisualEditor and displayed on the page without JavaScript.
 * Compilation to the intermediate string format is more efficient because the tag structure is already there and can be guaranteed to be balanced. Additionally, we lose no metadata in the process.

Longer-term architecture
 * Knockout file: compiled to the JSON intermediate representation.
 * JSON intermediate: either rendered immediately or delivered via ResourceLoader, and combined with the data model on render.
 * Render (server or client side): produces raw HTML or Knockout-compatible HTML, which can be injected directly into the page via innerHTML.
 * Knockout.JS: dynamically controls the client-side page.

Server-side pre-rendering
KnockoutJS's attribute syntax combined with server- and client-side implementations makes it relatively straightforward to pre-expand templates on the server while maintaining the ability to dynamically update the rendering on the client. This can improve initial page load performance and lets us support users without JavaScript.
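As a minimal sketch of the idea, the function below pre-expands a single element on the server while leaving its `data-bind` attribute in place for the client. The function name `preRender` and its argument shape are assumptions for illustration. The handling of `visible` mirrors what Knockout's visible binding does on the client, which is to set `display: none` when the value is falsy.

```javascript
// Escape a value for HTML content or a double-quoted attribute value.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Pre-render one element. `bindings` maps binding names to model keys,
// e.g. { text: 'userName', visible: 'loggedIn' } (hypothetical shape).
function preRender(tag, bindings, model) {
  // Rebuild the data-bind attribute so the client can take over later.
  const bindAttr = Object.entries(bindings)
    .map(([name, key]) => `${name}: ${key}`)
    .join(', ');
  // visible: hide the element up front instead of dropping it.
  let style = '';
  if ('visible' in bindings && !model[bindings.visible]) {
    style = ' style="display: none;"';
  }
  // text: fill in the escaped content on the server.
  const body = 'text' in bindings ? escapeHtml(model[bindings.text] ?? '') : '';
  return `<${tag} data-bind="${escapeHtml(bindAttr)}"${style}>${body}</${tag}>`;
}

console.log(preRender(
  'span',
  { text: 'userName', visible: 'loggedIn' },
  { userName: 'Ada & co', loggedIn: false },
));
// prints: <span data-bind="text: userName, visible: loggedIn" style="display: none;">Ada &amp; co</span>
```

A user without JavaScript sees the pre-rendered result as is, while a client running KnockoutJS can bind the same markup to a live model and update it in place.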

MediaWiki messages in JSON IR
A lot of our user interface is defined by localized messages. These use wikitext syntax, which is relatively expensive and complex to parse on both the server and the client. With Parsoid we can compile these messages to HTML+RDFa and then turn that into the same JSON IR. With a common low-level representation we can potentially cut down the need for different client-side libraries and speed up message rendering on both the server and the client.

In the very long term some of the experience gained in this area might even become useful for HTML templating in content.

Next Steps

 * (optionally) start using KnockoutJS on the client
 * port the JS implementation to PHP
 * benchmark using both Zend and HHVM
 * hook up existing attribute sanitization code in both PHP & JS
 * add features, including MediaWiki messages and partials
 * template delivery via ResourceLoader
 * implement filters