Requests for comment/HTML templating library/Knockoff - Tassembly

During the discussion at the Architecture summit, DOM-based templating received a good amount of support for its security features, including automatic sanitization of attribute values (href, src, style) to avoid XSS issues. As a simple example, a user-supplied string can be safe in text content but unsafe in an href attribute. Purely string-based templating libraries like Mustache, Handlebars or Twig can't support systematic attribute sanitization, as they have no understanding of the HTML structure they are constructing.

KnockoutJS as a client-side solution was positively mentioned by several participants who had used it in previous projects. A special feature of KnockoutJS is its support for reactive updates of the view after changes in the model. It is relatively lightweight (16 kB gzipped) and fairly extensible.

After the summit, we tried to figure out what it would take to provide an efficient server-side implementation of KnockoutJS in both JS and PHP. To establish a baseline, we ran KnockoutJS natively on node.js using a pure-JS DOM implementation (JSDOM). As expected, performance was not great. Heavy use of the pure-JS DOM and unnecessary reactivity resulted in performance close to that of PHP templating libraries.

As a next step, we designed a very simple JSON-based intermediate representation (IR) that captures the basics of templating while still supporting security properties like balancing of tags and context-sensitive attribute escaping.

The basic JavaScript code to evaluate this IR to an HTML string is only about 260 lines (1.4 kB gzipped), and it is the fastest templating library in our tests. We also implemented a compiler from KnockoutJS templates to this JSON intermediate format in 258 lines. Proper nesting and attribute sanitization are statically guaranteed by the DOM-based front-end compiler and the runtime library (attribute escaping). Much of the performance comes from doing the heavy lifting of DOM parsing only once, and persisting the result as a simple JSON IR.
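The IR itself is not reproduced in this text. As a rough sketch of the idea only, assume a hypothetical IR in which a template is an array of literal HTML strings and small `[opcode, argument]` arrays. The opcode names `text` and `attr`, the `linkIr` template and the function names below are invented for illustration and are not necessarily the actual format:

```javascript
// Hypothetical IR: plain strings are emitted verbatim; arrays are
// [opcode, argument] pairs that are resolved against the model.
const linkIr = [
  '<a',
  ['attr', { title: 'tooltip' }],
  '>',
  ['text', 'label'],
  '</a>'
];

// Escape a value for HTML text content or a double-quoted attribute.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Walk the IR once and concatenate the output; no DOM is needed at render time.
function renderIr(ir, model) {
  let out = '';
  for (const node of ir) {
    if (typeof node === 'string') {
      out += node;
      continue;
    }
    const [opcode, arg] = node;
    if (opcode === 'text') {
      out += escapeHtml(model[arg]);
    } else if (opcode === 'attr') {
      for (const [name, key] of Object.entries(arg)) {
        out += ' ' + name + '="' + escapeHtml(model[key]) + '"';
      }
    } else {
      throw new Error('Unknown opcode: ' + opcode);
    }
  }
  return out;
}

console.log(renderIr(linkIr, { tooltip: 'a "quoted" tip', label: '<b>Home</b>' }));
```

Because the literal strings come from a DOM-based compiler, tags stay balanced by construction, and the runtime only has to escape the dynamic parts.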

This exercise answered several questions:


 * 1) we can have a very efficient yet secure template library,
 * 2) the implementation does not require a lot of code and is easy to port to PHP, and
 * 3) we can base this on an established client-side templating system (KnockoutJS) that is immediately usable for reactive applications on the client.

Template Syntax Examples
A short tour through the KnockoutJS documentation:

The most basic of things
We will support auto escaping of the variables based on what context they're being put in. In this case we would escape someVariable in an href context, and anotherVariable in an html context.
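The original markup for this example is not reproduced here. As a hedged illustration, the comment below shows what such a template could look like using Knockout's `attr` and `text` bindings, and the code sketches how the same string is treated differently per context. The helper names and the very crude href check are assumptions for illustration, not the sanitizer that exists in Parsoid:

```javascript
// Illustrative Knockout-style markup (assumed, not taken from the original page):
// <a data-bind="attr: { href: someVariable }, text: anotherVariable"></a>

// Escape a value for HTML text content or a double-quoted attribute.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Crude stand-in for real href sanitization: reject any URL scheme other
// than http or https. Not suitable for production use.
function sanitizeHref(value) {
  // Drop whitespace and control characters before looking for a scheme.
  const compact = String(value).replace(/[\u0000-\u0020]/g, '');
  const scheme = /^([a-z][a-z0-9+.-]*):/i.exec(compact);
  if (scheme && !['http', 'https'].includes(scheme[1].toLowerCase())) {
    return null;
  }
  return String(value);
}

// Render the link: someVariable goes through the href context,
// anotherVariable through the html (text) context.
function renderLink(model) {
  const href = sanitizeHref(model.someVariable);
  const hrefAttr = href === null ? '' : ' href="' + escapeHtml(href) + '"';
  return '<a' + hrefAttr + '>' + escapeHtml(model.anotherVariable) + '</a>';
}

const hostile = 'javascript:alert(1)';
console.log(renderLink({ someVariable: hostile, anotherVariable: hostile }));
```

In this sketch the hostile string is dropped from the href attribute but passes through unchanged as text content, where it is harmless.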

Template inclusions
Although not shown, you can also have anonymous templates that are declared in the template and then referenced by element id.

Conditionals
There is also an if binding, but for server-side rendering we should prefer visible, so that the template does not get destroyed if the client side will do additional dynamic work.
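To make the difference concrete, here is a minimal sketch of how a server-side renderer might treat the two bindings. The model fields `showMessage` and `message` and both function names are hypothetical:

```javascript
// Escape a value for HTML text content or a double-quoted attribute.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// visible: the element and its content are always emitted; a falsy value
// only hides the element with CSS, so the client can show it again later.
function renderVisible(model) {
  const style = model.showMessage ? '' : ' style="display: none"';
  return '<div data-bind="visible: showMessage"' + style + '>' +
    escapeHtml(model.message) + '</div>';
}

// if: the content is left out entirely when the value is falsy, so the
// pre-rendered page no longer contains the markup the client would need.
function renderIf(model) {
  const inner = model.showMessage ? escapeHtml(model.message) : '';
  return '<div data-bind="if: showMessage">' + inner + '</div>';
}

const hiddenModel = { showMessage: false, message: 'Saved' };
console.log(renderVisible(hiddenModel));
console.log(renderIf(hiddenModel));
```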

Note: We're thinking about allowing simple expressions, but for now we've explicitly disabled them in our prototypes. Future work :)

Looping
This will render the outer element in any case. To hide it, we can add the visible binding:
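The original markup is not reproduced here. A minimal sketch of the behaviour, assuming a `<ul>` with a `foreach` binding over a hypothetical `items` array (the `data-bind` attributes are left out of the output for brevity):

```javascript
// Illustrative Knockout-style markup (assumed):
// <ul data-bind="foreach: items"><li data-bind="text: $data"></li></ul>

// Escape a value for HTML text content or a double-quoted attribute.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render one <li> per item. The outer <ul> is always emitted; when the
// list is empty it is hidden the same way the visible binding would hide it.
function renderList(model) {
  const items = model.items
    .map((item) => '<li>' + escapeHtml(item) + '</li>')
    .join('');
  const style = model.items.length > 0 ? '' : ' style="display: none"';
  return '<ul' + style + '>' + items + '</ul>';
}

console.log(renderList({ items: ['one', 'two'] }));
console.log(renderList({ items: [] }));
```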

Data Model
A data model needs to be supplied to the render function. This is basically just an array of strings or blessed objects. By blessing objects as safeAs, we can explicitly declare values as already escaped for certain contexts.
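The exact shape of a blessed object is not specified here. The sketch below assumes a hypothetical `safeAs(context, value)` wrapper and a renderer that skips escaping only when the blessed context matches the output context. The property names `safeContext` and `value` are invented for illustration:

```javascript
// Hypothetical blessing helper: marks a value as already safe for one context.
function safeAs(context, value) {
  return { safeContext: context, value: String(value) };
}

// Escape a value for HTML text content or a double-quoted attribute.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Emit a model value in the html context. Plain strings are always escaped;
// blessed values are passed through only if they were blessed for 'html'.
function toHtml(value) {
  const blessed = value !== null && typeof value === 'object' &&
    'safeContext' in value;
  if (blessed) {
    return value.safeContext === 'html' ? value.value : escapeHtml(value.value);
  }
  return escapeHtml(value);
}

console.log(toHtml('<b>bold</b>'));
console.log(toHtml(safeAs('html', '<b>bold</b>')));
```

Tying the blessing to a context means that a value declared safe for html is still escaped or sanitized when it ends up in an href.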

Longer-term architecture
 * A Knockout template file is compiled to the JSON intermediate representation.
 * On render, the JSON intermediate representation and the data model are passed to the renderer, which runs on either the server or the client.
 * The renderer produces either raw HTML or Knockout-compatible HTML. Both end up in the client-side page. The Knockout-compatible HTML can be injected directly into the page via innerHTML, where Knockout.JS then dynamically controls it.
 * The Knockout template file can also be delivered via ResourceLoader to Knockout.JS on the client, which dynamically controls the client-side page.

Server-side pre-rendering
KnockoutJS's attribute syntax combined with server- and client-side implementations makes it relatively straightforward to pre-expand templates on the server while maintaining the ability to dynamically update the rendering on the client. This can improve initial page load performance and lets us support users without JavaScript.

MediaWiki messages in JSON IR
A lot of our user interface is defined by localized messages. These use wikitext syntax, which is relatively expensive and complex to parse both on the server and the client. With Parsoid we can compile these messages to HTML+RDFa, and then turn that into the same JSON IR. With a common low-level representation we can potentially cut down the need for different client-side libraries, and speed up the message rendering on both the server and the client.

In the very long term some of the experience gained in this area might even become useful for HTML templating in content.

Next Steps
Knockout.JS is immediately ready to be used as a reactive templating library on the client.

We have identified the following areas that can be improved for integration in the MediaWiki environment:


 * Hook up attribute sanitization (already exists in Parsoid)
 * MediaWiki message integration
 * ResourceLoader hooks for loading of partial templates

Server side work
The server-side implementation (Knockout to IR compiler and IR runtime) is in an early implementation stage, but already usable. These are the next implementation steps to make it production-ready:


 * port the JS implementation to PHP
 * benchmark using both Zend and HHVM
 * hook up existing attribute sanitization code in both PHP & JS
 * add features, including MediaWiki messages and partials
 * template delivery via ResourceLoader

Context renaming
This syntactic sugar is a feature natively supported in Knockout, but not in the current server side implementation.
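The original example is not reproduced here. Presumably this refers to Knockout's `as` option, as in `foreach: { data: categories, as: 'category' }`, which gives the loop item a name that stays visible in nested loops. A server-side implementation could support this by adding the alias to each child context. The sketch below is one possible approach, with a hypothetical `childContext` helper and example model:

```javascript
// Create a child binding context. Aliases from outer loops stay reachable
// through the prototype chain; $data and $parent follow Knockout's naming.
function childContext(parentCtx, item, alias) {
  const ctx = Object.create(parentCtx);
  ctx.$parent = parentCtx.$data;
  ctx.$data = item;
  if (alias) {
    ctx[alias] = item;
  }
  return ctx;
}

// Hypothetical model with two nesting levels.
const categoryModel = {
  categories: [{ name: 'Fruit', items: ['Apple', 'Pear'] }]
};

// Nested loops: the inner loop refers to the outer item by its alias
// instead of going through $parent.
function renderCategories(model) {
  const root = { $data: model };
  const lines = [];
  for (const category of model.categories) {
    const outer = childContext(root, category, 'category');
    for (const item of category.items) {
      const inner = childContext(outer, item, 'item');
      lines.push(inner.category.name + ': ' + inner.item);
    }
  }
  return lines;
}

console.log(renderCategories(categoryModel));
```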

Filters
There is often a need for simple formatting tasks that are not desirable to preformat in the data model. Examples include date and number formatting and string munging. Knockout can support this via functions defined in the model. It is preferable, however, to implement a helper syntax that is declarative and portable between implementations.
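As a small illustration of the function-in-model approach, the sketch below uses a hypothetical `resolve` helper that calls model members that are functions. The model and its `formattedPrice` member are invented for this example:

```javascript
// Hypothetical model: the raw value plus a formatting function next to it.
const priceModel = {
  price: 1234.5,
  formattedPrice() {
    return this.price.toFixed(2);
  }
};

// Look up a key in the model; call it if it is a function so that the
// template can bind to formatted values without preformatting the data.
function resolve(model, key) {
  const value = model[key];
  return typeof value === 'function' ? value.call(model) : value;
}

console.log(resolve(priceModel, 'price'));
console.log(resolve(priceModel, 'formattedPrice'));
```

The drawback is visible in the sketch: the formatting logic is JavaScript code inside the model, so a PHP renderer cannot reuse it. A declarative helper syntax would avoid that.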

Implementations

 * Knockoff implementation (node.js)
 * TAssembly implementation (node.js)
 * TAssembly implementation (PHP)