Template Object Model

Let's flesh out a basic HTML DOM templating interface here.

Requirements

 * Both a PHP and JS implementation
 * Resource loader needs to support loading templates
 * thoughts on speculative loading
 * also elements in HTML5
 * tendency to move towards JS for client side (and later server-side) templating
 * if/else
 * Array handling: foreach, plus a way to detect an empty array
 * Template partials / transclusion
 * Ability to control HTML escaping (preferably with escaped output as the default)
 * Should be obvious to see what is escaped by looking at the template
 * Context sensitive escaping (attributes)
 * Extensible enough to accommodate the MediaWiki i18n system, either through a custom implementation of the library or through a predefined function/filter
 * Ensure well-balanced DOM throughout
 * Client-side library that has a small footprint (~20K or less)
 * Should support visual editing (so no magic syntax in templates, only plain HTML, for simplicity)

Interesting implementations

 * Distal -- Zope TAL in JS
 * Genshi -- cleaner than TAL, very close to what we are looking for, but Python implementation rather than JS
 * AngularJS -- attribute-based HTML templates compiled to code, reactive data binding with separate model, fully featured including sanitization etc
 * KnockoutJS -- single attribute-based (TAL-like) templating, reactive data binding
 * XHP -- FB XML literals in PHP; slow in Wikia testing, PHP only.
 * React -- pseudo DOM based, reactive data binding
 * Mustache -- popular and simple text templating with minimalistic brace syntax. Not DOM-based, so it offers no attribute escaping, nesting guarantees etc.

Spec strawman
In uncompiled templates, we can use a user-friendly and compact syntax similar to Mustache:

Transclusion can be defined with this syntax:

This calls the template 'link' via a possible 'link' controller. The 'link' template:

For visual editing, this can be de-sugared to a pure XHTML+RDFa template by replacing the brace syntax with elements and attributes suitable for visual editing:

Expressions

 * data references in dot and [] notation:  and   or
 * iteration in : ,   or
 * arithmetic:
 * (in)equality predicates:
 * logical negation/and/or, parenthesis:

We also considered supporting function calls, but decided to go with data references only for now. This keeps porting straightforward and maintains a clear separation between formatting options and data. If necessary, function call support can be added at a later point.
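As a rough illustration of how data references in dot and [] notation could be resolved against a model, here is a minimal sketch. The function names `parseReference` and `resolveReference` are hypothetical, and the accepted grammar (identifiers, numeric indexes, quoted keys) is an assumption rather than the agreed expression syntax. Only own properties are followed, so a reference cannot reach prototype members.

```javascript
// Hypothetical helpers, not part of any existing library.

// Split a reference such as "user.emails[0]" or "user['first name']"
// into a list of property keys.
function parseReference(ref) {
  // Alternatives: identifier (at the start or after a dot) | [digits] | [quoted string]
  const token = /(?:^|\.)([A-Za-z_$][\w$]*)|\[(\d+)\]|\[(['"])(.*?)\3\]/y;
  const keys = [];
  let pos = 0;
  while (pos < ref.length) {
    token.lastIndex = pos;
    const m = token.exec(ref);
    if (m === null) {
      throw new SyntaxError(`Unexpected input at position ${pos} in "${ref}"`);
    }
    if (m[1] !== undefined) keys.push(m[1]);
    else if (m[2] !== undefined) keys.push(Number(m[2]));
    else keys.push(m[4]);
    pos = token.lastIndex;
  }
  return keys;
}

// Walk the model along the parsed keys; missing data yields undefined.
function resolveReference(model, ref) {
  let value = model;
  for (const key of parseReference(ref)) {
    if (value === null || value === undefined) return undefined;
    // Follow own properties only, so "user.constructor" resolves to undefined.
    if (!Object.prototype.hasOwnProperty.call(value, key)) return undefined;
    value = value[key];
  }
  return value;
}

const exampleModel = {
  user: { name: "Ada", emails: ["a@example.org", "b@example.org"], "first name": "Ada" },
};
console.log(resolveReference(exampleModel, "user.emails[1]")); // b@example.org
console.log(resolveReference(exampleModel, "user['first name']")); // Ada
```

Because the resolver only reads data and never calls anything, the same few lines should be simple to port to PHP.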

Possible things to add to expressions
Some features could be useful in expressions because they are properties of the view rather than the model. This mainly concerns features that can be used in conditionals and iteration specifications. Localized formatting (numbers, uppercase, plurals etc.) can be handled with properties. We do not implement these straight away, to keep porting simple, but could consider adding them later.
 * length of arrays, maybe  or   -- to print the length, change layout depending on length
 * slicing on arrays -- Python syntax?  -- to split up a long array in the view or only display the top n entries
 * calls of functions in the controller: can be used to implement more complex predicates, sorting etc
 * plain data access maps to the model, functions to the controller?
 * more math, as in the JS Math module contents?
 * could also be left to the controller
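To make the array length and slicing ideas concrete, here is a small sketch of a view-side helper that accepts a Python-style "start:stop" specification. The name `sliceArray` and the exact slice syntax are assumptions for illustration only; no syntax has been decided. It relies on `Array.prototype.slice`, which already treats negative indexes the way Python does.

```javascript
// Hypothetical view helper: apply a Python-style "start:stop" slice to an array.
// Either bound may be omitted; negative bounds count from the end.
function sliceArray(items, spec) {
  const parts = spec.split(":");
  if (parts.length !== 2) {
    throw new SyntaxError(`Expected "start:stop", got "${spec}"`);
  }
  const [start, stop] = parts.map((part) =>
    part.trim() === "" ? undefined : Number(part)
  );
  for (const bound of [start, stop]) {
    if (bound !== undefined && !Number.isInteger(bound)) {
      throw new SyntaxError(`Slice bounds must be integers in "${spec}"`);
    }
  }
  return items.slice(start, stop);
}

const exampleEntries = ["a", "b", "c", "d", "e"];
console.log(exampleEntries.length);             // 5, e.g. to switch layout on length
console.log(sliceArray(exampleEntries, ":3"));  // [ 'a', 'b', 'c' ], the top 3 entries
console.log(sliceArray(exampleEntries, "-2:")); // [ 'd', 'e' ]
```

A helper like this could equally live in the controller, which would keep the expression language itself free of slicing syntax.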

Things probably better implemented in the controller are:
 * default sorting: complex, JS can re-sort client-side
 * should have access to locale in scope

Value formatting
Via attributes on the  element. I18n behavior inherited by DOM scope.


 * Strings
 * maxlength
 * case= upper/lower
 * Numbers
 * fmt format string (precision, padding etc)
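The formatting attributes listed above could be applied roughly as in the following sketch. The function names are hypothetical, and the `fmt` syntax used here ("width.precision" with an optional leading 0 for zero padding) is purely an assumption, since the format string syntax is not specified. Locale-aware behavior inherited from the DOM scope is left out.

```javascript
// Hypothetical formatter for string values: supports maxlength and case.
function formatString(value, { maxlength, case: letterCase } = {}) {
  let out = String(value);
  if (letterCase === "upper") out = out.toUpperCase();
  else if (letterCase === "lower") out = out.toLowerCase();
  if (maxlength !== undefined) out = out.slice(0, Number(maxlength));
  return out;
}

// Hypothetical formatter for numbers. Assumed fmt syntax: "[0][width][.precision]",
// e.g. "08.2" = zero-padded to 8 characters with 2 decimal places.
function formatNumber(value, fmt) {
  const m = /^(0?)(\d*)(?:\.(\d+))?$/.exec(fmt);
  if (m === null) throw new SyntaxError(`Unsupported fmt "${fmt}"`);
  const [, zeroPad, width, precision] = m;
  const n = Number(value);
  const sign = n < 0 ? "-" : "";
  const digits =
    precision === undefined
      ? String(Math.abs(n))
      : Math.abs(n).toFixed(Number(precision));
  const total = Number(width || 0);
  // Zero padding goes between the sign and the digits; space padding goes in front.
  return zeroPad
    ? sign + digits.padStart(total - sign.length, "0")
    : (sign + digits).padStart(total, " ");
}

console.log(formatString("Hello World", { maxlength: 5, case: "upper" })); // HELLO
console.log(formatNumber(3.14159, "08.2")); // 00003.14
```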

Escaping
The conservative default assumption is that any data passed into the template was user-supplied and needs to be escaped according to the DOM context:


 * In text content, everything is escaped to text
 * In attributes, the correct sanitization for the tag and attribute is applied after full expansion on the complete attribute value.
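The two rules above can be sketched as follows. This is a simplified illustration, not a complete sanitizer: the function names, the list of URL-carrying attributes, and the allowed URL schemes are all assumptions. The key point it shows is that attribute sanitization runs once, on the fully expanded attribute value, and depends on the tag and attribute name.

```javascript
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Text content: everything is escaped to text.
function escapeText(value) {
  return String(value).replace(/[&<>]/g, (ch) => HTML_ESCAPES[ch]);
}

// Assumed policy for this sketch; a real implementation needs a fuller list.
const URL_ATTRIBUTES = new Set(["a:href", "img:src"]);
const ALLOWED_URL_SCHEMES = new Set(["http", "https", "mailto"]);

// Attributes: sanitize the complete, fully expanded value for this tag/attribute.
function sanitizeAttribute(tag, name, expandedValue) {
  const attr = name.toLowerCase();
  // Event handler attributes are dropped entirely in this sketch.
  if (attr.startsWith("on")) return "";
  let value = String(expandedValue);
  if (URL_ATTRIBUTES.has(`${tag.toLowerCase()}:${attr}`)) {
    // Browsers ignore control characters and whitespace inside the scheme,
    // so remove them before checking it.
    const probe = value.replace(/[\u0000-\u0020]/g, "");
    const scheme = /^([a-z][a-z0-9+.-]*):/i.exec(probe);
    if (scheme !== null && !ALLOWED_URL_SCHEMES.has(scheme[1].toLowerCase())) {
      value = "";
    }
  }
  return value.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeText("<b>Tom & Jerry</b>")); // &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
console.log(sanitizeAttribute("a", "href", "javascript:alert(1)") === ""); // true
```

Checking the scheme only after expansion matters because a value assembled from several data references may look harmless piece by piece and still form a `javascript:` URL once joined.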

This safe default means that any user input is sanitized correctly for the context it is used in. Occasionally, however, there is a need to pass in data that is either already sanitized or would not pass user input sanitization for a good reason (JS links in the UI, for example). This can be declared centrally in a 'bless' method in the controller, which blesses exceptions for a specific context.
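One way the blessing mechanism might look in code is sketched below. Everything here is hypothetical: the class name, the `bless(dataPath, context)` signature, and the context labels ("text", "attr:title") are assumptions made for illustration. Unblessed values are simply escaped in this sketch, which stands in for the full context-specific sanitization.

```javascript
// Hypothetical controller that records which data may bypass escaping, per context.
class BlessingController {
  constructor() {
    this.blessedContexts = new Set();
  }

  // Declare that the value at dataPath is trusted when used in the given context.
  bless(dataPath, context) {
    this.blessedContexts.add(`${context}|${dataPath}`);
  }

  isBlessed(dataPath, context) {
    return this.blessedContexts.has(`${context}|${dataPath}`);
  }
}

function escapeMarkup(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Render a value for a context: blessed values pass through, all others are escaped.
function renderValue(controller, dataPath, context, value) {
  return controller.isBlessed(dataPath, context) ? String(value) : escapeMarkup(value);
}

const exampleController = new BlessingController();
exampleController.bless("summary", "text"); // trusted as HTML in text content only

const summaryHtml = "<b>bold</b>";
console.log(renderValue(exampleController, "summary", "text", summaryHtml));       // <b>bold</b>
console.log(renderValue(exampleController, "summary", "attr:title", summaryHtml)); // &lt;b&gt;bold&lt;/b&gt;
```

Keeping all blessings in one method makes the exceptions easy to audit, since the templates themselves never carry an "unescaped" marker.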

Escaping depends on the context in which a value is used in the template. In the following example, the HTML passed into the title attribute will be transformed to plain text, but the HTML expanded in the text content will remain HTML, as it was blessed for this context.

In content templating, no blessing is performed by the system.

Higher-level widgets
Basically like extensions.


 * Declarative with freedom of implementation
 * Rich support in VE possible
 * Drop-downs of options
 * Data type driven formatting (date/birthday formatted as localized age)
 * In-place editing of the underlying data within the rendering
 * Table widget alternative to data tables with table start / row / end templates
 * Both table and plot could come in handy in special pages

Similarity to AngularJS
The syntax here is very similar to that used by AngularJS. The AngularJS implementation includes
 * context-sensitive escaping very close to what we describe above, HTML sanitization
 * formatting filters
 * generic directive registry with choice between custom attributes and tags; tags make it possible to transparently add validation to simple html tags like
 * localization incl. pluralization
 * expression parser that strikes a good compromise between flexibility and portability

It also provides features we don't need on the server side:
 * bi-directional data binding with dynamic updates
 * client-side form validation
 * animations
 * routing and scrolling for single-page applications

A problem with AngularJS as a simple client-side templating library is its size. Minified and gzip-compressed, it comes to 29-38k, compared with ~3k for a simple string template library like Mustache. It does provide much more functionality, of course, but that does not help much if the functionality is not needed. Bi-directional data bindings are also not always needed. There are one-shot alternatives for most directives that avoid the overhead of setting up watchers on a data model.

It might be possible to build a smaller version of AngularJS. Another option is to build a static template library that uses the same syntax / attributes and reuses some of the code (expression parser, sanitization).