Requests for comment/HTML content templating

MediaWiki has very rich Wikitext-based templating and scripting facilities. We are interested in exploring how we can provide similar functionality in an HTML-only world.

Why are we interested in storing and processing our content as HTML rather than wikitext? There are several reasons, but the most important are probably performance and ease of use.

Performance of page views can be better when using stored HTML, as no expensive transformations are necessary to produce the desired HTML output. Similarly, performance of templating operations can be better when directly building up an HTML string rather than building up a wikitext string first.

HTML with semantic markup is significantly easier to work with than wikitext or presentational HTML without a well-defined interface. This lets us easily extract information from our content, build new ways to interact with it, or modify the way information is presented on different platforms. We can do so using standard tools rather than building and maintaining our own. We would no longer need to rely on unmaintained hacks like tidy to balance our tags. Our visual editing can finally be truly WYSIWYG, rather than something that's only WYSIWYG as long as templates emit balanced content.

Finally, there is the potential to simplify our infrastructure. Wikitext parsing abilities, and thus Parsoid, become an optional editing feature. Parsoid effectively provides an optional wikitext editor for HTML content, rather than serving, as it does now, as an HTML front-end for wikitext content.

Goals

 * Provide functionality similar to that available in Wikitext templating with parser functions & Scribunto
 * Separate presentation (templates) from data & logic as much as possible
 * Generate well-formed HTML and support efficient and principled sanitization of both templates and data
 * Perform better than the Wikitext pipeline
 * Integrate well with HTML-only content storage
 * Ideally, let top-level transclusions of HTML templates be represented as wikitext transclusions

Presentation layer: Templating
Initial implementations: TAssembly.js, TAssembly.php, Knockoff.js
 * DOM-based compiler; the current front-end uses Knockout.js syntax
 * HTML syntax can support visual template editing
 * Optional DOM-based template sanitization for user-editable templates
 * Intermediate string-based TAssembly representation with very fast execution engines
 * Automatic context-sensitive sanitization of model data used in attributes to prevent XSS, without a need for a separate DOM post-processing step (good for performance)
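
To make the idea of a string-based intermediate representation with context-sensitive sanitization more concrete, here is a minimal sketch. The array format, the opcode names (`text`, `attr`) and the scheme whitelist are assumptions made for illustration; they are not TAssembly's actual format or API, and the URL check is far from a complete sanitizer.

```javascript
// Hypothetical intermediate form: an array whose items are either literal
// HTML strings or [opcode, argument] pairs. Not TAssembly's actual format.
const template = [
  '<a',
  ['attr', { href: 'url', title: 'title' }],
  '>',
  ['text', 'label'],
  '</a>',
];

// Escape a value for use in an HTML text node.
function escapeText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Escape a value for use inside a double-quoted attribute.
function escapeAttr(value) {
  return escapeText(value).replace(/"/g, '&quot;');
}

// Context-sensitive step: URL-bearing attributes additionally get a scheme
// check. Returns null when the attribute should be dropped.
function sanitizeAttr(name, value) {
  const str = String(value);
  if (name === 'href' || name === 'src') {
    // Ignore whitespace and control characters when looking for a scheme.
    const compact = str.replace(/[\u0000-\u0020]/g, '');
    const scheme = /^([a-z][a-z0-9+.-]*):/i.exec(compact);
    const allowed = ['http', 'https', 'mailto'];
    if (scheme && !allowed.includes(scheme[1].toLowerCase())) {
      return null;
    }
  }
  return escapeAttr(str);
}

// Execute the template against a model by plain string concatenation.
function render(tpl, model) {
  let out = '';
  for (const item of tpl) {
    if (typeof item === 'string') {
      out += item;
      continue;
    }
    const [op, arg] = item;
    if (op === 'text') {
      out += escapeText(model[arg]);
    } else if (op === 'attr') {
      for (const [name, key] of Object.entries(arg)) {
        const safe = sanitizeAttr(name, model[key]);
        if (safe !== null) out += ` ${name}="${safe}"`;
      }
    } else {
      throw new Error(`Unknown opcode: ${op}`);
    }
  }
  return out;
}

console.log(render(template, {
  url: 'javascript:alert(1)',
  title: 'a "quote"',
  label: '<b>hi</b>',
}));
// prints: <a title="a &quot;quote&quot;">&lt;b&gt;hi&lt;/b&gt;</a>
```

Because escaping is chosen per opcode while the string is being built, no DOM pass over the finished output is needed in this sketch.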

Data layer
The templating environment needs a way to access information from several sources, listed below. Apart from transclusion parameters, this information can't all be set up ahead of time as in a traditional model. Instead, it needs to be available to the template on demand, in a 'pull' model.
 * Transclusion parameters
 * Wikidata queries
 * Other environmental information, such as that currently provided by parser functions and magic words
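
One possible way to picture the 'pull' model is a model object whose properties are only resolved when a template first reads them. The sketch below uses a JavaScript `Proxy`; `makePullModel`, the resolver names and the returned values are all hypothetical, and the resolvers are synchronous stand-ins for what would realistically be asynchronous lookups.

```javascript
// Build a model whose properties are resolved on first access, then cached.
// `params` holds transclusion parameters known up front; `resolvers` maps
// property names to functions that fetch a value on demand.
function makePullModel(params, resolvers) {
  const cache = new Map();
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  return new Proxy(params, {
    get(target, key) {
      if (has(target, key)) return target[key];      // plain parameter
      if (cache.has(key)) return cache.get(key);     // already pulled
      if (!has(resolvers, key)) return undefined;    // unknown name
      const value = resolvers[key](target);          // pull on demand
      cache.set(key, value);
      return value;
    },
  });
}

let lookups = 0;
const model = makePullModel(
  { name: 'Berlin' },
  {
    // Stand-in for a Wikidata query; the value is made up.
    population: (params) => {
      lookups += 1;
      return params.name === 'Berlin' ? 12345 : null;
    },
    // Stand-in for environmental information such as a magic word.
    pageName: () => 'Example page',
  }
);

console.log(model.name);        // prints: Berlin
console.log(model.population);  // prints: 12345
console.log(model.population);  // prints: 12345 (served from the cache)
console.log(lookups);           // prints: 1
```

A template that never reads `population` never triggers the lookup, which is the point of pulling data rather than assembling a full model up front.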

Utilities for logic, data access & massaging
Complex data access, data massaging and predicate logic should be clearly separated from the templating layer. It would be desirable to be able to execute these helpers on both the server and the client, which would favor using JavaScript for this task. There might, however, also be advantages in leveraging the existing Lua code for some of this functionality.
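
As a rough illustration of this separation, the sketch below keeps predicates and formatting in pure helper functions and hands the template a flat view model. The helper names and the item shape are assumptions made for the example only.

```javascript
// Pure helper functions: no DOM access and no I/O, so the same code could
// be loaded by Node.js on the server or by a browser on the client.
const helpers = {
  // Predicate logic lives here rather than in the template.
  hasCoordinates(item) {
    return Number.isFinite(item.lat) && Number.isFinite(item.lon);
  },
  // Data massaging: turn raw numbers into a display-ready string.
  formatCoordinates(item) {
    const ns = item.lat >= 0 ? 'N' : 'S';
    const ew = item.lon >= 0 ? 'E' : 'W';
    const lat = Math.abs(item.lat).toFixed(2);
    const lon = Math.abs(item.lon).toFixed(2);
    return `${lat} ${ns}, ${lon} ${ew}`;
  },
};

// Prepare a flat view model so the template only needs simple lookups
// and a single boolean for conditional display.
function buildViewModel(item) {
  const showCoordinates = helpers.hasCoordinates(item);
  return {
    title: item.title,
    showCoordinates,
    coordinates: showCoordinates ? helpers.formatCoordinates(item) : '',
  };
}

console.log(buildViewModel({ title: 'Example', lat: 52.5, lon: 13.4 }));
console.log(buildViewModel({ title: 'No location' }));
```

With this split, the template itself contains no branching beyond checking `showCoordinates`, and the helpers can be unit-tested without rendering anything.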

I18n, L10n
Internationalization & localization involve both data access (message loading) and utilities (gender, plural, etc.). Messages can be represented using regular content templates, so that they can be efficiently executed by the same runtime.
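
The following sketch shows one way a message could be expressed in a template-like form with plural and gender utilities. The message format, the `edit-count` key and the opcode names are hypothetical; plural categories come from the standard `Intl.PluralRules` API, the in-memory `messages` object stands in for real message loading, and output escaping is omitted for brevity.

```javascript
// Hypothetical message store: each message is an array of literal strings
// and [opcode, parameterName, forms] items, mirroring a content template.
const messages = {
  en: {
    'edit-count': [
      ['gender', 'gender', { male: 'He', female: 'She', other: 'They' }],
      ' made ',
      ['text', 'count'],
      ' ',
      ['plural', 'count', { one: 'edit', other: 'edits' }],
      '.',
    ],
  },
};

// Pick a form by key, falling back to the "other" form.
function pick(forms, key) {
  return Object.prototype.hasOwnProperty.call(forms, key)
    ? forms[key]
    : forms.other;
}

function renderMessage(lang, key, params) {
  const rules = new Intl.PluralRules(lang);
  return messages[lang][key]
    .map((item) => {
      if (typeof item === 'string') return item;
      const [op, name, forms] = item;
      switch (op) {
        case 'text':
          return String(params[name]);
        case 'plural':
          // select() returns a CLDR category such as "one" or "other".
          return pick(forms, rules.select(params[name]));
        case 'gender':
          return pick(forms, params[name]);
        default:
          throw new Error(`Unknown opcode: ${op}`);
      }
    })
    .join('');
}

console.log(renderMessage('en', 'edit-count', { gender: 'female', count: 1 }));
// prints: She made 1 edit.
console.log(renderMessage('en', 'edit-count', { gender: 'unknown', count: 3 }));
// prints: They made 3 edits.
```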