Requests for comment/HTML templating library

Currently several MediaWiki extensions make use of JavaScript or PHP HTML templating libraries. It would be ideal to standardize on one library and add it into MediaWiki core. This is similar to the situation that existed regarding CSS languages before LESS support was added to core and made the standard.

Justification

 * Better code readability, editability, and portability
 * Separate code from markup
 * Avoid jQuery spaghetti
 * Avoid loading multiple JS libraries from different extensions
 * Make it easier to build complex and good looking admin interfaces

Existing implementations in MediaWiki extensions

 * PageTriage uses Underscore.js
 * MobileFrontend uses Hogan.js (Twitter's implementation of Mustache)
 * DonationInterface uses a custom template system (RapidHTML)
 * ArticleCreationHelp uses a custom template system
 * Wikibase uses a custom template system that allows templates to be used with both PHP and JavaScript, which is a requirement

Templating library options and considerations
It would be best if we could choose a library that has both JS and PHP implementations. That will make it easier to share/port code between client-side and server-side, and eliminate the need to learn two different syntaxes. It should also be a library that is lightweight enough to use on mobile, but flexible enough to meet the needs of diverse applications.

Requirements

 * Both a PHP and JS implementation
 * if/else
 * array handling
 * foreach
 * detect an empty array
 * Secure by default
 * Attribute sanitization (no XSS in href, src, style etc)
 * HTML escaping and DOM balancing
 * Escape hatch for content that is known to be pre-escaped. Its use should be easy to audit, ideally by checking a single place rather than reading every template.
 * Extensible enough to accommodate MediaWiki i18n system
 * i18n support should come either from a custom implementation of the library or from a predefined function/filter
 * No huge client-side footprint
 * Dumb (the less logic the better). Some template languages, for example Jinja2, have a concept of filters where computation happens in the template itself. In my opinion this is unnecessary and can lead to complicated, unreadable templates. Data should be preprocessed in PHP before being passed to a template.
 * The opposite position is that simple logical and arithmetic expressions in conditions, together with formatting filters, can save a lot of very template-specific code in the controller. With some very basic added functionality, many simple templating tasks can be performed with just a template and the model. Filters can be a way to implement i18n, currency formatting or pluralization, which are arguably templating tasks. The power of expressions and filters should, however, be limited to the bare minimum to preserve portability and avoid unnecessary complexity.
 * Readable - templates should be easy to grasp for someone with basic HTML knowledge. Ideally they are just plain HTML with attributes, so that they can be displayed and edited with HTML tools (think VE). This can enable server-side pre-expansion followed by client-side updates.
 * Commenting
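
To make several of the requirements above concrete, here is a minimal sketch of a logic-less renderer. It is not the API of Mustache, Twig, or any other real library: the names `render`, `lookup`, and `escapeHtml` are hypothetical, and the syntax only borrows the Mustache conventions (`{{name}}` escaped, `{{{name}}}` raw, `{{#name}}` section, `{{^name}}` inverted section). It shows escaping by default, a raw escape hatch that can be audited by searching for one token, iteration over an array, and detection of an empty array.

```javascript
// Hypothetical mini-renderer for illustration only; not a real library's API.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };

// Escape the five HTML-significant characters, including both quote styles.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Only own properties are visible to templates; missing names render as ''.
function lookup(data, name) {
  return Object.prototype.hasOwnProperty.call(data, name) ? data[name] : '';
}

function render(template, data) {
  // One pass over the template, so substituted values are never re-scanned.
  const tag = /\{\{([#^])(\w+)\}\}([\s\S]*?)\{\{\/\2\}\}|\{\{\{(\w+)\}\}\}|\{\{(\w+)\}\}/g;
  return template.replace(tag, (match, kind, section, body, rawName, name) => {
    if (rawName !== undefined) {
      return String(lookup(data, rawName)); // escape hatch: {{{name}}}
    }
    if (name !== undefined) {
      return escapeHtml(lookup(data, name)); // default: {{name}} is escaped
    }
    const value = lookup(data, section);
    const isEmpty = !value || (Array.isArray(value) && value.length === 0);
    if (kind === '^') {
      return isEmpty ? render(body, data) : ''; // "if empty" branch
    }
    if (isEmpty) {
      return '';
    }
    if (Array.isArray(value)) {
      // foreach: each item is assumed to be an object merged into the context
      return value.map((item) => render(body, Object.assign({}, data, item))).join('');
    }
    return render(body, data); // plain "if" branch
  });
}

const listTemplate =
  '<ul>{{#items}}<li title="{{title}}">{{label}}</li>{{/items}}</ul>' +
  '{{^items}}<p>No items</p>{{/items}}';

const filled = render(listTemplate, { items: [{ title: 'a"b', label: '<b>x</b>' }] });
const empty = render(listTemplate, { items: [] });
console.log(filled);
console.log(empty);
```

Because everything is escaped unless the template uses `{{{…}}}`, a reviewer could in principle find every unescaped output by searching the templates for that one token. This sketch does no attribute sanitization or DOM balancing, which the requirements above also ask for.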

Performance
MaxSem profiled both Twig and Mustache on the server side (by porting an existing special page to them) and compared the performance with regular MediaWiki HTML generation. According to Max, the performance characteristics were very similar for all three. The Twig figures are listed below; Mustache's results showed approximately the same unnoticeable difference from the original code.
 * Original code: 5710 ms at the 50th percentile, 5741 ms at the 90th percentile
 * Twig, uncached: 5693 ms, 5738 ms
 * Twig, cached: 5700 ms, 5741 ms
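
For readers who want to collect comparable 50th and 90th percentile figures, the following is a rough sketch of one way to do it. The source does not say how the percentiles above were computed; the nearest-rank method, the helper names `timeRuns` and `percentile`, and the trivial workload are all assumptions made for illustration.

```javascript
// Hypothetical helpers; the nearest-rank method is an assumption, not
// necessarily how the figures quoted in the text were produced.

// Run fn `runs` times and return the duration of each run in milliseconds.
function timeRuns(fn, runs) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return samples;
}

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) {
    throw new RangeError('percentile() needs at least one sample');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Placeholder workload: build 1,000 small div strings by concatenation.
function workload() {
  let html = '';
  for (let i = 0; i < 1000; i++) {
    html += '<div id="item' + i + '">body</div>';
  }
  return html;
}

const samples = timeRuns(workload, 50);
console.log('50th percentile (ms):', percentile(samples, 50));
console.log('90th percentile (ms):', percentile(samples, 90));
```

Reporting two percentiles rather than a mean makes a single slow outlier run less likely to distort the comparison.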

CSteipp wrote a set of scripts to compare different scenarios, and timed the results on fenari (using the runall.sh script). The test was run using MediaWiki's Html::element, Twig's string loader, Twig's file loader (with and without caching enabled), and Mustache's PHP engine. Improvements to the tests are welcome if there is a better way to code comparable tests across the platforms.
 * Tests:
 * Test1 - Output a div with the same id attribute and body 100,000 times
 * MediaWiki: Html::element( 'div', array( 'id' => $vars['id'] ), $vars['body'] );
 * Twig: $twig->render('  ', $vars );
 * Mustache: $html = $engine->render('  ', $vars );
 * Test1b - Output a div with a different id attribute, but the same body, 100,000 times
 * Test2 - Output a div for each element of a 1,000-element associative array, 1,000 times, updating an element of the array on each iteration.
 * MediaWiki: foreach ( $vars['items'] as $key => $item ) { $body .= Html::element( 'div', array( 'id' => $key ), $item ); }
 * Twig: $html = $twig->render(' {% for key, item in items %}  {% endfor %} ', $vars );
 * Test3 - Output a div for each element of a 1,000-element array, where each element has a different value, and update an element on each iteration.
 * MediaWiki: foreach ( $vars['items'] as $item ) { $body .= Html::element( 'p', array(), $item ); }
 * Twig: $html = $twig->render(' {% for item in items %} {% endfor %}  ', $vars );
 * Mustache: $html = $engine->render('    ', $vars );
 * Results:

Gabriel Wicke also ran the PHP benchmarks on HHVM and additionally benchmarked Mustache on Node.js, all on ruthenium:

Security
Twig, like many other template engines, uses the file system to cache compiled templates. This creates a possible attack vector that Chris and ops would not like (though not necessarily prohibit entirely): the cache consists of PHP files that are executed on page views, and potentially by maintenance scripts too, so having Apache write something executable is a bit icky. (By default, Twig does not cache.)
 * It looks like there are hacks to put it into memcached too, or scripts to pre-compile them all like TemplateCacheCacheWarmer.
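
As a language-neutral illustration of the idea of compiling once without writing executable files, the sketch below parses a template into a closure and keeps it in an in-process map. This is not how Twig's cache works; `compile`, `getTemplate`, and `compiledCache` are hypothetical names, and only plain `{{name}}` placeholders are supported. A per-process map would not survive across PHP requests, which is presumably why a shared store such as memcached comes up above.

```javascript
// Hypothetical sketch: cache compiled templates as in-memory functions
// rather than as executable files on disk.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

let compileCount = 0; // counts how often the parsing step actually runs

// Parse once: split() with a capture group yields literal text at even
// indexes and placeholder names at odd indexes.
function compile(source) {
  compileCount += 1;
  const parts = source.split(/\{\{(\w+)\}\}/);
  return (data) =>
    parts
      .map((part, i) => {
        if (i % 2 === 0) {
          return part; // literal markup
        }
        return escapeHtml(data[part] == null ? '' : data[part]);
      })
      .join('');
}

const compiledCache = new Map();

// Return the compiled function for a template, compiling at most once.
function getTemplate(source) {
  if (!compiledCache.has(source)) {
    compiledCache.set(source, compile(source));
  }
  return compiledCache.get(source);
}

const divSource = '<div id="{{id}}">{{body}}</div>';
const first = getTemplate(divSource)({ id: 'a', body: '1 < 2' });
const second = getTemplate(divSource)({ id: 'b', body: 'ok' });
console.log(first, second, 'compiled', compileCount, 'time(s)');
```

The cached value here is a data structure interpreted by trusted code, so nothing that the web server writes is ever executed directly.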

My (CSteipp) major security concern is how well the template engine encourages developers to code securely (or makes coding securely easy), and how difficult it is to review an application that uses the engine for security.
 * Twig uses something close to htmlentities on all strings by default (notably, "'" => "&#039;")
 * Mustache uses htmlspecialchars, so ' is not escaped (so by policy, attributes can never be quoted with single quotes, which is what we have currently in MediaWiki's templating).
 * Neither Twig nor Mustache has context-aware escaping (to avoid XSS attacks via attributes etc.), which MediaWiki's current templating does provide. They also don't balance the DOM, so broken user-supplied HTML or templates can have very non-local effects.

Implementation
On the client side, the library would be packaged into a ResourceLoader module targeting both desktop and mobile (but not loaded by default). On the server side, we would simply include the class files in /includes/libs/ (after a security review).

Example of client-side use
Add templates to a ResourceLoader module:


 * Shouldn't templates be precompiled? Performance will be infinities better. NRuiz (WMF) (talk) 13:47, 21 January 2014 (UTC)

Use them in JavaScript:

Example template file using Mustache syntax:

Links for those less familiar with the topic

 * why twig, not citing mustache, mentioning smarty, and smarty3: http://fabien.potencier.org/article/34/templating-engines-in-php
 * saying zend_view should be in comparisons, mentioning smarty, the php "standard": http://stackoverflow.com/questions/731743/php-vs-template-engine
 * smarty3: http://www.smarty.net/v3_overview
 * mentioning mustache, twig, swig, jinja: http://stackoverflow.com/questions/11462211/jinja-like-js-templating-language
 * mentioning mustache, smarty, twig (then being used) when creating timber, wordpress themes: http://upstatement.com/blog/2013/10/comparing-php-template-languages-for-wordpresss/
 * mentioning twig being used for drupal now: https://drupal.org/node/2008464
 * mustache, mentioning implementation in many languages: http://mustache.github.io/
 * favouring mustache: http://blog.chrisworfolk.com/2012/07/04/logic-less-templates-with-mustache/
 * Swig (Twig in JS, smaller footprint than twig.js): http://paularmstrong.github.io/swig/
 * twig.js: https://github.com/justjohn/twig.js, http://showmethecode.es/php/twig/twig-js-plantillas-twig-en-el-lado-del-cliente/
 * the original mustache, in ruby, unmaintained: https://github.com/defunkt/mustache
 * more detailed description of Nirvana: Requests for comment/MVC framework
 * Research notes on DOM templating that can be used for content too