Template Object Model

From mediawiki.org

Let's flesh out a basic HTML DOM templating interface here.

Requirements

  • Both a PHP and JS implementation
    • Resource loader needs to support loading templates
      • thoughts on speculative loading
      • also <template> elements in HTML5
    • tendency to move towards JS for client side (and later server-side) templating
  • if/else
  • array handling
    • foreach
    • detect an empty array
  • Template partials / transclusion
  • Ability to control HTML escaping (preferably with escaped output as the default)
    • Should be obvious to see what is escaped by looking at the template
    • Context sensitive escaping (attributes)
  • Extensible enough to accommodate MediaWiki i18n system
    • Should either be a custom implementation of the library or a predefined function/filter
  • Ensure well-balanced DOM throughout
  • Client-side library that has a small footprint (~20K or less)
  • should support visual editing (so no magic syntax in templates, only HTML, simplicity)

Interesting implementations

  • Distal -- Zope TAL in JS
  • Genshi -- cleaner than TAL, very close to what we are looking for, but Python implementation rather than JS
  • AngularJS -- attribute-based HTML templates compiled to code, reactive data binding with separate model, fully featured including sanitization etc
  • KnockoutJS -- single attribute-based (TAL-like) templating, reactive data binding
  • XHP -- FB XML literals in PHP; slow in Wikia testing, PHP only.
  • React -- pseudo DOM based, reactive data binding
  • Mustache -- Popular and simple text templating with minimalistic brace syntax. Not DOM, so no attribute escaping, nesting guarantees etc.

Spec strawman

In uncompiled templates, we can use user-friendly and compact syntax similar to mustache [1]:

<ol t:if="links">
  <li t:for="link in links">
    <a href="{{link.url}}">
      {{link.title|maxlength:"20"|case:"upper"}}
    </a> posted by {{link.username}} at 
    {{link.time|fmt:"%x %X"}}
  </li>
</ol>
<span t:if="!links">No links were posted.</span>
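
To make the brace syntax above concrete, here is a minimal sketch of how the inside of a {{ ... }} placeholder could be split into a data reference and a list of filters. The function name parsePlaceholder and the returned object shape are assumptions made for illustration; they are not part of this strawman or of any existing library. The sketch also assumes that filter arguments are double-quoted strings and that no | appears inside an argument.

```javascript
// Hypothetical helper (not part of the spec): split a placeholder such as
// {{link.title|maxlength:"20"|case:"upper"}} into its parts.
function parsePlaceholder(source) {
  // Strip the surrounding braces and any whitespace next to them.
  const inner = source.trim().replace(/^\{\{\s*|\s*\}\}$/g, "");
  // The first segment is the data reference, the rest are filters.
  const [expr, ...filterParts] = inner.split("|").map((s) => s.trim());
  const filters = filterParts.map((part) => {
    const colon = part.indexOf(":");
    if (colon === -1) {
      return { name: part, arg: null };
    }
    // Assumption: arguments are double-quoted strings, so JSON.parse can read them.
    return {
      name: part.slice(0, colon).trim(),
      arg: JSON.parse(part.slice(colon + 1).trim()),
    };
  });
  return { expr, filters };
}

console.log(parsePlaceholder('{{link.title|maxlength:"20"|case:"upper"}}'));
```

A real parser would need to tokenize properly, so that a | or : inside a quoted argument is not mistaken for a separator.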

Transclusion can be defined with this syntax:

<ol t:if="links">
  <li t:for="link in links">
    {{> link | {link:link, color:"green"} }}
  </li>
</ol>
<span t:if="!links">No links were posted.</span>

This calls the template 'link' via a possible 'link' controller. The 'link' template:

<span style="color: {{color}}">
  <a href="{{link.url}}">
    {{link.title|maxlength:"20"|case:"upper"}}
  </a> posted by {{link.username}} at 
  {{link.time|fmt:"%x %X"}}
</span>

For visual editing, this can be de-sugared to a pure XHTML+RDFa template by replacing the brace syntax with elements and attributes suitable for visual editing:

<ol typeof="t:if" data-mw='{"if":"links"}'>
  <li typeof="t:for" data-mw='{"src":"link in links"}'>
    <a typeof="t:attribs" href="http://example.org" data-mw='{"attribs":[["href","link.url"]]}'>
      <span typeof="t:txt" data-mw='{"src":"link.title","maxlength":20,"case":"upper"}'>Example link</span>
    </a>
    posted by <span typeof="t:txt" data-mw='{"src":"link.username"}'>Paul</span> at 
    <span typeof="t:txt" data-mw='{"src":"link.time","fmt":"%x %X"}'>03/12/14 22:20:12</span>
  </li>
</ol>
<span typeof="t:if" data-mw='{"if":"!links"}'>No links were posted.</span>
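
The mapping between the two forms can be sketched for the simplest case, a t:txt element. The helper below turns the JSON held in a data-mw attribute back into the compact brace syntax. The name sugarTxt is hypothetical, and the sketch assumes that the data-mw value is valid JSON, that "src" holds the data reference, and that every other key is a formatting option.

```javascript
// Hypothetical helper (not part of the spec): re-sugar the data-mw payload
// of a typeof="t:txt" element into the compact brace syntax.
function sugarTxt(dataMwJson) {
  // "src" is the data reference; all remaining keys are treated as filters.
  const { src, ...options } = JSON.parse(dataMwJson);
  const filters = Object.entries(options).map(
    ([name, value]) => `|${name}:"${value}"`
  );
  return `{{${src}${filters.join("")}}}`;
}

console.log(sugarTxt('{"src":"link.title","maxlength":20,"case":"upper"}'));
// prints {{link.title|maxlength:"20"|case:"upper"}}
```

The reverse direction (de-sugaring) additionally has to pick placeholder content such as "Example link" for the visual editor, which is why it is not shown here.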

Expressions

  • data references in dot and [] notation: item.url and array[0] or array["."]
  • iteration in <t:for>: item in array; k, v of object; or key of object[2]
  • arithmetic: +-*/%
  • (in)equality predicates: < > == >= <=[3]
  • logical negation/and/or, parentheses: ! and or ( )[4]

We considered also supporting function calls, but decided to go with data references only for now to keep porting straightforward, and maintain a clear separation between formatting options and data. If necessary, such support can be added at a later point.
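
Restricting expressions to data references keeps the evaluator small. As an illustration, the following sketch resolves a reference in dot and [] notation against a model object. The function name resolveReference is an assumption, and the sketch covers only data references, not arithmetic, predicates, or logical operators. It only follows own properties, so a template cannot reach inherited members such as constructor.

```javascript
// Hypothetical helper (not part of the spec): resolve a data reference such
// as item.url, array[0] or array["."] against a model object.
function resolveReference(model, ref) {
  // One path step: .name or name, [123], ["key"] or ['key'].
  const step = /\.?([A-Za-z_$][\w$]*)|\[(\d+)\]|\[(?:"([^"]*)"|'([^']*)')\]/y;
  let value = model;
  let pos = 0;
  while (pos < ref.length) {
    step.lastIndex = pos;
    const match = step.exec(ref);
    if (!match) {
      throw new SyntaxError(`Unexpected input at position ${pos} in "${ref}"`);
    }
    const key = match[1] ?? match[2] ?? match[3] ?? match[4];
    // Only follow own properties; anything else resolves to undefined.
    const hasKey =
      value != null && Object.prototype.hasOwnProperty.call(value, key);
    value = hasKey ? value[key] : undefined;
    pos = step.lastIndex;
  }
  return value;
}

const model = { item: { url: "http://example.org" }, array: [10, 20] };
console.log(resolveReference(model, "item.url")); // prints http://example.org
console.log(resolveReference(model, "array[1]")); // prints 20
```

Because the grammar is this small, an equivalent resolver in PHP should be straightforward to keep in sync with the JS one.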

Possible things to add to expressions

Some things could be useful in expressions because they are a property of the view rather than the model. This mainly concerns things that can be used in conditionals and iteration specifications. Localized formatting (numbers, uppercase, plural etc.) can be handled with properties. We won't implement these straight away, to keep porting simple, but could consider adding them later.

  • length of arrays, maybe array.length or len(array) -- to print the length, change layout depending on length
  • slicing on arrays -- Python syntax? array[0:5] -- to split up a long array in the view or only display the top n entries
  • calls of functions in the controller: can be used to implement more complex predicates, sorting etc
    • plain data access maps to the model, functions to the controller?
  • more math, as in the JS Math module contents?
    • could also be left to the controller

Things probably better implemented in the controller:

  • default sorting: complex, JS can re-sort client-side
    • should have access to locale in scope

Value formatting

Values are formatted via attributes on the <t:txt> element. I18n behavior is inherited through the DOM scope.

  • Strings
    • maxlength
    • case= upper/lower
  • Numbers
    • fmt format string (precision, padding etc)
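
As an illustration of the string options listed above, here is a minimal sketch of a formatter registry. The names formatters and formatValue are hypothetical. Only maxlength and case are implemented; fmt is left out because the format-string semantics are not specified here. Options are applied in the order in which they are given.

```javascript
// Hypothetical formatter registry (not part of the spec).
const formatters = new Map([
  // Truncate to at most n characters.
  ["maxlength", (text, n) => text.slice(0, Number(n))],
  // Convert case; unknown modes leave the text unchanged.
  ["case", (text, mode) => {
    if (mode === "upper") return text.toUpperCase();
    if (mode === "lower") return text.toLowerCase();
    return text;
  }],
]);

// Apply the given formatting options, in order, to a value.
function formatValue(value, options) {
  let text = String(value);
  for (const [name, arg] of Object.entries(options)) {
    const format = formatters.get(name);
    if (!format) {
      throw new Error(`Unknown formatter: ${name}`);
    }
    text = format(text, arg);
  }
  return text;
}

console.log(formatValue("hello world", { maxlength: "5", case: "upper" }));
// prints HELLO
```

Note that toUpperCase and toLowerCase are not locale-aware; an implementation that inherits i18n behavior from the DOM scope would more likely use toLocaleUpperCase and toLocaleLowerCase with the locale in scope.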

Escaping

The conservative default assumption is that any data passed into the template was user-supplied and needs to be escaped according to the DOM context:

  • In text content, everything is escaped to text
  • In attributes, the correct sanitization for the tag and attribute is applied after full expansion on the complete attribute value.

This safe default means that any user input is sanitized correctly for the context it is used in. Occasionally, however, there is a need to pass in data that is either already sanitized or would not pass user input sanitization for a good reason (JS links in the UI, for example). Such exceptions can be declared centrally with a 'bless' method in the controller, which blesses a value for a specific context.

// The post comes from our API, which sanitizes on the way out
t.bless(model.post.html, 'html');

// The post link includes some javascript that we want, so construct it here:
model.post.href = "javascript:doSomething();";
// Normally this would be escaped when used in an href attribute.
// We know that this UI link is fine, however, so bless it for use in href attributes of <a> elements
t.bless(model.post.href, 'a.href');

Escaping depends on the context in which a value is used in the template. In the following example, the HTML passed into the title attribute is transformed to plain text, but the HTML expanded in the text content remains HTML, as it was blessed for this context.

<a title="{{post.html}}">Hello</a>
{{post.html}}

In content templating, no blessing is performed by the system.
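
The interaction between blessing and context can be sketched as follows. This is an illustration under several assumptions, not the proposed implementation: JS strings are primitives and cannot carry a marker, so the sketch wraps a blessed value in a hypothetical Blessed object rather than tagging the string passed to bless. The context names 'html', 'a.href' and 'a.title' follow the examples above, and the href check is a deliberately strict stand-in for real per-attribute sanitization.

```javascript
// Hypothetical wrapper marking a value as trusted for exactly one context.
class Blessed {
  constructor(value, context) {
    this.value = value;
    this.context = context;
  }
}

function bless(value, context) {
  return new Blessed(String(value), context);
}

function escapeText(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function escapeAttribute(s) {
  return escapeText(s).replace(/"/g, "&quot;");
}

// Render a value for a context: 'html' for text content, otherwise
// 'tag.attribute'. A blessing only applies in the context it was made for.
function renderValue(value, context) {
  if (value instanceof Blessed && value.context === context) {
    return value.value;
  }
  const raw = value instanceof Blessed ? value.value : String(value);
  if (context === "html") {
    return escapeText(raw);
  }
  // Assumption: unblessed hrefs must be http(s), root-relative or fragments.
  if (context.endsWith(".href") && !/^(https?:|\/|#)/i.test(raw.trim())) {
    return "";
  }
  return escapeAttribute(raw);
}

const post = bless("<b>Hi</b>", "html");
console.log(renderValue(post, "html"));    // prints <b>Hi</b>
console.log(renderValue(post, "a.title")); // prints &lt;b&gt;Hi&lt;/b&gt;
```

With this design a value blessed for 'html' is still escaped when it ends up in an attribute, which matches the title example above.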

Higher-level widgets

These are basically like extensions.

<mw:plot type="pie" datasource="wikidata://foo"/>
<mw:table datasource="wikidata://foo" columns="1,4,5" sort="..."/>

  • Declarative with freedom of implementation
  • Rich support in VE possible
    • Drop-downs of options
    • Data type driven formatting (date/birthday formatted as localized age)
    • In-place editing of the underlying data within the rendering
  • Table widget alternative to data tables with table start / row / end templates
  • Both table and plot could come in handy in special pages

Similarity to AngularJS

The syntax here is very similar to that used by AngularJS. The AngularJS implementation includes

  • context-sensitive escaping very close to what we describe above, HTML sanitization
  • formatting filters ({{ number | currency }})
  • generic directive registry with choice between custom attributes and tags; tags make it possible to transparently add validation to simple html tags like <input>
  • localization incl. pluralization
  • expression parser that strikes a good compromise between flexibility and portability

It also provides features we don't need on the server side:

  • bi-directional data binding with dynamic updates
  • client-side form validation
  • animations
  • routing and scrolling for single-page applications

A problem with AngularJS as a simple client-side templating library is its size. Minified and gzip-compressed, it comes to 29-38k, compared with ~3k for a simple string template library like Mustache. It does provide much more functionality, but that is of little use where the functionality is not needed. Bi-directional data bindings are also not always needed; there are one-shot alternatives for most directives that avoid the overhead of setting up watchers on a data model.

It might be possible to build a smaller version of AngularJS. Another option is to build a static template library that uses the same syntax / attributes and some of the code (expression parser, sanitization stuff).

Notes

  1. We don't have to worry about conflicts with literal { in content here, as templates are not normal page content. They are typically called from page content, and any literal { can be written as entities.
  2. Good ideas on how to integrate enumeration here? Distal for example defines a magic # var, but does not provide access to nested enumeration counters. Maybe something like index, item in array and index, key, value of object for objects?
  3. Alternative syntax to avoid XML issues in attributes (< needs to be escaped to &lt; in XML but not HTML5): lt gt == geq leq. Just parsing templates with an HTML5 parser (and possibly serializing that to XML) also avoids this issue.
  4. && is problematic in both HTML and XML

Other remarks

  • Should we stay with XML namespaces? They require a namespace definition when parsing as XML
    • AngularJS uses - instead: t-if, t-txt etc