Parsoid/Token stream transformations

Overview and current status
The plan is to implement most wiki-specific parser functionality as token stream transformations, which are dispatched by token type via a registration mechanism. A token transform can perform the following actions on each token:

 * token deletion: aborts further processing for this token
 * token expansion: registered handlers for each of the returned tokens are called
 * token modification: If the token type is unchanged, pass the token to the next transformation for this token type. If the type was changed, call handlers for the new type.
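The three outcomes above might be sketched as a handler's return value. This is a hypothetical illustration; the names and return shapes are not Parsoid's actual API:

```javascript
// Hypothetical sketch of a token transform handler's three outcomes.
// A handler receives one token and signals the result via its return
// value (names are illustrative only).
function italicQuoteHandler(token) {
    if (token.type === 'COMMENT') {
        // Deletion: return an empty token list; no further handlers run.
        return { tokens: [] };
    }
    if (token.type === 'QUOTE') {
        // Expansion: return several tokens; handlers registered for each
        // of the new token types are then called on them.
        return { tokens: [
            { type: 'TAG', name: 'i' },
            { type: 'TEXT', value: token.value.slice(2) },
        ] };
    }
    // Modification: the type is unchanged, so the token simply continues
    // to the next transformation registered for this type.
    token.normalized = true;
    return { token: token };
}
```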

The order of handlers can currently be specified using a simple prepend/append API. Syntax-specific transformations on a token can register for early processing, so that later transformations operate on a normalized version of the token. MediaWiki's special quote handling for italic/bold, for example, is implemented in a core extension that registers handlers for quote tags, newlines and the special 'eof' token. Lists and a simple version of the Cite extension are implemented similarly. A general emulation of parser hook behavior on top of the token stream should be quite straightforward: handlers can either collect the tokens between tags, or extract the plain source text using source positions noted in the tokens.
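A minimal sketch of such a per-token-type registry with prepend/append ordering (illustrative names; the real dispatcher differs in detail):

```javascript
// Sketch of a handler registry keyed by token type, with the simple
// prepend/append ordering control described above. Illustrative only.
class TokenTransformRegistry {
    constructor() {
        this.handlers = {}; // token type -> ordered list of handlers
    }
    append(type, handler) {
        (this.handlers[type] = this.handlers[type] || []).push(handler);
    }
    prepend(type, handler) {
        (this.handlers[type] = this.handlers[type] || []).unshift(handler);
    }
    // Run all handlers registered for a token's type, in order.
    // Each handler maps a token to a list of tokens.
    transform(token) {
        let tokens = [token];
        for (const handler of this.handlers[token.type] || []) {
            tokens = tokens.flatMap(handler);
        }
        return tokens;
    }
}

// A syntax-specific extension (e.g. quote handling) registers early via
// prepend, so later handlers see a normalized token:
const registry = new TokenTransformRegistry();
registry.append('newline', (t) => [t]);                     // pass-through
registry.prepend('newline', (t) => [{ ...t, seen: true }]); // runs first
```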

The token transform dispatcher class is prepared for asynchronous processing of tokens, which is already used in a synchronous fashion for the back-reference behavior of the italic/bold extension. This ability to overlap operations on multiple tokens will be important for template expansions. Doing template expansions on the token level makes it possible to render unbalanced templates like the table start / row / end combinations for viewing, while encapsulating those if the output is destined for the visual editor. Template expansion is currently WIP.

Goals / desirable features for the token stream transform framework

 * Parallelism: Allow the tokenizer, token transformations, and tree builder to execute in parallel on separate cores. Single pass over a token stream with minimal buffering.
 * Concurrent IO: Support overlapping and batching of IO from transformations like template fetching.
 * Generality and modularity: Make it easy to plug in transformations for new features, input sources etc.
 * Backwards compatibility: Provide support for extension APIs through wrappers.

Design ideas for the next steps
Random thoughts, very much WIP. Feel free to edit or comment as you like!

Ordering of transformations in phases
The current implementation controls handler order with simple prepend/append operations. A move to numeric priorities could enable a phased structure, in which each token passes through multiple phases, with only the transformations belonging to the current or a later phase applied to it. This is especially interesting when a token is converted to a different token type (or to multiple tokens of different types), as processing then needs to restart on the new tokens. Marking the completed phase on the token helps ensure that transformations are applied only once per token.

Candidate phases:
 * 1) Input-format-specific conversions. Examples: MediaWiki list and quote handling
 * 2) General conversions. Examples: template expansion, link/image expansion, TOC extraction and section linking, etc.
 * 3) Output sanitization. Enforce tag/attribute whitelists
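A phased dispatcher with once-per-token phase marking might look roughly like this. The phase numbers follow the candidate list above; all names are illustrative, not Parsoid's actual API:

```javascript
// Sketch of phased dispatch: handlers are grouped by numeric phase, and
// each token records the highest phase it has completed, so tokens
// produced mid-pipeline are not run through earlier phases again.
const PHASES = [1, 2, 3]; // 1: input-specific, 2: general, 3: sanitization

function process(token, handlersByPhase) {
    let tokens = [token];
    for (const phase of PHASES) {
        const out = [];
        for (const t of tokens) {
            // Skip phases this token has already completed.
            if ((t.phase || 0) >= phase) { out.push(t); continue; }
            const handler = (handlersByPhase[phase] || {})[t.type];
            const results = handler ? handler(t) : [t];
            for (const r of results) {
                r.phase = phase; // mark: this phase is done for r
                out.push(r);
            }
        }
        tokens = out;
    }
    return tokens;
}
```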

Template frames: arg dict, async barrier

 * arg expansion for templates: arg key passed in callback
 * arg-in-arg: substitute from parent frame
 * template-in-arg: new accumulators for plain tokens; a new accumulator + frame for each template, with parent set to the accumulator; parent->addOutstanding() increments the outstanding counter

Generic callback: parent.cb(tokens, reference, notYetDone=false)
Returns null if done, otherwise a new parent (same ref)
 * templateframe.cb(tokens, this.parentref) // key for title: null
 * accumulator.cb(tokens, this.parentref)
 * generic down args: key, parent
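The accumulator bookkeeping sketched in these notes might look roughly like this (illustrative names; the real Parsoid accumulators differ in detail):

```javascript
// Sketch of an accumulator acting as an async barrier: it buffers token
// chunks from its children and forwards the whole buffer to its parent
// once the last outstanding child has completed. Illustrative only.
class Accumulator {
    constructor(parent) {
        this.parent = parent;  // parent accumulator or root frame
        this.chunks = [];
        this.outstanding = 1;  // the accumulator's own "slot"
    }
    // A nested template was discovered: one more result to wait for.
    addOutstanding() {
        this.outstanding++;
    }
    // Generic callback: children push token chunks here; when the last
    // outstanding child finishes, the buffer moves to the parent.
    // Returns null if done, otherwise the still-open accumulator.
    cb(tokens, done) {
        this.chunks.push(...tokens);
        if (done && --this.outstanding === 0) {
            this.parent.cb(this.chunks, true);
            return null;  // done: nothing left to write to
        }
        return this;      // still open: keep writing here
    }
}
```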

Accumulators with generic callback support
Accumulators should return chunks to the parent in order, but as soon as possible.

 frame \  accumulator1          accumulator2              accumulator3
          parent: frame         parent: accumulator1      parent: accumulator2
          frame: frame          frame: frame              frame: frame
          outstanding: 2        outstanding: 2            outstanding: 1

Accumulator todo:
 * change order to expansion node first, followed by plain tokens
 * now need way to access predecessor node
 * cannot edit predecessor if already written out
 * but *can* read written-out predecessor, should be enough (?)

Sample async completion hierarchy:
  root frame (empty args)
    template frame
      title accumulator1 (parent = templateframe, parentref = null)
        template frame (parent = accumulator1, parentref = null)
        accumulator2 (parent = accumulator1, parentref = firstparent)

Limits on expansion depth and loop detection

 * MediaWiki limits the expansion depth to 40 by default, as xdebug limits the stack depth to 100 (see DefaultSettings)
 * Browsers seem to support stacks 500+ levels deep though, so tail-call optimization for callback chains is not really needed
 * loop detection: don't expand parent titles in children
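The loop-detection rule (never expand a title that an ancestor frame is already expanding) together with the depth limit could be sketched as follows; the class and method names are hypothetical:

```javascript
// Sketch of per-frame loop detection and depth limiting: each template
// frame records its title and parent, and a child frame for a title
// already on the ancestor chain is refused. Illustrative names only.
const MAX_DEPTH = 40; // MediaWiki's default expansion depth limit

class Frame {
    constructor(title, parent) {
        this.title = title;
        this.parent = parent || null;
        this.depth = parent ? parent.depth + 1 : 0;
    }
    // True if this title is already being expanded in this frame or
    // any ancestor frame.
    loopCheck(title) {
        for (let f = this; f; f = f.parent) {
            if (f.title === title) return true;
        }
        return false;
    }
    newChild(title) {
        if (this.depth + 1 > MAX_DEPTH) {
            throw new Error('expansion depth limit exceeded');
        }
        if (this.loopCheck(title)) {
            throw new Error('template loop detected: ' + title);
        }
        return new Frame(title, this);
    }
}
```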