Parsoid/DOM notes

This document discusses the well-formedness and well-balancedness requirements on template output, in the context of efficient editability within VE and efficient reparsing of pages in Parsoid.

Editability

 * Edit transclusions in VE without using wikitext.
 * Easily preview updated transclusions in VE after edits.

Performance

 * Reuse prior transclusions from cache.
 * If possible, have it be a drop-in replacement rather than requiring further processing (currently, it gets placeholder tokens in the stream and goes through handlers, and is unpacked in the end).

What gets in the way
There are two problems that make this difficult: (a) wikitext templates need not output well-formed DOM trees, and (b) even when the output is well-formed, it may interact with the page in ways that modify the structure of the page beyond the insertion point (because of nesting constraints implemented in the HTML5 parser). Problem (a) is discussed below, and problem (b) is discussed in the section on efficient re-rendering on edits.

Wikitext templates are all string-based rather than DOM-based.
 * Template output need not be a well-formed DOM tree. It can affect an arbitrary amount of surrounding context in the page in which it is included.
 * Often, a mix of (1 or more) transclusions, top-level page wikitext, and maybe extension output together form a well-formed DOM subtree.

Editability

 * With mixed transclusion output and page content, the page content in the mixed output has to be edited in wikitext mode (currently not supported, and hence uneditable).
 * When transclusion args are changed, the changes can leak out of the DOM structure to other sections, and in the worst case, an entire section might have to be re-rendered (assuming sections are hard boundaries for well-balanced trees).

Performance

 * Hampers ability to reuse HTML of prior expansions.

Longer-term strategy for fixing templates
One or more of the following:
 * Gradually move to DOM-based templates.
 * Consider separating data from presentation. For example in large tables, there is a lot of repetitive wikitext that serves no purpose except to introduce syntax errors, foster-parentable content, etc.
 * Use new extension tags to wrap transclusions and other output that collectively produce a well-formed DOM tree but individually do not.
 * For the rest, enforce well-formedness of output from text-based templates that are seen bare on the page. So, if a template produces foo inside an element that it leaves unclosed, modify the output by inserting the missing closing tag.
 * Maybe provide new wikitext sugar/tools/syntax that makes it easier for template authors to write templates that produce well-formed DOM trees.
 * Edit templates and use bots to fix uses where possible to minimize cases that require wrapping.
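
As a rough illustration of the "insert the closing tag" idea above, the following minimal sketch appends closing tags for any elements a fragment leaves open. It is not how Parsoid does this; the names `OpenTagTracker` and `close_open_tags` and the `<div>` input are assumptions made up for the example, and Python's `html.parser` is a lenient tokenizer, not an HTML5 tree builder. The sketch only closes elements left open at the end of the output and does not repair misnested tags.

```python
from html.parser import HTMLParser

# HTML void elements never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class OpenTagTracker(HTMLParser):
    """Tracks elements that have been opened but not yet closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the nearest matching open element, dropping anything
        # left open inside it. Stray closing tags are ignored.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def close_open_tags(fragment: str) -> str:
    """Append closing tags for every element the fragment leaves open."""
    tracker = OpenTagTracker()
    tracker.feed(fragment)
    tracker.close()
    closers = "".join(f"</{tag}>" for tag in reversed(tracker.open_tags))
    return fragment + closers


print(close_open_tags("<div>foo"))      # <div>foo</div>
print(close_open_tags("<div><b>foo"))   # <div><b>foo</b></div>
```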

Considerations

 * Acceptability of solution by editors
 * What do we do about all the old revisions which we cannot go in and edit / fix / wrap in extension tags?

Efficient re-rendering on edits
For this section and the rest of this document, let us assume that the output of a template is always a well-formed DOM fragment (a forest of adjacent DOM trees).

Given a page P, let F be a DOM fragment that corresponds to the transclusion of a template T. Let N be the container node within which F gets inserted. There are three edit scenarios that we have to consider:
 * 1) Page P is edited to P'. Output of F is unchanged. How do we reuse F from P when parsing P'? This is the common workflow for Parsoid on page edits.
 * 2) Page P is unchanged. Template T that produces F is changed which now produces F'. How do we now re-render P to incorporate F'?
 * 3) Page P is edited in VE. Parameters to T are modified in VE which changes F to F'. How does VE re-render P to incorporate F'?

In Parsoid, when reparsing P, currently F is converted to representative wrapper tokens which then participate in various transformations (indent-pre wrapping, list creation, p-wrapping, etc). During post-processing of the DOM, F is unwrapped and inserted into N. This technique will let us handle scenarios 1 and 2. But, without additional guarantees/constraints on F' and the container node N, VE won't be able to just take F' and drop that in place on the client-side. In the worst case, it will require a serialize + reparse to get HTML nesting constraints (as implemented in the HTML5 parser) exactly right.

In general, the acceptability criterion is whether P' == HTML5.parse((P' = P.replace(F, F')).outerHTML). If yes, then F' can be dropped into N (both in Parsoid and VE) without any additional analysis or transformations. But, this check, while sufficient, is quite expensive and unrealistic to do on every template edit. We can improve on this by defining DOM scopes and exploiting them and/or by enforcing additional constraints on template output (F) and on nodes where they can be used (N). Both solutions are discussed below.
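
The round-trip criterion above can be sketched as follows. This is only an illustration under stated assumptions: `can_drop_in` is a hypothetical helper, pages and fragments are modelled as HTML strings, and the `reparse` argument stands for any HTML5-compliant parse-then-serialize function. `ToyLinkReparser` is a deliberately tiny stand-in that models a single nesting rule (links cannot nest) so the example runs with the standard library; it is not an HTML5 parser.

```python
from html.parser import HTMLParser
from typing import Callable


def can_drop_in(page_html: str, old_fragment: str, new_fragment: str,
                reparse: Callable[[str], str]) -> bool:
    """True if swapping the fragment survives a parse/serialize round trip."""
    if old_fragment not in page_html:
        raise ValueError("old fragment not found in page")
    candidate = page_html.replace(old_fragment, new_fragment, 1)
    return reparse(candidate) == candidate


class ToyLinkReparser(HTMLParser):
    """Toy stand-in: re-serializes input, un-nesting <a> inside <a>."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            if self.in_link:
                self.out.append("</a>")  # implicitly close the outer link
            self.in_link = True
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a":
            if not self.in_link:
                return  # drop the closing tag that is now stray
            self.in_link = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def toy_reparse(html_text: str) -> str:
    parser = ToyLinkReparser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


page = '<p><a href="/wiki/X">see <span>old</span></a></p>'
old = "<span>old</span>"
print(can_drop_in(page, old, "<span>new</span>", toy_reparse))            # True
print(can_drop_in(page, old, '<a href="/wiki/Y">new</a>', toy_reparse))   # False
```

In the second call the reparse splits the outer link into two sibling links, so the serialized result no longer matches the candidate page and the fragment cannot simply be dropped in.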

Approach 1: Re-rendering DOM scopes where necessary
If we introduce a notion of self-contained DOM scopes in a page P (within which all DOM trees are balanced and changes don't leak out to surrounding scopes), then VE would have to (either itself or with Parsoid's help) re-render the closest enclosing DOM scope that contains N, the container node for F (and F').

A DOM scope is a forest of adjacent top-level DOM trees within a page P that are balanced to full DOMs on their own independent of surrounding context, and for which, replacement DOMs can be dropped-in without any analysis or transformations.

For example, wikitext sections are natural independent DOM scopes within a page. But, more generally, all direct children of the <body> tag of P could also act as natural DOM scopes under certain conditions. For the purposes of this discussion, let us only consider wikitext sections to be DOM scopes.
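
Locating the scope to re-render could look roughly like the sketch below. It assumes, purely for illustration, that each wikitext section is wrapped in a `<section>` element and that the page is available as well-formed markup that `xml.etree.ElementTree` can load; `enclosing_scope` is a hypothetical helper, not an existing Parsoid or VE function.

```python
import xml.etree.ElementTree as ET


def enclosing_scope(root, node, scope_tag="section"):
    """Return the closest ancestor-or-self of node with tag scope_tag, or None."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    current = node
    while current is not None:
        if current.tag == scope_tag:
            return current
        current = parents.get(current)
    return None


page = ET.fromstring(
    "<body>"
    '<section id="s1"><p>intro</p></section>'
    '<section id="s2"><ul><li id="n">item</li></ul></section>'
    "</body>"
)
container = page.find(".//li[@id='n']")        # plays the role of N
print(enclosing_scope(page, container).get("id"))   # s2
```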

Currently, Parsoid treats certain kinds of extension content, image captions, and link targets as independent DOM balancing contexts. However, they can still affect surrounding page context depending on the DOM tree ancestor nodes within which the output is inserted. For example, even after balancing the output of a link target independently, if it contains an <a>, it causes a restructuring of the parent <a> and introduces new sibling nodes there. So, for the purpose of DOM fragment reuse, in the general case, it is not possible to guarantee drop-in replacement except for top-level nodes of P (children of <body>). However, in certain constrained contexts and with some knowledge about the DOM fragment F (or F'), and its container node N, we can do better than that.

Examples where template edits localize changes in N:
 * 1) N =  and F' =
 * 2) N =  and F' = 
 * 3) N =  and F' = plain text

Examples where template edits can cause DOM changes outside N:
 * 1) Attempting to add a wikilink in the link text of another link (extlink, wikilink).
 * 2) Attempting to insert a template that produces an ordered list item inside an unordered list  (Ex:  )
 * 3) Editing a transclusion that used to produce a single list item to now produce a list item and plain text on a new line. (Ex:  )

In these cases, VE would then have to request Parsoid to re-render the enclosing DOM scope (or VE would have to do this itself on the client side). As long as we always have the fallback solution of dealing with the enclosing DOM scope for the DOM fragment F (or F', as the case may be), both Parsoid and VE can then use the simple drop-in solution in certain scenarios.

The acceptability of this solution will crucially depend on the proportion of template-edits where VE can efficiently determine whether it can replace F with  F'  directly. If a large number of templates generate sensible output and edits in VE don't introduce drastic changes to the HTML, it should indeed be possible to do this. In scenarios where it is not possible to do this, the performance penalties (slow updates on edit) of using non-conforming templates or using templates in non-conforming ways should provide feedback for template authors and users to minimize/modify such uses.

This solution could then be an interim step on the way to stricter enforcement of constraints on what templates produce and where they are used.

Approach 2: Enforcing nesting constraints on template output
Since it is expensive to test HTML5 nesting constraints on every re-render (for any of the three scenarios listed earlier in this section), an alternative approach is to identify suitable constraints that can be applied to template output independently (without accounting for transclusion context) and enforce them. This could work in a number of different ways.

One strategy is to record attributes about template output F (ex: plain-text, inline-elements-only, has-block-tag, list-item-only, generates-a-tags) and use those attributes to enforce constraints on the container node N in which it will be used. So, if F has the generates-a-tags flag, and N is an a-tag, then the template cannot be used inside N. In the context of VE, VE might flag an error to the user, preventing the edit from taking place. For example, if N is an a-tag, VE might prevent any edit on F that generates an a-tag inside it (because that violates HTML5 nesting constraints). However, in the context of reparsing a page on page edit (scenario 1), it is a bit more complicated since wikitext edits are still string-based and could violate constraints willy-nilly. How do we enforce constraints when P' has to be re-rendered?
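
A minimal sketch of this flag-based strategy follows. The flag names are taken from the list above (list-item-only is omitted for brevity); everything else is an assumption for illustration: the helper names `output_flags` and `allowed_in`, the small `BLOCK_TAGS` subset, and the `FORBIDDEN` table, which encodes just two nesting rules (no link inside a link, no block element inside a paragraph). A real constraint table would need to be derived from the HTML5 parsing rules.

```python
from html.parser import HTMLParser

# Illustrative subset of elements treated as block-level here.
BLOCK_TAGS = {"blockquote", "div", "ol", "p", "pre", "ul"}

# Hypothetical constraint table: container tag -> flags it cannot accept.
FORBIDDEN = {
    "a": {"generates-a-tags"},
    "p": {"has-block-tag"},
}


class TagCollector(HTMLParser):
    """Collects the names of all start tags in a fragment."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def output_flags(fragment: str) -> set:
    """Compute a small set of descriptive flags for template output F."""
    collector = TagCollector()
    collector.feed(fragment)
    collector.close()
    flags = set()
    if not collector.tags:
        flags.add("plain-text")
    elif collector.tags & BLOCK_TAGS:
        flags.add("has-block-tag")
    else:
        flags.add("inline-elements-only")
    if "a" in collector.tags:
        flags.add("generates-a-tags")
    return flags


def allowed_in(container_tag: str, fragment: str) -> bool:
    """True if no flag of the fragment is forbidden inside the container N."""
    return not (output_flags(fragment) & FORBIDDEN.get(container_tag, set()))


print(allowed_in("a", '<a href="/wiki/Y">y</a>'))   # False
print(allowed_in("a", "<b>y</b>"))                   # True
print(allowed_in("p", "<div>x</div>"))               # False
```

Because the flags depend only on F, they could be computed once per expansion and stored alongside the cached output, leaving only the cheap table lookup to be done at the insertion point.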

Another strategy would be to expand the scope of template wrapping to include page content that would have to be modified/edited as a unit, which can then be drop-in replaced as a unit. This unit could roughly correspond to a DOM scope as in the previous section, but the analysis could identify a smaller scope on which we can guarantee drop-in replacement in the face of edits. But, this will still require enforcing constraints on what kind of edits can take place within the scope.

TODO: Work through some examples.

TODO: Discuss potential issues and difficulties. (Ex: echo template; wikitext edits outside VE can violate constraints arbitrarily, so constraint enforcement doesn't provide good options in a drop-in scenario. We might have to actually rely on wrapper tokens to re-render and modify encapsulation.)

TODO: Discuss advantages