Parsoid/PostProcessor:DOM Tag Minimization

From mediawiki.org
[Figure: Parsoid DOM tag minimization, algorithm sketch]

The image on the right is a thumbnail of the paper sketch of the algorithm currently implemented in Parsoid. The algorithm runs as a post-processor that minimizes tag use by maximizing tag overlap and merging adjacent identical tags. The sketch is the best way to understand the algorithm. It is currently applied to a set of four HTML tags (B, I, U, and S), but can be extended to other inline tags.

Example 1:

<b><i>BI</i></b><i>I</i>

gets restructured to:

<i><b>BI</b>I</i>

Example 2:

<b><i><u>BIU</u></i></b><u><i>UI</i></u><i>I</i>

gets restructured to:

<i><u><b>BIU</b>UI</u>I</i>
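
The restructuring in the two examples above can be approximated with a short sketch. This is not Parsoid's implementation, which operates on DOM nodes and follows the paper sketch. It is a simplified greedy illustration that assumes the input has already been flattened into a list of text runs, each carrying the set of formatting tags that apply to it. The names `Run` and `minimize` are hypothetical. At each step the sketch wraps the longest stretch of adjacent runs that share a tag, then recurses inside and on either side of that stretch.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: (formatting tags applied to the text, the text).
Run = Tuple[FrozenSet[str], str]


def minimize(runs: List[Run]) -> str:
    """Serialize runs, greedily wrapping the longest stretch that shares a tag."""
    best = None  # (length, tag, start, end) of the longest stretch found so far
    tags = sorted(set().union(*(t for t, _ in runs))) if runs else []
    for tag in tags:
        i = 0
        while i < len(runs):
            if tag in runs[i][0]:
                # Extend the stretch while neighbouring runs also carry this tag.
                j = i
                while j < len(runs) and tag in runs[j][0]:
                    j += 1
                if best is None or j - i > best[0]:
                    best = (j - i, tag, i, j)
                i = j
            else:
                i += 1

    if best is None:
        # No formatting left: emit the plain text.
        return "".join(text for _, text in runs)

    _, tag, start, end = best
    # Remove the chosen tag from the covered runs; it is now emitted once outside them.
    inner = [(t - {tag}, text) for t, text in runs[start:end]]
    return (
        minimize(runs[:start])
        + f"<{tag}>" + minimize(inner) + f"</{tag}>"
        + minimize(runs[end:])
    )


example_1 = [(frozenset("bi"), "BI"), (frozenset("i"), "I")]
example_2 = [
    (frozenset("biu"), "BIU"),
    (frozenset("ui"), "UI"),
    (frozenset("i"), "I"),
]
print(minimize(example_1))  # <i><b>BI</b>I</i>
print(minimize(example_2))  # <i><u><b>BIU</b>UI</u>I</i>
```

Choosing the longest stretch first is a heuristic: it reproduces both examples, but it is not guaranteed to yield the smallest possible tag count for every input.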