Parsoid/Normalizations

While serializing (html2wt), Parsoid performs a number of normalizations, some behind a flag.

Most can be found in wts.normalizeDOM.js.

Default
These are the normalizations that Parsoid performs by default.
 * Tag minimization ( / tags)
 * Serialize invalid  tags to text
 * Enforce single-line context (in headings and lists)
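
As a rough illustration of tag minimization, the sketch below merges adjacent sibling elements that share a tag name. It is not Parsoid's implementation: the mini-DOM shape (`{ tag, children }`), the helper names `minimizeTags` and `toHtml`, and the default set of mergeable tags (`b` and `i`) are all assumptions made for this example.

```javascript
// Hypothetical mini-DOM: an element is { tag, children }, a text node is a string.
// The default mergeable set (b, i) is an assumption for illustration only.
function minimizeTags(nodes, mergeable = new Set(['b', 'i'])) {
  const out = [];
  for (const node of nodes) {
    if (typeof node === 'string') {
      out.push(node);
      continue;
    }
    // Minimize the subtree first, then try to merge with the previous sibling.
    const current = { tag: node.tag, children: minimizeTags(node.children, mergeable) };
    const prev = out[out.length - 1];
    if (prev && typeof prev !== 'string' &&
        prev.tag === current.tag && mergeable.has(current.tag)) {
      // Fold this element's children into the previous sibling and re-minimize,
      // since the merge may have made nested elements adjacent.
      prev.children = minimizeTags(prev.children.concat(current.children), mergeable);
    } else {
      out.push(current);
    }
  }
  return out;
}

// Serialize the mini-DOM back to an HTML string (no escaping, illustration only).
function toHtml(nodes) {
  return nodes
    .map((n) => (typeof n === 'string' ? n : `<${n.tag}>${toHtml(n.children)}</${n.tag}>`))
    .join('');
}

console.log(toHtml(minimizeTags([
  { tag: 'i', children: ['foo'] },
  { tag: 'i', children: ['bar'] },
])));
```

Elements separated by a text node, or with a tag outside the mergeable set, are left untouched in this sketch.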

scrub_wikitext
These normalizations are enabled if the scrub_wikitext parameter is passed to the Parsoid API. They work around issues in Parsoid, VE, and other clients, and are a simpler way of generating clean wikitext (at least for now).
 * Strip empty headings and style tags (only performed on new nodes)
 * Tag minimization ( tags, when at least one is new)
 * Whitespace at the start of paragraphs
 * New links that end in spaces
 * New table cells starting with escapable prefixes
 * Force category links and behaviour switches to serialize before/after headings (only performed on new nodes)
 * Strip  tags in headers (Parsoid introduces these in some paragraphs, and they stick around when the paragraphs are converted to headings in VE)
 * Strip trailing <nowiki/> from wikitext lines (this one will be unnecessary once Parsoid stops introducing these)
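
A client might opt in to these normalizations roughly as sketched below. This only builds a request description and sends nothing. The helper name `buildHtml2wtRequest`, the endpoint path, and the choice to send `scrub_wikitext` as a JSON body field are assumptions for illustration; check them against the Parsoid API documentation for your deployment.

```javascript
// Build (but do not send) an html2wt request that opts in to scrubbing.
// Assumptions: the endpoint path and the JSON body shape shown here.
function buildHtml2wtRequest(baseUrl, domain, html, scrub) {
  const body = { html };
  if (scrub) {
    body.scrub_wikitext = true; // parameter named after this section's heading
  }
  return {
    url: `${baseUrl}/${domain}/v3/transform/html/to/wikitext/`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

const req = buildHtml2wtRequest('http://localhost:8000', 'en.wikipedia.org', '<p> foo</p>', true);
console.log(req.url, req.options.body);
```

The returned `url` and `options` could then be handed to `fetch(req.url, req.options)`.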

Tag minimization ( / tags)
and

Force category links and behaviour switches to serialize before/after headings
and

Serialize invalid  tags to text
and

Enforce single-line context
and

However, newlines in transclusion parameters are preserved.
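
The rule can be sketched on plain text as follows. Parsoid applies it to the DOM while serializing, so this string-based version is only an approximation. The function name `enforceSingleLine` is hypothetical, and replacing each newline with a single space is an assumption of this sketch.

```javascript
// Collapse newlines to spaces, except inside {{...}} transclusions.
// Simplification: only double braces are tracked; {{{...}}} is not handled.
function enforceSingleLine(wikitext) {
  let depth = 0; // current transclusion nesting depth
  let out = '';
  for (let i = 0; i < wikitext.length; i++) {
    const two = wikitext.slice(i, i + 2);
    if (two === '{{') {
      depth++;
      out += two;
      i++; // skip the second brace
      continue;
    }
    if (two === '}}' && depth > 0) {
      depth--;
      out += two;
      i++;
      continue;
    }
    const ch = wikitext[i];
    // Newlines inside a transclusion are preserved.
    out += ch === '\n' && depth === 0 ? ' ' : ch;
  }
  return out;
}

console.log(JSON.stringify(enforceSingleLine('a {{tpl|x=1\n|y=2}}\nb')));
```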

Strip empty headings and style tags
Normally, but with scrubbing it's all dropped.

Tag minimization ( tags)
and

Whitespace at the start of paragraphs
These nowikis are to prevent roundtripping as preformatted text.
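
A minimal sketch of the two behaviours, assuming that scrubbing drops the leading whitespace outright while the default path wraps it in a nowiki. The function name `serializeParagraphStart` and the exact nowiki form it emits are illustrative and may differ from Parsoid's real output.

```javascript
// In wikitext, a line that starts with a space renders as preformatted text,
// so leading whitespace must be either escaped or removed.
function serializeParagraphStart(text, scrubWikitext) {
  const match = /^[ \t]+/.exec(text);
  if (!match) {
    return text; // nothing to protect
  }
  const rest = text.slice(match[0].length);
  // Assumption: scrubbing strips the whitespace; otherwise it is wrapped in a nowiki.
  return scrubWikitext ? rest : `<nowiki>${match[0]}</nowiki>${rest}`;
}

console.log(serializeParagraphStart(' foo', false));
console.log(serializeParagraphStart(' foo', true));
```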

New links that end in spaces
The nowiki here is to prevent link trails.
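
The sketch below shows one way this could work, assuming that scrubbing moves the trailing whitespace out of the link so that no nowiki is needed. The function name `serializeNewLink`, its parameters, and the lowercase-ASCII link trail test are hypothetical simplifications, not Parsoid's actual logic.

```javascript
// Serialize a new link followed by plain text.
// target: page title, text: link label, following: text right after the link.
function serializeNewLink(target, text, following, scrubWikitext) {
  const trimmed = text.replace(/\s+$/, '');
  const trailing = text.slice(trimmed.length);
  const link = (label) => (label === target ? `[[${target}]]` : `[[${target}|${label}]]`);
  // Assumption: an English-style link trail made of lowercase ASCII letters.
  const startsTrail = /^[a-z]/.test(following);
  if (scrubWikitext && trailing) {
    // The whitespace now sits between the link and the text, so no trail can form.
    return link(trimmed) + trailing + following;
  }
  // Default: keep the label as is and block the link trail with a nowiki.
  return link(text) + (startsTrail ? '<nowiki/>' : '') + following;
}

console.log(serializeNewLink('Foo', 'bar ', 'baz', false));
console.log(serializeNewLink('Foo', 'bar ', 'baz', true));
```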

New table cells starting with escapable prefixes
// normally serializes to

// but with scrubbing becomes

Related links

 * w:he:WP:VE/nowiki
 * w:fr:Wikipédia:ÉditeurVisuel/Avis/Nowiki