Parsoid/Internals/data-parsoid

This is a first cut at documenting the data-parsoid flags that Parsoid uses for clean round-tripping. This information is purely Parsoid-internal and can change without notice. Clients that rely on any of these fields for functionality may break when we change them.

Temporarily in data-parsoid, but not in final DOM output
tsr: Tag widths for all tokens (from tokenizer)

tagWidths: Widths of the opening and closing tags for extension tags.

Proposal: Make these temporary properties used till we lint the HTML
autoInsertedStart: whether this start HTML tag has no corresponding wikitext and was auto-inserted to generate well-formed HTML. This usually happens when the treebuilder fixes up badly nested HTML.

autoInsertedEnd: whether this end HTML tag has no corresponding wikitext and was auto-inserted to generate well-formed HTML. Ex: elements that have no explicit closing markup, or HTML tags that aren't closed.

Proposal: Remove from data-parsoid and rely on selser to preserve syntax variations
selfClose: whether a void tag is self-closed in the source.

noClose: set for void tags that are not self-closed.

brokenHTMLTag: used to round-trip broken HTML tags back to their original source form.

srcTagName: source tag name for HTML tags (records case variations).

startTagSrc, endTagSrc, attrSepSrc: source for start/end/attribute-text separators (used in table wikitext). Ex:
 * |foo || bar
 * | foo
 * |style='color:red;' | foo || bar

pipetrick: true if the link was a pipe trick. Ex: Foo (NOTE: This will likely be removed soon, since the pipe trick is a pre-save transformation and should not show up in saved wikitext.)

Proposal: Maybe move to data-mw?
stx_v: "row" - set for td/th cells that show up on the same line. Ex: |foo ||bar ||baz (Maybe use stx: for this as well)

stx:
 * "html" - set for html tags. Ex: foo
 * "row" - set for dt/dd that show on the same line. Ex: ";a:b" vs ";a\n:b"
 * "piped" - set for piped wikilinks with explicit content. Ex: bar vs Foo
 * "magiclink" - set for magic links (RFC/PMID/ISBN). Ex: RFC 1234, ISBN 1234567890 (Not needed anymore?)
 * "url" - set for URL links. Ex: http://google.com (Not needed anymore?)
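
To illustrate how a flag like stx: "row" can drive round-tripping, here is a minimal sketch for the dt/dd case listed above. The function name serialize_definition and the plain-dict representation of data-parsoid are hypothetical and exist only for this example; this is not Parsoid's actual serializer.

```python
def serialize_definition(term: str, description: str, dd_data_parsoid: dict) -> str:
    """Serialize one dt/dd pair back to wikitext.

    Hypothetical helper: assumes the dd's data-parsoid is a plain dict and
    that stx == "row" means the dd appeared on the same line as its dt.
    """
    if dd_data_parsoid.get("stx") == "row":
        # Same-line form: ";a:b"
        return f";{term}:{description}"
    # Separate-line form: ";a\n:b"
    return f";{term}\n:{description}"
```

Without the flag, a serializer would have to pick one of the two forms and could introduce a dirty diff on otherwise unedited content.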

Still needed
dsr: DOM source ranges computed on DOM nodes, as (start, end, start-tag width, end-tag width)

src: used to emit original wikitext in some scenarios (entities, placeholder spans)
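
As a rough illustration of the four dsr values, the sketch below slices a node's original wikitext into its opening markup, content, and closing markup. The helper split_by_dsr and the SourceParts type are hypothetical, and the example assumes that the offsets index directly into a Python string; the offset units Parsoid actually uses may differ.

```python
from typing import NamedTuple, Sequence


class SourceParts(NamedTuple):
    open_tag: str
    content: str
    close_tag: str


def split_by_dsr(wikitext: str, dsr: Sequence[int]) -> SourceParts:
    """Split the source covered by a dsr into open tag, content, close tag.

    Hypothetical helper: dsr is (start, end, start-tag width, end-tag width),
    and offsets are assumed to be character offsets into `wikitext`.
    """
    start, end, open_width, close_width = dsr
    if not 0 <= start <= end <= len(wikitext):
        raise ValueError("dsr range lies outside the source text")
    if open_width < 0 or close_width < 0 or open_width + close_width > end - start:
        raise ValueError("tag widths do not fit inside the dsr range")
    return SourceParts(
        open_tag=wikitext[start:start + open_width],
        content=wikitext[start + open_width:end - close_width],
        close_tag=wikitext[end - close_width:end],
    )


# Italic markup inside a larger string: the node covers offsets 2..9,
# with two-character opening and closing quotes.
parts = split_by_dsr("a ''foo'' b", [2, 9, 2, 2])
```

Here parts.content holds the text between the quote markers, which is the kind of slice that lets unmodified source be reused verbatim.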