Parsoid/Internals/data-parsoid

Temporarily in data-parsoid, but not in final DOM output
tsr: Tag widths for all tokens (from tokenizer)

extTagWidths: Width of opening and closing tags for extension tags. Ex: ,

Proposal: Make these temporary properties used till we lint the HTML (instead of emitting to final DOM output)
autoInsertedStart: whether this start HTML tag has no corresponding wikitext and was auto-inserted to generate well-formed HTML. This usually happens when the treebuilder fixes up badly nested HTML.

autoInsertedEnd: whether this end HTML tag has no corresponding wikitext and was auto-inserted to generate well-formed HTML. Ex:, , , , etc. that have no explicit closing markup, or HTML tags that aren't closed.

Proposal: Remove from data-parsoid and rely on selser to preserve syntax variations
selfClose: are void tags self-closed? (ex: vs )

noClose: void tags that are not self-closed (ex:  )

brokenHTMLTag: used to round-trip these kinds of tags: or  or

srcTagName: source tag name (records case variations) for HTML tags. Ex: vs  vs 

startTagSrc, endTagSrc, attrSepSrc: source for start/end/attribute-text separators (used in table wikitext)
 * |foo || bar
 * |foo || bar
 * | foo
 * |style='color:red;' | foo || bar

pipetrick: true if the link was a pipetrick Foo (NOTE: This will likely be removed soon. It should not show up in saved wikitext, because the pipe trick is a pre-save transformation.)

Proposal: Maybe move to data-mw?
stx_v: "row"  set for td/th cells that show up on the same line. Ex: |foo ||bar ||baz (Maybe use stx: for this as well)

stx:
 * "html" - set for html tags. Ex: foo
 * "row" - set for dt/dd that show on the same line. Ex: ";a:b" vs ";a\n:b"
 * "piped" - set for piped wikilinks with explicit content Ex: bar vs Foo
 * "magiclink" - set for magic links (RFC/PMID/ISBN) Ex: RFC 1234, ISBN 1234567890 (Not needed anymore?)
 * "url" - set for url links Ex: http://google.com (Not needed anymore?)

CSA: possibly for the future: add "stx_v" option to list items w/ an intervening double-newline (useful for talk page comments)

Required properties
dsr: Wikitext source ranges that generated this DOM node (start-offset, end-offset, start-tag-width, end-tag-width).

Consider input wikitext:. Let us look at the  part of the input. It generates. The dsr property of the data-parsoid attribute of this i-tag tells us the following. This HTML node maps to input wikitext substring. The opening tag  was 2 characters wide in wikitext and the closing tag   was also 2 characters wide in wikitext.
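
As a rough illustration of how the four dsr values can be read, the following Python sketch slices a wikitext string using a dsr array. The sample wikitext, the offsets, and the helper name split_by_dsr are assumptions made for this example and are not taken from Parsoid. The sketch also assumes the offsets can be used directly as string indices.

```python
def split_by_dsr(wikitext, dsr):
    """Split the source range described by dsr into (open tag, content, close tag).

    dsr is [start-offset, end-offset, start-tag-width, end-tag-width].
    """
    start, end, open_width, close_width = dsr
    source = wikitext[start:end]
    content_end = len(source) - close_width
    open_tag = source[:open_width]
    content = source[open_width:content_end]
    close_tag = source[content_end:]
    return open_tag, content, close_tag


# Hypothetical input: an italic span written with two-character quote markup.
wikitext = "foo ''bar'' baz"
# Hypothetical dsr for the i-tag: offsets 4 to 11, tags 2 characters wide each.
dsr = [4, 11, 2, 2]
open_tag, content, close_tag = split_by_dsr(wikitext, dsr)
```

Under these assumptions, the first two values select the whole source substring for the node, and the two widths separate the markup from the content inside it.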

src: used to emit original wikitext in some scenarios (entities, placeholder spans)

tail: link trail source (Ex: the "l" in )

prefix: link prefix source

Other properties
a and sa: used when the attribute source and rendering differ, in which case "a" contains the rendered attribute and "sa" contains the source attribute. When transforming HTML back to wikitext, "a" is used to check whether the content of the attribute has been modified. If it has not, "sa" is used to reserialize the attribute exactly as it was in the original wikitext (avoiding a dirty diff).

Example:  gets rendered to
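
The comparison described above can be sketched in Python as follows. This is only an illustration of the idea, not Parsoid's serializer: the function name attribute_source, the dictionary layout, and the sample href values are all hypothetical.

```python
def attribute_source(name, current_value, a, sa):
    """Pick the text to serialize for attribute `name`.

    a:  rendered attribute values recorded at parse time
    sa: source attribute values recorded at parse time
    """
    if name in a and name in sa and current_value == a[name]:
        # The attribute is unmodified since parsing: reuse the original source.
        return sa[name]
    # The attribute was edited (or never recorded): serialize the current value.
    return current_value


# Hypothetical recorded values for a link target.
a = {"href": "./Foo_bar"}
sa = {"href": "foo bar"}

unchanged = attribute_source("href", "./Foo_bar", a, sa)  # value still matches "a"
edited = attribute_source("href", "./Baz", a, sa)         # value was changed by an editor
```

Here the unchanged attribute falls back to its recorded source text, while the edited one is serialized from its new value.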

pi: stands for "parameter info". When processing a template, this property records the name information of each parameter, whether named or not, together with the whitespace surrounding the parameter and its value, if any. It is used when transforming HTML back to wikitext, where we want to keep the parameter order, names and spacing to avoid dirty diffs. This works with the  property in  :   stores the non-semantic information used to choose a specific formatting of the parameters, and   stores the semantic information needed for editing parameters.

Example: gets rendered to

The   property in   array elements is a 4-element array that captures the whitespace seen around the   pair in the transclusion. Elements 0 and 1 are the spaces before and after the parameter name. Elements 2 and 3 are the spaces before and after the parameter value.
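
A minimal sketch of how such a 4-element whitespace array could be used when reserializing a named parameter is shown below. It assumes that elements 0 and 1 surround the parameter name and elements 2 and 3 surround the value, as suggested by the description of pi above. The function name serialize_param and the sample values are hypothetical.

```python
def serialize_param(name, value, spacing):
    """Rebuild 'name=value' using the recorded whitespace.

    spacing is a 4-element list:
    [before name, after name, before value, after value].
    """
    before_name, after_name, before_value, after_value = spacing
    return f"{before_name}{name}{after_name}={before_value}{value}{after_value}"


# Hypothetical whitespace recorded for one template parameter.
spacing = [" ", " ", " ", "\n"]
param = serialize_param("title", "Hello", spacing)
```

Keeping the whitespace separate from the name and value lets an editor change the value without disturbing the original formatting of the transclusion.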