Parsoid/Parser Unification/Pixel Diff Testing Stats

To compare rendering fidelity between Parsoid and the core parser, we are running pixel diff tests on a subset of pages from various wikis and monitoring progress over time. We are starting small (~25K pages across ~20 wikis) and will eventually expand the test set. In these initial stages, we are likely to see a number of false positives as we iron out wrinkles in the testing infrastructure.

Known issues and testing-related workarounds

 * Test timeouts: If the differences between core parser rendering and Parsoid rendering on a page are large, the diffing algorithm (uprightdiff) might take too long or use too much memory. Each test is given a fixed time to complete (~5 mins), so in that case the test run on that page will not complete. That is why you see a < 100% test completion rate. As we fix sources of diffs, the completion rate should naturally improve, since with fewer diffs the diffing algorithm is more likely to run to completion.
 * Unstyled Parsoid output: Parsoid's output is unstyled. The testing infrastructure loads the Vector skin styles and applies them to the output. While this is now mostly working, there may still be areas where the right styles do not apply because of HTML structure differences between Parsoid and legacy HTML. The HTML output for media is one example. The legacy parser will soon be updated to emit Parsoid-compatible HTML structures for media, which will eliminate some of the CSS diffs we currently see in these visualdiff test runs.
 * Known Cite CSS diffs: Parsoid generates identical HTML for refs and references across all wikis and relies on CSS to vary the styling per wiki. On the core parser side, the Cite extension itself generates varied output for different wikis and is not CSS-based. Parsoid's approach is better suited for editing clients, but we haven't yet done the work of identifying the precise CSS needed to emulate the output on all wikis. This work is captured on Phabricator. Pixel-diff testing will let us isolate these differences and fix up the CSS.
 * Known missing JS modules in Parsoid output: The testing infrastructure attempts to expand all collapsed content on pages before comparing output. Parsoid's output is currently uncollapsed by default (because of missing JS modules in Parsoid's output). Some JS scripts attempt to expand collapsed sections, but this is currently error-prone and doesn't capture all collapsed content, so we get large false positive differences when the collapsed state differs between the two screenshots.
 * Other workarounds / issues:
 * span wrappers around various pieces of content (entities, display-space for frwiki, nowiki, template content text nodes for about-id continuity) cause minor pixel-level discrepancies in rendering that human readers won't notice but that introduce a lot of noise in visual diffing. This appears to be new, showing up since Sep 2021 after some library upgrades on the server.
 * wrappers need to be removed from Parsoid output for some CSS query selectors to apply.
 * Cite errors are handled differently in Parsoid and core.
 * jsconfig vars added by extensions are added all at once in the legacy parser but one tag at a time in Parsoid, so in Parsoid the last tag instance overwrites everything else.
 * enwiki:World Flags has a number of  uses which generate an HTML structure identical to the auto-generated table of contents, so when we suppress TOCs in visual diff testing, all of these get suppressed as well, causing visual diffs.
 * On enwiki:Podgorica, there is an extra space before [citation needed] in Parsoid HTML but not in legacy HTML. Parsoid's output matches what is seen in source ( -- see the space after the period there). It is unclear why the legacy parser strips this space from its output.
 * Support for Special:Prefixindex is missing in Parsoid.
 * SOL (start-of-line) state in extensions: Parsoid treats content in extensions as having SOL state, but the legacy parser doesn't. Instead, it transfers the document's SOL state to extensions, which affects how and is parsed. The former is not a list in legacy, but the latter is. Both are lists in Parsoid.
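
The test-timeout behavior described in the first item above can be sketched roughly as follows. This is a minimal illustration, not the actual test infrastructure: the function names `run_diff` and `completion_rate` are hypothetical, the diff command is passed in as a parameter rather than assuming any particular uprightdiff invocation, and only the time limit is modeled (not memory use).

```python
import subprocess
import sys


def run_diff(cmd, timeout_s):
    """Run one diff command (hypothetical helper).

    Returns True if the command finished within timeout_s seconds and
    False if it was killed for exceeding the limit.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False


def completion_rate(outcomes):
    """Percentage of pages whose diff ran to completion."""
    if not outcomes:
        return 0.0
    return 100.0 * sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Stand-in commands: one that exits immediately and one that sleeps
    # past its limit. A real run would use the diffing tool and ~5 mins.
    fast = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    outcomes = [run_diff(fast, 30), run_diff(slow, 0.5)]
    print(f"{completion_rate(outcomes):.0f}% of tests completed")
```

Under this model, a page with very large differences counts against the completion rate rather than being recorded as a diff, which is why the rate is expected to rise as sources of diffs are fixed.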

As we resolve these and other issues in the coming months, we will remove them from this list.
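
The jsconfig vars issue listed above comes down to setting a variable once with all values versus setting it once per tag. The sketch below illustrates that difference only; the function names, the `extVar` key, and the data shapes are all hypothetical and do not reflect the actual MediaWiki or Parsoid APIs.

```python
def apply_all_at_once(config, tag_values, key="extVar"):
    """Collect values from every tag instance, then set the variable once."""
    merged = []
    for values in tag_values:
        merged.extend(values)
    config[key] = merged
    return config


def apply_one_at_a_time(config, tag_values, key="extVar"):
    """Set the variable separately for each tag instance.

    Each assignment replaces the previous one, so only the values from
    the last tag instance survive.
    """
    for values in tag_values:
        config[key] = list(values)
    return config


if __name__ == "__main__":
    # Two tag instances on one page, each contributing values (made-up data).
    tags = [["a"], ["b", "c"]]
    print(apply_all_at_once({}, tags))
    print(apply_one_at_a_time({}, tags))
```

With the same input, the first function keeps the values from both tag instances, while the second keeps only those from the last one.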