Parsoid/Parser Unification/Media structure

See the FAQ for more details about the state of this project, and the project workboard for the work currently in progress.

Intro
As we take steps towards converging with, and eventually replacing, core's legacy parser, one hop along the path is unifying the structure of media output. This is proposed in T118517, which implements T51097.

Two patches make up the bulk of the work:

 * https://gerrit.wikimedia.org/r/#/c/410362/ - Move parsoid media styling to content (MERGED)
 * https://gerrit.wikimedia.org/r/#/c/507512/ - Emit media structure as piloted in Parsoid (MERGED)

The second patch is hidden behind a config flag. We have yet to add a feature flag so that we can roll this out on a per-wiki basis. T266148 tracks that, with T271129 for testing the different modes.

Work history / log
In T251641, it was decided to stop using a custom element (figure-inline) for inline media, which Parsoid had deployed several years back, and to use a <span> instead. T266143 tracks Parsoid clients adding support for both, so that Parsoid can change its output in the following patch:


 * https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/623906 - s/figure-inline/span/
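
As a rough illustration of what "support for both" could mean for a client, the sketch below accepts either wrapper element when looking for inline media. It is not taken from any actual Parsoid client. The `typeof` prefixes used to recognise media wrappers are an assumption and should be checked against the Parsoid output being consumed; `InlineMediaFinder` and `find_inline_media` are hypothetical names.

```python
from html.parser import HTMLParser

# Wrapper elements accepted for inline media: the older custom element and
# the span that replaces it.
INLINE_MEDIA_TAGS = {"figure-inline", "span"}

# Assumption: media wrappers carry a typeof token starting with one of these
# prefixes. Adjust to whatever the consumed Parsoid output actually uses.
MEDIA_TYPEOF_PREFIXES = ("mw:Image", "mw:Video", "mw:Audio", "mw:File")


class InlineMediaFinder(HTMLParser):
    """Collects the tag names of inline media wrappers, in document order."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag not in INLINE_MEDIA_TAGS:
            return
        # A valueless attribute is reported as None, so fall back to "".
        typeof = dict(attrs).get("typeof") or ""
        if any(token.startswith(MEDIA_TYPEOF_PREFIXES) for token in typeof.split()):
            self.found.append(tag)


def find_inline_media(html):
    """Return the wrapper tag name of each inline media item found in html."""
    finder = InlineMediaFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

A client written this way keeps working before and after the switch, since it never depends on which of the two wrapper elements is emitted.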

Parsoid claims to render identically while adding more semantic elements to the markup (i.e. the use of <figure> and <figcaption> instead of generic <div>s). To verify correctness, it has undergone several rounds of visual diff testing, and it is the basis of the Visual Editor, which susses out many rendering differences. Another round of visual diff testing is scheduled in T266149.

Nevertheless, new bugs are still being discovered:


 * T193695: Horizontal alignment of media in Parsoid CSS has too much margin when "thumb" isn't present
 * https://gerrit.wikimedia.org/r/#/c/430629/ - Only apply tright/tleft margins to frame/thumb
 * T269704: Default horizontal alignment should depend on content language, not the UI
 * See the confusion here, https://gerrit.wikimedia.org/r/#/c/196532/18/includes/Linker.php
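
The distinction behind T269704 can be sketched as follows. This is only an illustration, not the code in Linker.php: the function names and the small set of right-to-left language codes are hypothetical, and the rule that the default side is the "end" of the line (right for left-to-right content, left for right-to-left content) is an assumption.

```python
# Illustrative subset of right-to-left language codes; not an exhaustive list.
RTL_LANGUAGE_CODES = frozenset({"ar", "fa", "he", "ur"})


def is_rtl(lang_code):
    """Return True if the language code is in the illustrative RTL set."""
    return lang_code.split("-")[0].lower() in RTL_LANGUAGE_CODES


def default_alignment(content_lang, ui_lang=None):
    """Pick the default float side from the content language only.

    ui_lang is accepted purely to make the point explicit: it is ignored.
    Assumption: the default is the "end" side, i.e. right for LTR content
    and left for RTL content.
    """
    return "left" if is_rtl(content_lang) else "right"
```

With this shape, a reader using a Hebrew interface on an English-language wiki still sees media floated to the right, because only the content language is consulted.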

There also remain some known open questions about the output:


 * T171761: Figcaption overflows image width on unbroken words
 * https://gerrit.wikimedia.org/r/#/c/430102/ - Set break-word on figcaption
 * Maybe this is an indication that we should switch back to styling the figcaption as a table-caption, and always emit it so that the bottom border is present
 * Always adding the figcaption could be useful regardless of switching back to the old CSS
 * T169975: Missing images render as broken img tags, not redlinks -- this is only an issue with Parsoid output, not with the changes to core.
 * Specs/HTML/2.1.0#Missing media
 * T272186: Extension:ImageMap appears to do regexp post-processing of image media HTML, probably needs an update. Are there other similar extensions?
 * Native ImageMap extension in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/585344
 * T268250: Decide on a structure for galleries
 * Galleries are currently a div soup with inline media. The above task is about moving to a list of figures.
 * T270150: The css we're shipping needs stability and performance review
 * T271114: Update site specific css for new media structure

Finally, there is the need to proselytize this change in the community:


 * Come up with a story for how gadgets, user scripts, and other downstream tools will be migrated. See the notes below.
 * T113258: Draft email announcement about proposed change to output from legacy parser for media

Migrating Gadgets and User Scripts
Gadget usage on wikis: https://usage.toolforge.org/

Some notes from Parsing/Notes/Figure_for_Media:


 * Special:GadgetUsage is our friend.
 * Plus, mwgrep on wikis in the javascript namespace.
 * most gadgets don't necessarily inspect HTML -- maybe 30% use the actual HTML; most things might look for ids ...
 * taking an inventory of gadgets
 * no reliable way of knowing how user scripts
 * a few gadgets everyone uses: popup, hotcat, (5 or so) ... ppl post in village pump in < 30 mins if they break
 * some 20 or 30 that a few more ppl use and will take a while to notice
 * last category: used for specialized processes .. about 100
 * page lists a lot of them
 * https://en.wikipedia.org/wiki/Wikipedia:User_scripts
 * commons has quite a lot; wikidata a few
 * hardest part is fixing things on other wikis, where gadgets and scripts have been copied over
 * maintaining documentation about what we fixed could help

The Reading Web Team has provided some additional info on how they've been making changes for the Desktop Improvements project.


 * generally relying on user notices (tech news) when making breaking changes, like this one
 * identifying potentially impacted pages and pinging users that own them
 * pages can be identified using global-search.toolforge.org, with a script to parse out usernames
 * for JS breaking changes, they have client-side error logging
 * this helps with gadget JS errors but doesn't help with CSS errors
 * may need to patch broken scripts ourselves
 * the process for communicating broken gadgets and finding the owners still needs improvement to be effective
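
The "script to parse out usernames" step mentioned in the list above might look something like the following minimal sketch. It is not the Reading Web Team's script. It assumes the search results have already been reduced to plain page titles in the English "User:" namespace; `USER_PAGE_RE` and `extract_owners` are hypothetical names.

```python
import re
from collections import Counter

# Matches titles such as "User:Example/common.js" or "User:Example/foo.css".
# Assumption: results are plain page titles in the English "User:" namespace.
USER_PAGE_RE = re.compile(r"^User:(?P<name>[^/]+)/.+\.(?:js|css)$")


def extract_owners(titles):
    """Count matching JS/CSS subpages per user name.

    Titles that are not user JS/CSS subpages (gadgets, bare user pages)
    are skipped. Underscores in names are normalised to spaces.
    """
    owners = Counter()
    for title in titles:
        match = USER_PAGE_RE.match(title.strip())
        if match:
            owners[match.group("name").replace("_", " ")] += 1
    return owners
```

The per-user counts give a rough ordering for whom to ping first. Gadget pages in the MediaWiki namespace are deliberately left out here, since those have no single owner encoded in the title.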