Parsoid/Parser Unification/Media structure

As we take steps towards converging with, and eventually replacing, the current PHP parser, one hop along the path is unifying the structure of media output. This is proposed in T118517, which implements T51097.

Three patches, all currently awaiting review, make up the bulk of the work:


 * https://gerrit.wikimedia.org/r/#/c/196532/ - Use figure and figcaption HTML5 elements for media
 * https://gerrit.wikimedia.org/r/#/c/370206/ - Use custom figure-inline HTML5 element for inline media
 * https://gerrit.wikimedia.org/r/#/c/410362/ - Move parsoid media styling to content

Parsoid claims to render identically while adding more semantic elements to the markup (i.e. the use of <figure> and <figcaption> instead of generic <div>s). To verify correctness, its output has undergone several rounds of visual diff testing. It is also the basis of the Visual Editor, whose use susses out many rendering differences.
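
As a rough illustration of the structural difference, a thumbnail such as [[File:Foo.jpg|thumb|A caption]] might come out along the following lines in each parser. This is a simplified sketch, not exact output: Foo.jpg is a hypothetical file, and the dimensions, URLs, and attribute sets are abbreviated assumptions (for example, the legacy "magnify" link and Parsoid's data-* attributes are omitted).

```html
<!-- Legacy PHP parser output (simplified sketch; attributes abbreviated) -->
<div class="thumb tright">
  <div class="thumbinner" style="width:222px;">
    <a href="/wiki/File:Foo.jpg" class="image">
      <img src="Foo.jpg" width="220" height="165" class="thumbimage" />
    </a>
    <div class="thumbcaption">A caption</div>
  </div>
</div>

<!-- Parsoid-style output (simplified sketch; attributes abbreviated) -->
<figure class="mw-default-size" typeof="mw:Image/Thumb">
  <a href="./File:Foo.jpg">
    <img resource="./File:Foo.jpg" src="Foo.jpg" width="220" height="165" />
  </a>
  <figcaption>A caption</figcaption>
</figure>
```

The wrapper and caption change from class-bearing div elements to figure and figcaption. Any CSS or gadget code that selects on classes such as thumbinner or thumbcaption would therefore need updating.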

Nevertheless, new bugs are still being discovered:


 * T193695: Horizontal alignment of media in Parsoid CSS has too much margin when "thumb" isn't present
   * https://gerrit.wikimedia.org/r/#/c/430629/ - Only apply tright/tleft margins to frame/thumb

There also remain some known open questions about the output:


 * T171761: Figcaption overflows image width on unbroken words
   * https://gerrit.wikimedia.org/r/#/c/430102/ - Set break-word on figcaption
   * Maybe this is an indication that we should switch back to styling the figcaption as a table-caption, and always emitting it so that the bottom border is present
 * T169975: Missing images render as broken img tags, not redlinks

Finally, there is the need to proselytize this change in the community:


 * Come up with a story for how user gadgets and other downstream tools will be migrated


 * T113258: Draft email announcement about proposed change to output from PHP parser for images