Content Transform Team/Weekly Updates

Week of Dec 6, 2021
Media Output changes in core


 * T287965: Print styles are fixed


 * Inline images & alt text handling: See T297443

Parsoid integration with core


 * Exploring adding setFunctionHook support to Parsoid; fixed a related Parsoid SiteConfig issue along the way

Extension Updates

Linter


 * Patch to display all lints for a single page is in Gerrit

Translate


 * Split deployment into two pieces. With wmf.12, only html->wt support was introduced to add forward compatibility for Parsoid HTML version 2.4.0. Translate support will be rolled out on the next train.
 * wmf.12 train got rolled back

InputBox


 * First proof of concept patch in gerrit; progress now requires discussion about ParsoidExtensionAPI

SyntaxHighlight


 * Exploration of why SyntaxHighlight cares about strip state is written up in Phabricator. Parsoid's behaviour is more reasonable overall, but a temporary workaround might be needed to deal with Scribunto's use of this mechanism.

Performance


 * Tim's last patch was merged and sent to RT testing; regressions were found and need to be investigated and fixed before it can be deployed (in the new year)

Visual Diffing


 * Regressions in wmf.12 in image layout (bottom border) between core & Parsoid. Almost 5% drop in the share of test pages without rendering diffs
 * This shows up readily in visual diff testing, but the impact on wikis is subtle and will mostly go unnoticed by readers or editors.
 * Regression has been fixed in core and merged - will ride the next train.

Maps


 * Maps 2.0 stack has been rolled out to frwiki - no complaints and everything stable

Everything else


 * Filed T297259 for ServiceOps to run some perf benchmarking for us with newer hardware to estimate what hardware changes might be beneficial when Parsoid is used for read views on all wikis
 * C.Scott (with Subbu's input) presented updates from the Parsoid / wikitext parsing world at SWMCon 2021
 * WIP to look at better CI and parsertests support for extensions that are updated to work natively with Parsoid APIs

Week of Nov 29, 2021
Extension Updates

Translate


 * Annotations support rolling out to production in next week's train

Linter


 * Patch to display all lints for a single page is nearing completion

SyntaxHighlight


 * Initial explorations to have it work with Parsoid's Extension API directly

Maps


 * Follow-up patch regarding label cut-off on Tegola is to be deployed

mobile-html Services


 * Issue with graphs came up on Phabricator: T285093

Week of Nov 22, 2021
Parsoid integration with core


 * ContentMetadataCollector interface: Basic patch merged in Parsoid

Performance


 * Tim's autoInserted* flag detection via Remex patch cannot be merged until the new train rolls out to production and updates the Remex version on scandium

Week of Nov 15, 2021
Parsoid integration with core


 * First phase of ContentMetadataCollector should land this week (just a few methods left to audit) - might be underwhelming since most of the 'exciting' methods got punted to phab tickets

Extension Updates

Translate


 * RT testing showed a few issues, most of them corner cases; all the ones we found are either tracked in Phabricator or need to be fixed on the pages themselves

Performance


 * Tim's autoInserted* flag detection via Remex - patch in Gerrit for review. CPU and memory benefits expected with rollout

Other


 * Subbu met with SRE Data Persistence to discuss ParserCache capacity needs for Parsoid Read Views. TL;DR: after recent server upgrades, ParserCache is at ~30% utilization and should be able to support Parsoid's HTML as well, as long as we roll out to wikis in stages.

Week of Nov 7, 2021
Media output changes in core


 * FAQ edited and approved

Extension Updates

Translate


 * Ran RT-testing, examined regressions and filed patches to fix them. Followups needed.
 * Dirty diffs related to newline changes could impact Translate behavior and need investigation.

Performance


 * No new updates. Tim busy working on PHP-VM bug

Maps

Maps v2: T263854


 * Most of the tickets are resolved - the ones not resolved are either low priority or docs related
 * testwiki is now connected to the Tegola-backed Kartotherian source
 * Resolved some event-related issues
 * cronjob to trigger invalidation on OSM syncs
 * Kafka concurrency: when we scaled workers, Kafka didn't allow concurrent consuming
 * envoy + tegola k8s reliability issues


 * Re-introduced batching in Tegola pregeneration scripts
 * Next steps:
   * Test pregeneration with production load
   * Roll out to more wikis

mobile-html Services


 * Mobile Preview problem statement submitted for preview - T295348

Other


 * Filed TOC Incident report
 * Discussed ParserCache implications of ParserOutput work with Amir (database arch)

Week of Nov 1, 2021
Media output changes in core


 * Started working on the FAQ for the rollout; please add questions you want to see there

Extension Updates

Translate


 * Annotations patch merged. Three bugs identified via RT testing. Investigation done; patches coming soon.

Performance


 * Tim is working to get rid of the start/end metas that are added to detect tree builder fixups, and to instead register handlers (via subclassing) with Remex that listen to tree builder events. This has the potential to cut processing time and memory usage if it works out.

Visual Diffing


 * Something seems to have improved arwiki results a bit in the latest run

Maps


 * FYI: WMDE is submitting some patches to Kartographer as part of their tech wishlist
 * Still working on tile pregeneration

mobile-html Services


 * Phab task to track Dark Themes Preview - T295299

Other

Production incidents


 * Regression in ToC output caused a fire drill on Friday
 * Should figure out how to pass __NOCONTENTCONVERT__ and some other properties to ParserOutput
 * Should document proper mechanism for ParserCache updates
 * Maybe zhwiki needs to be group 1 instead of group 2
 * Proper versioning for ParserCache would be helpful. (Also RESTBase.)
 * Sanitizer interactions with tags need follow-up this week (ToC and Translate)
 * Follow up on RT testing interaction with mediawiki-vendor as well