Core Platform Team/Initiative/Unify Parsers-Phase 2/Initiative Description
MediaWiki currently has two wikitext parsers, the legacy parser and Parsoid, which support different use cases. This project aims to arrive at a single parser that supports all use cases.
- Significance and Motivation
Parsoid was developed to support HTML-editing clients and is also used for some, but not all, read-view use cases. Maintaining two parsers is not tenable in the long term: it hamstrings development and upgrades to the parsing codebase, wikitext, and templates, because every change has to be supported in both codebases. More importantly, the parsing pipelines of the two parsers differ, which makes replicating functionality in both more complex.
We would like to consolidate behind Parsoid as the new default parser, given its support for HTML clients, its annotated HTML output, and its more structured internal pipeline. This requires identifying all output and feature incompatibilities between Parsoid and the legacy parser and bridging those gaps. It may also require updating (a) bots, (b) gadgets, (c) extensions, and (d) wikitext. This project aims to minimize all such changes by handling any differences with appropriate tooling and support.
Once Parsoid is deployed as the default and only parser for all wikitext-based use cases, we can embark on much-needed work to enhance wikitext and templates, making them easier to use, more performant, less error-prone, and easier to write tools for.
- Baseline Metrics
- Target Metrics
- Client teams (Web, VE, Flow, CX, Apps)
- Bot, Gadget, and Extension authors (only as pertaining to the Wikimedia cluster initially)
- Editing community
- Core Platform
- Known Dependencies/Blockers