Reading/Web/Projects/A frontend powered by Parsoid
|IMPORTANT: The content of this page is outdated. If you have checked or updated this page and found the content to be suitable, please remove this notice.|
We will do this by:
- Running experiments to assess impact of changes - both to users and to our servers
- Building an HTML prototype to get Parsoid to feature parity with our existing PHP parser for our readers
- Exploring the use of Service Workers to render more on the client and reduce load on our servers
- Eventually rolling these changes out to our mobile beta users, a small audience that will give us more confidence in our changes
- Rolling these changes out to all mobile users for even more confidence and a faster reading experience for everyone
- Adapting this for the desktop site to give a faster reading experience there
This work is captured in Phabricator.
The ability to do DOM transformations cheaply
This also becomes extremely important in Wikipedia Zero. Some networks provide Wikipedia for free, but only without images, to save on bandwidth and serve their users more. We currently do this in a hacky way, which has broken because the code is bespoke. We need to serve these users better, and we don't want to break things and confuse users or operators unnecessarily.
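To illustrate what a cheap DOM transformation could look like, here is a minimal sketch that removes every image from a page at the DOM level rather than with regular expressions. It assumes the page markup is well-formed (XML-serializable), and it uses Python's standard `xml.dom.minidom` purely for illustration; `strip_images` is a hypothetical helper, not part of Parsoid or MobileFrontend.

```python
from xml.dom import minidom


def strip_images(xhtml: str) -> str:
    """Return the markup with every <img> element removed.

    Assumption: the input is a single well-formed element, so it can be
    parsed as XML. Real page HTML may need an HTML5-aware parser instead.
    """
    doc = minidom.parseString(xhtml)
    # Copy the node list first so removals do not disturb the iteration.
    for img in list(doc.getElementsByTagName("img")):
        img.parentNode.removeChild(img)
    return doc.documentElement.toxml()


page = '<div><p>Text <img src="a.png" alt="A"/> more text.</p><img src="b.png"/></div>'
print(strip_images(page))
```

Because the transformation works on the tree, it does not depend on how the image tags happen to be written in the serialized HTML.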
Sections as first class citizens
This is tracked in Phabricator.
The mobile website is currently powered by the MediaWiki default PHP parser. The mobile site differs from the desktop site in that it has a concept of sections on the HTML level. It achieves this via a piece of code called the MobileFormatter that runs after parsing. This code is brittle and uses regular expressions. It should really be done at the parser level. This is best done in a language that understands the concept of the DOM tree. This is one of the reasons we believe we need a frontend powered by Parsoid.
Sections become important when you think about performance. We can delay the rendering of much of our content. Users who do not read beyond the lead section could retrieve pages in the region of two minutes quicker if they were served just the lead section.
Any content before the highest level heading in the content is referred to as a lead section. The highest level heading is used to mark up sections. For instance if a page features h2s (two equal signs in wikitext), these become the highest level headings. Content in between these headings is wrapped in a div to support section collapsing.
Example lead section:
Lead section content.
== big heading ==
More unusual example of lead section:
Lead section content.
=== I am also part of the lead section as I am a smaller heading than "big heading" ===
This is also part of the lead section.
== big heading ==
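The rule described above can be sketched in a few lines. This is only an illustration working on raw wikitext lines, not the MobileFormatter or Parsoid implementation; the names `heading_level` and `split_lead` are hypothetical, and the sketch ignores cases such as headings inside templates or `<nowiki>` blocks.

```python
import re
from typing import List, Tuple

# Matches a wikitext heading line such as "== big heading ==".
HEADING = re.compile(r"^(={1,6})\s*(.+?)\s*\1\s*$")


def heading_level(line: str) -> int:
    """Return the heading level (2 for '== x =='), or 0 if not a heading."""
    match = HEADING.match(line.strip())
    return len(match.group(1)) if match else 0


def split_lead(wikitext: str) -> Tuple[str, List[str]]:
    """Split wikitext into (lead section, list of top-level sections)."""
    lines = wikitext.splitlines()
    levels = [heading_level(line) for line in lines]
    present = [level for level in levels if level > 0]
    if not present:
        # No headings at all: the whole page is the lead section.
        return wikitext, []
    top = min(present)  # fewest equals signs = highest-level heading
    lead: List[str] = []
    sections: List[List[str]] = []
    for line, level in zip(lines, levels):
        if level == top:
            sections.append([line])      # a new top-level section starts
        elif sections:
            sections[-1].append(line)    # content of the current section
        else:
            lead.append(line)            # everything before the first one
    return "\n".join(lead), ["\n".join(section) for section in sections]


example = (
    "Lead section content.\n"
    "=== A smaller heading ===\n"
    "This is also part of the lead section.\n"
    "== big heading ==\n"
    "Section body."
)
lead, sections = split_lead(example)
print(lead)
print(sections)
```

In the example, the level-3 heading and the text after it stay in the lead section because the highest-level heading on the page is the level-2 "big heading".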
Mobile view API
The mobile view API allows sections to be surfaced as first class citizens via the API. Again, it resorts to regular expressions that run after parsing.
More non-MediaWiki based clients
We want to be able to support native apps and Node.js applications built on top of MediaWiki. So far we have worked hard on our APIs to support editing, but the same cannot be said for people building interfaces for reading. Currently our API serves page content that developers must sanitize for their own needs, whether that be section wrapping to support section collapsing or removal of non-mobile friendly/inconsistently designed navigation elements e.g.
When a user clicks edit on any MediaWiki page, VisualEditor must do a roundtrip to the server to ask Parsoid what the content of the page is. Theoretically, if the page content were already built via Parsoid, it could skip this altogether and use the content in the DOM. This would give editors a faster editing experience.
Our apps and web experience need to be more closely aligned. Many of the hacks in place in MobileFrontend and the apps are currently bespoke and would be better done at the same level: Parsoid.