User:YuviPanda/Wikipedia app reboot

From mediawiki.org

This page contains technical notes for the reboot of the Wikipedia app as a purely native application on Android and iOS. It currently concentrates only on what is needed for the reader experience, but with a view towards being extensible enough that adding contributory features should not be that hard. It's also open enough that if / when we need to fall back to screen scraping the mobile web site (should be avoided!), that is possible too.

Content Fetching

The page contents will be fetched from Parsoid, rather than from the MediaWiki API's action=mobileview. There is currently a hack to get the output of Parsoid (via action=visualeditor on the API) - but Gabriel tells me that there will be publicly available API endpoints in the next month or so.

Parsoid produces HTML that is annotated with lots of wonderfully useful metadata about the content. The metadata is fully documented in the DOM API Spec. It lets us do things like detect images, infoboxes, citations, warning templates, etc - in a really structured, stable way. This allows us to be super flexible in how we want to present the content - natively or as part of the content WebView.
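As a rough illustration of using those annotations, the sketch below counts the `mw:`-prefixed values found in `typeof` attributes of a Parsoid HTML string. The function name `countTypeofAnnotations` is hypothetical, and the regular-expression scan is a simplification for illustration only; a real app would use a proper HTML parser and take the exact annotation names from the DOM API Spec.

```typescript
// Hypothetical helper: tally "mw:*" tokens from typeof attributes.
// Assumes double-quoted attribute values; not a full HTML parser.
function countTypeofAnnotations(html: string): Map<string, number> {
  const counts = new Map<string, number>();
  const attr = /\stypeof\s*=\s*"([^"]*)"/g;
  let match: RegExpExecArray | null;
  while ((match = attr.exec(html)) !== null) {
    // A typeof attribute may hold several space-separated tokens.
    const tokens = (match[1] || "").split(/\s+/);
    for (const token of tokens) {
      if (token.startsWith("mw:")) {
        counts.set(token, (counts.get(token) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```

The native side could use such counts (or the matched elements themselves) to decide, for example, whether a page has images worth showing in a native gallery.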

Content Display

This will be a regular WebView control that mostly just displays HTML styled with CSS. JS will be minimal for now - acting simply as glue to bubble events back up to the native code. We will be loading CSS dynamically, via ResourceLoader (RL). We can define RL modules specifically for the app in a new MobileApp extension that does nothing but define RL modules. This lets us change CSS without having to update the app.

  • As an alternative or supplement to an app-specific RL target, we could also consider a "content" target that specifically aims for styles belonging in the content area. Core and extension features could then divide their styles into chrome/skin styles (to display on desktop and mobile UIs) and content styles that are always served, including with API output...?
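The dynamic CSS loading described in this section could be sketched as follows: build a ResourceLoader `load.php` URL that asks only for styles, and wrap the article HTML in a minimal document that links to it. The function names and the module names used in the usage note are assumptions; no such app-specific modules are defined yet.

```typescript
// Build a load.php URL requesting only the styles of the given RL modules.
// Module names are joined with "|" as load.php expects.
function buildStylesUrl(loadPhpUrl: string, modules: string[]): string {
  const query = "modules=" + encodeURIComponent(modules.join("|")) + "&only=styles";
  return loadPhpUrl + "?" + query;
}

// Wrap article HTML in a minimal document that pulls in the stylesheet.
// The URL is escaped for use inside a double-quoted HTML attribute.
function wrapContent(bodyHtml: string, stylesUrl: string): string {
  const href = stylesUrl.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return (
    '<!DOCTYPE html><html><head><meta charset="utf-8">' +
    `<link rel="stylesheet" href="${href}">` +
    `</head><body>${bodyHtml}</body></html>`
  );
}
```

For example, `buildStylesUrl("https://example.org/w/load.php", ["mobile.app.content"])` would produce a URL the WebView can fetch and cache, with `mobile.app.content` standing in for whatever module the MobileApp extension ends up defining.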

Content to display pipeline

Actions that happen when the user requests to see a page.

  1. Hit the Parsoid API to fetch the HTML
  2. Parse the Parsoid HTML to rip out metadata that we want, and also perhaps simplify it to make it easier for the WebView to render.
  3. Load the HTML in the WebView
  4. Request Mobile App Specific Content CSS from RL (heavily cached)
    • Is there a good way for us to get site-, page-, and extension-used-on-this-page-specific CSS? I can envision extensions invoked by parser/parsoid specifying their RL modules or something on output, or sending a full RL CSS-loading URL with the output data, or something?
      • We can specify an RL 'app' target that should make this trivial. This ties into larger concerns of separating content CSS from UI CSS - how do we get just the Content CSS for some extension (math, for example), without the UI?
  5. Profit!
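The steps above can be sketched as a single function whose collaborators are injected, so the same flow can be exercised without a real network or WebView. Every name here (`PipelineDeps`, `showPage`, and the individual methods) is hypothetical and only mirrors the numbered steps.

```typescript
// Hypothetical collaborators, one per pipeline step.
interface PipelineDeps {
  fetchParsoidHtml(title: string): Promise<string>; // step 1
  simplify(html: string): string;                   // step 2
  loadInWebView(html: string): void;                // step 3
  requestContentCss(): Promise<string>;             // step 4
  applyCss(css: string): void;                      // step 4, once CSS arrives
}

// Run the steps in the order listed above.
async function showPage(title: string, deps: PipelineDeps): Promise<void> {
  const rawHtml = await deps.fetchParsoidHtml(title);
  const simplified = deps.simplify(rawHtml);
  deps.loadInWebView(simplified);
  const css = await deps.requestContentCss();
  deps.applyCss(css);
}
```

Since the CSS is heavily cached, a real implementation might start the CSS request in parallel with the HTML fetch; the sketch keeps the listed order for clarity.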

Communicating from WebView to Native Code

We'll need to define a very, VERY simple message passing API that lets the JS in the WebView communicate with the Native Code and vice versa. Crossing this boundary should be minimized, both for performance and code sanity reasons. The WebView APIs on Android and iOS make it easy for the native side to send messages to JS by executing a script fragment; JS can send messages back to native side by navigating to a special fake URL. We want to keep all this much simpler than something like Cordova.

Message

The only things that should be able to cross the barrier are Message objects. They have a type and a payload, and nothing more. payload is essentially a JSON structure, and hence can represent anything JSON can represent. type is used to route messages between various endpoints on both the native and WebView side.
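A minimal sketch of such a Message, assuming it travels across the barrier as a JSON string. The `Json` and `Message` types and the `encodeMessage` / `decodeMessage` helpers are illustrative names, not an agreed API.

```typescript
// Anything JSON can represent.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// A Message has a type and a payload, and nothing more.
interface Message {
  type: string;
  payload: Json;
}

function encodeMessage(message: Message): string {
  return JSON.stringify({ type: message.type, payload: message.payload });
}

// Reject anything that is not exactly { type: string, payload: ... }.
function decodeMessage(raw: string): Message {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Message must be a JSON object");
  }
  const candidate = parsed as { [key: string]: unknown };
  const typeValue = candidate["type"];
  if (Object.keys(candidate).length !== 2 || typeof typeValue !== "string" || !("payload" in candidate)) {
    throw new Error("Message must have exactly a string type and a payload");
  }
  return { type: typeValue, payload: candidate["payload"] as Json };
}
```

Validating on decode keeps the "type and payload, and nothing more" rule enforceable on whichever side receives the message.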

Sending Messages

Sending messages should be trivial. There can be an object somewhere with a sendMessage method that takes type and payload as parameters. It is then routed to the appropriate location on the other side of the barrier based on the type. There are no multicast messages.
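On the JS side, sending could be sketched like this, using the "special fake URL" approach mentioned earlier. The scheme name `x-app-bridge`, the URL layout, and the injected `navigate` callback are all assumptions made for illustration; in a real WebView, `navigate` would trigger a navigation that the native side intercepts.

```typescript
interface Message {
  type: string;
  payload: unknown;
}

// Hypothetical fake URL scheme; the notes do not name one.
const BRIDGE_PREFIX = "x-app-bridge://message/";

function messageToUrl(message: Message): string {
  return BRIDGE_PREFIX + encodeURIComponent(JSON.stringify(message));
}

// What the receiving side would do with an intercepted URL.
// Returns null for ordinary (non-bridge) URLs.
function urlToMessage(url: string): Message | null {
  if (!url.startsWith(BRIDGE_PREFIX)) {
    return null;
  }
  const parsed = JSON.parse(decodeURIComponent(url.slice(BRIDGE_PREFIX.length)));
  return { type: String(parsed.type), payload: parsed.payload };
}

class MessageSender {
  private readonly navigate: (url: string) => void;

  constructor(navigate: (url: string) => void) {
    this.navigate = navigate;
  }

  // Each message goes to exactly one destination; there is no multicast.
  sendMessage(type: string, payload: unknown): void {
    this.navigate(messageToUrl({ type, payload }));
  }
}
```

Injecting `navigate` keeps the sender testable outside a WebView and leaves the choice of navigation mechanism to each platform.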

Receiving Messages

Receiving them is also trivial. There should be a registerReceiver method on an object that code can call with a type and an associated callback. If there is another receiver already registered for the same type, an error is thrown. When a message with the registered type is sent from the other side, it is routed to this particular callback.
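A minimal sketch of the receiving side under these rules. The `MessageRouter` class and its `dispatch` method are hypothetical names; returning `false` for a type with no receiver is an assumption, since the notes do not say what should happen in that case.

```typescript
type Receiver = (payload: unknown) => void;

class MessageRouter {
  private readonly receivers = new Map<string, Receiver>();

  // At most one receiver per type; a second registration is an error.
  registerReceiver(type: string, callback: Receiver): void {
    if (this.receivers.has(type)) {
      throw new Error(`A receiver is already registered for type "${type}"`);
    }
    this.receivers.set(type, callback);
  }

  // Called by the bridge when a message arrives from the other side.
  // Returns false if no receiver is registered for the message's type.
  dispatch(message: { type: string; payload: unknown }): boolean {
    const receiver = this.receivers.get(message.type);
    if (receiver === undefined) {
      return false;
    }
    receiver(message.payload);
    return true;
  }
}
```

Usage would look like `router.registerReceiver("linkClicked", (payload) => { ... })`, with the bridge calling `router.dispatch(message)` for each incoming message.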

Offline storage and display

Since article pages will be loaded and displayed under the app's full control, it should be trivial to save the HTML and any related CSS styles for later offline display. Saving images for offline use may require a little jumping through hoops but is not impossible.

Likewise, detecting that a page navigation has failed and either falling back to offline content or showing an error message should be easier than when screen-scraping.
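The fallback logic could be sketched as below, with an in-memory map standing in for real on-device storage. `OfflineStore`, `PageResult`, and `resolvePage` are hypothetical names, and passing `null` to signal a failed fetch is an assumption made to keep the example small.

```typescript
type PageResult =
  | { kind: "online"; html: string }
  | { kind: "offline"; html: string }
  | { kind: "error"; message: string };

// In-memory stand-in for whatever persistent storage the app uses.
class OfflineStore {
  private readonly pages = new Map<string, string>();

  save(title: string, html: string): void {
    this.pages.set(title, html);
  }

  load(title: string): string | undefined {
    return this.pages.get(title);
  }
}

// fetchedHtml is null when the network request failed.
function resolvePage(title: string, fetchedHtml: string | null, store: OfflineStore): PageResult {
  if (fetchedHtml !== null) {
    return { kind: "online", html: fetchedHtml };
  }
  const saved = store.load(title);
  if (saved !== undefined) {
    return { kind: "offline", html: saved };
  }
  return { kind: "error", message: `"${title}" is not available offline` };
}
```

The caller can then show fresh content, saved content, or a native error view depending on `kind`.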

There has been some talk of having offline-saved pages update automatically; this could be done in the background using sync accounts on Android and background refresh on iOS 7.

Interactive extensions

Extensions that currently render static content viewable in VisualEditor will probably work, but we should investigate this a bit more.

Those that involve interactivity, like adding search forms or expandable JavaScript category lists, might be trickier to deal with. This needs to be checked.

Future-proofing

  • cleaner screen scraping when needed?
    • if we do need to fall back to screen-scraping to support random Special: pages and other funky features that don't have a native implementation, this is much easier to do and get working in an isolated WebView than in the old PhoneGap-style app.
    • dealing with authentication and adjustment of mobile UI chrome are the main open questions here
  • using Parsoid output means we can switch from reading to visual-editor mode more quickly in the future
    • "in theory" we could ship the VE JS code with the app and provide offline editing. This is not a goal for this quarter!