Wikimedia Apps/Team/RESTBase services for apps

The Reading engineering team is developing a Node.js mobile content service backed by RESTBase to provide content in a form tailored to the needs of the mobile platforms. The Wikipedia Android app uses this service to get an article's opening section, table of contents, description, lead image URL, and other article information in a single request. Other endpoints are used to fetch content for link previews and term definitions. A set of endpoints to provide content for app feeds is currently in development.


 * Sample URL: https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Albert%20Einstein
 * API documentation: Mobile section of rest_v1 docs
 * Phabricator project: #mobile_content_service

General ideas & goals
The idea for the services mentioned here is to provide a layer of abstraction on top of various MediaWiki action API and existing RESTBase requests, custom-made for consumption by apps. In other words, they provide a Façade which makes it easy for apps to consume content from Wikipedia. The initial main goal is to improve page load performance.

We want to achieve that through the following approaches:
 * Reduce payload size by removing unneeded content and using Parsoid.
 * Reduce the need for separate requests by aggregating information from multiple requests into fewer requests.
 * Flatten and trim JSON structures, again removing unused data.


 * Take advantage of Parsoid annotations to improve the quality of the transformations done.
 * Move DOM transformations of page content (currently done client-side) to the server.

Service usage
The service endpoints are used by the Android app. Android app users get them by default, except when reading zhwiki or when the corresponding option is disabled in the app settings. In those two cases the app falls back to the regular api.php endpoints, and some newer features that are only implemented for RESTBase users are automatically disabled. In the app developer settings you can check whether RESTBase is enabled and change that if necessary.
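
The fallback rule above can be sketched as a small decision function. This is an illustration only, not the app's code: the function name `should_use_restbase`, its parameters, and the use of the string `"zhwiki"` as a wiki identifier are assumptions.

```python
def should_use_restbase(wiki: str, restbase_enabled: bool) -> bool:
    """Return True if content should be loaded from the RESTBase endpoints.

    Hypothetical helper: it mirrors the rule that zhwiki readers and users
    who disabled the setting fall back to the regular api.php endpoints.
    """
    if wiki == "zhwiki":
        return False
    return restbase_enabled
```

A caller would also switch off the RESTBase-only features whenever this returns False.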

Routes
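All routes share a common prefix. On English Wikipedia in production, for example, they are served under https://en.wikipedia.org/api/rest_v1 (see the sample URL above).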
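
As a minimal sketch of how a client might assemble a route URL, the snippet below joins a base URL, a route name, and a percent-encoded title. The base URL is taken from the sample URL in this document; the helper name `route_url` and its parameters are assumptions made for illustration.

```python
from urllib.parse import quote

# Assumption: production base URL, taken from the sample URL in this document.
BASE_URL = "https://en.wikipedia.org/api/rest_v1"


def route_url(route: str, title: str, base_url: str = BASE_URL) -> str:
    """Build a URL such as .../page/mobile-sections-lead/{title}.

    The title is percent-encoded (including slashes) so that it cannot
    break the path.
    """
    return f"{base_url}/page/{route}/{quote(title, safe='')}"


print(route_url("mobile-sections-lead", "Albert Einstein"))
# https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Albert%20Einstein
```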

.../page/mobile-sections-remaining/{title}
The three routes mobile-sections, mobile-sections-lead, and mobile-sections-remaining are used by the beta Android app when the corresponding developer option is enabled.

The output has a JSON structure similar to that of the corresponding PHP module, except:
 * mobile-sections: has a top-level object with two properties, one for the content of each of the next two endpoints. This endpoint gets the contents of the next two endpoints in a single request, which is useful for refreshing saved pages.
 * mobile-sections-lead: is used for the initial page load.
   * Instead of the object the PHP module uses to provide the lead image URL, it has its own lead image property. This property contains a hashtable that maps common lead image widths (640, 800, 1024) to the URL of the lead image at that width in pixels.
   * If the article has a pronunciation, the response has a string with the fully qualified URL to the pronunciation file.
   * If the article uses one of the templates for recorded audio versions, the response has an array with the fully qualified URLs to the parts of a recorded audio version of the article.
   * If there are geo coordinates associated with the article, the response has the latitude and longitude of the place.
   * The array of sections includes the information needed to display the lead section and also to build the table of contents. Therefore, it has the section text of the lead section only; the rest of the sections don't include it.
 * mobile-sections-remaining: Note that this route's array of sections does not include the lead section text, since this was already retrieved as part of the lead response.

The Swagger spec can be found in the source repo. It is a good source to see the actual structure of the output. The spec must be updated whenever the output structure changes, since automated tests verify that the output adheres to the spec.
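
To illustrate how a client might combine the lead and remaining responses described above, here is a sketch that merges them into one list of complete sections. The property names (`sections`, `id`, `text`, `line`) and the sample data are assumptions for illustration; the Swagger spec in the source repo is the authoritative description of the output.

```python
def merge_sections(lead: dict, remaining: dict) -> list:
    """Combine a lead response and a remaining response into full sections.

    Assumed shapes (hypothetical): lead["sections"] lists every section,
    but only the lead section carries "text"; remaining["sections"]
    carries the text of all other sections.
    """
    text_by_id = {s["id"]: s.get("text") for s in remaining["sections"]}
    merged = []
    for section in lead["sections"]:
        full = dict(section)  # copy so the input is not modified
        if "text" not in full and section["id"] in text_by_id:
            full["text"] = text_by_id[section["id"]]
        merged.append(full)
    return merged


# Made-up sample data, not real service output.
lead = {"sections": [{"id": 0, "text": "<p>Lead</p>"}, {"id": 1, "line": "History"}]}
remaining = {"sections": [{"id": 1, "text": "<p>History text</p>"}]}
merged = merge_sections(lead, remaining)
```

Because the lead response already contains the table-of-contents data, a client can render the first screen from it alone and run a merge like this once the remaining response arrives.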

Examples:

.../page/mobile-text/{title}
This route is meant for a new-generation lite app, initially targeted at low-powered, older Android devices. The idea is to use native Android UI components instead of a WebView to show the page contents.

This route currently just uses action=mobileview, but it is foreseeable that it will use backend calls similar to those of the previous routes.

More at T90758.

Example:

.../page/summary/{title}
This route provides a summary text snippet (generated by the TextExtracts extension), thumbnail info (via PageImages), if a thumbnail is available, and the article language and directionality (RTL or LTR). Used by the apps to generate link previews.

Note: This route is provided directly by RESTBase and is not available in the mobileapps service development branch (i.e., when running the service locally). For now, the legacy /mobile-summary route remains in the repo but is unused.

Example:

.../page/definition/{title}
This route provides a set of definitions pulled from the Wiktionary page for the term. (It does not provide the Wiktionary content in full.)

Currently used in the Wikipedia Beta Android app, where users can view a popup with definitions by highlighting a word in the app and choosing the "define" option from the context menu.

Available for English Wiktionary only; rollout to other languages pending based on user engagement.

Example:  (Wiktionary entry of bar)

.../page/random/{format}
The Mobile Content Service (MCS) provides one of the formats; all other formats are provided by RESTBase. See T132597 (Agree on feed endpoints).

This endpoint tries to provide more interesting pages in its result than a straight random MediaWiki API query. It prefers pages with a lead image, a Wikidata description, and a longer text extract.
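
One way to picture this preference is as a scoring function over a batch of random candidate pages. The sketch below is not the service's actual algorithm; the field names (`thumbnail`, `description`, `extract`) and the weights are invented for illustration.

```python
def interest_score(page: dict) -> float:
    """Score a candidate page; higher means more interesting.

    Hypothetical weights: a lead image and a description each add a fixed
    bonus, and the extract length adds up to 1.0 more.
    """
    score = 0.0
    if page.get("thumbnail"):
        score += 2.0
    if page.get("description"):
        score += 1.0
    extract = page.get("extract") or ""
    score += min(len(extract), 500) / 500
    return score


def pick_most_interesting(candidates: list) -> dict:
    """Return the highest-scoring page from a batch of random pages."""
    return max(candidates, key=interest_score)
```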

Examples:

.../feed/featured/{yyyy}/{mm}/{dd}
This endpoint provides an aggregation of feed related microservices for one specific day. Note that year has to be exactly four digits, and month and day have to be two digits. Pad with 0 if needed. Earliest year supported is 2016. Example: 2016/07/01.
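
The date formatting rules above can be sketched as follows. The helper name `featured_feed_path` is hypothetical, and the function only builds the path portion of the route; the prefix is left to the caller.

```python
from datetime import date


def featured_feed_path(day: date) -> str:
    """Return the path part feed/featured/{yyyy}/{mm}/{dd} for a date."""
    if day.year < 2016:
        raise ValueError("the feed supports 2016 as the earliest year")
    # :04d and :02d pad with zeros, so July 1st becomes 07/01.
    return f"feed/featured/{day.year:04d}/{day.month:02d}/{day.day:02d}"


print(featured_feed_path(date(2016, 7, 1)))  # feed/featured/2016/07/01
```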

Example: Aggregated feed for June 8th, 2016:

The response contains the following items:
 * the featured article (enwiki only at this time)
 * the featured image of the day (from Wikimedia Commons)
 * a more interesting random article
 * a list of the previous day's top read articles
 * current news, irrespective of the day requested. This item is only available for a few wikis right now: en, de, es, he, pt, ru, zh. The latest list is kept with the implementation, in case you want to help us expand it to more languages.

While the other feed microservices are implemented in MCS, they are not exposed via RESTBase at this time. Some example URIs for invoking the microservices locally are in the README.md of the source repo.

Route usage
We have a RESTBase dashboard in Grafana which shows request rates for all individual endpoints. You can choose all the endpoints related to mobile on that graph to see how many client requests actually hit RESTBase. The requests are split into several categories:
 * internal - the request came from the WMF cluster or Labs
 * internal_update - it's an update request from Change-Propagation
 * external - the request came from an external user

For external requests, however, this represents only the cache misses; the vast majority of requests are served by Varnish.

There's also a Grafana dashboard specifically for the mobile-sections requests.

Source
The services are in the following Gerrit repos:
 * mediawiki/services/mobileapps
 * mediawiki/services/mobileapps/deploy

The second repo is for deployment purposes. The first repo contains the implementation of the service routes. Both repos are based on the templates provided by our services team.

Development on local machine
The README.md file in the repo has some great pointers on how to set up and use the service on a dev machine.

MW Vagrant
Enable the service's role in MW Vagrant. To restart just the service without having to restart the whole Vagrant instance, restart the service on its own from inside the box.

Since the Vagrant instance is self-contained, you cannot access other servers. If you have a page called Foo in your Vagrant instance, you can request it from the service after sshing into the box.

Deployment on labs machine
T91794 Deploy experimental version of mobile apps content service

The service on appservice.wmflabs.org is updated and restarted automatically a few minutes after code gets merged. Here are simple examples for some endpoints:
 * http://appservice.wmflabs.org/en.wikipedia.org/v1/page/mobile-sections/Dog
 * http://appservice.wmflabs.org/en.wikipedia.org/v1/page/mobile-sections-lead/Dog
 * http://appservice.wmflabs.org/en.wikipedia.org/v1/page/mobile-sections-remaining/Dog
 * http://appservice.wmflabs.org/en.wiktionary.org/v1/page/definition/lemon

Troubleshooting on labs machine:
 * Restart the service:
 * View logs:
FYI: Beta cluster
This is more an FYI since this is for the RESTBase framework itself. There is a beta instance on deployment-restbase0[12].deployment-prep.eqiad.wmflabs. It uses the labs instance, appservice.wmflabs.org, to complete the mobile route requests.

Some example requests:
 * https://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections/Dog (The app may use it to refresh saved pages)
 * https://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections-lead/Dog (Initial request of page content)
 * https://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-sections-remaining/Dog (Remainder of page content)
 * https://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/summary/electron (not available on beta labs since it is provided by RESTBase, not by Mobile Content Service)
 * https://en.wiktionary.beta.wmflabs.org/api/rest_v1/page/definition/bear_witness (For Wiktionary domains only!)

Deployment on Production cluster
Some example requests:
 * https://en.wikipedia.org/api/rest_v1/page/mobile-sections/Dog (The app may use it to refresh saved pages)
 * https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Dog (Initial request of page content)
 * https://en.wikipedia.org/api/rest_v1/page/mobile-sections-remaining/Dog (Remainder of page content)
 * https://en.wikipedia.org/api/rest_v1/page/summary/electron (provided by RESTBase, not by Mobile Content Service)
 * https://en.wiktionary.org/api/rest_v1/page/definition/lemon (For Wiktionary domains only!)

Setup notes
The service is deployed on Service Cluster B.

Deployment process
A description of the deployment process we follow.

Deployment schedule
Deployment calendar

Deployment log (deprecated)