Page Content Service

The Page Content Service (PCS) is a set of node-based services in Wikimedia production designed to deliver Wikimedia project page content and metadata for modern reading clients. It delivers:


 * 1) Optimized page content for modern clients to provide a highly polished full article reading experience
 * 2) A standard structured representation for pages that can be used for display within lists and previews
 * 3) Additional metadata about a page that can be used for navigation purposes, business logic, or constructing ancillary views in native code (like the table of contents)
 * 4) Aggregated common CSS used for styling articles
 * 5) Business logic as JS that clients can execute locally

Some additional features are:
 * Consolidates client logic for manipulating and styling page content on the server and executes it there, reducing code maintenance and technical debt for clients
 * Consolidates data from disparate services into a single purpose-built service for displaying page content

The PCS delivers content in both HTML and JSON formats. It consolidates data from the Wikipedia, Commons, and Wikidata MediaWiki APIs as well as the Parsoid and ORES APIs.

The service will supersede the mobile-sections endpoint of the Mobile Content Service (MCS). Currently, the PCS code is part of the MCS Git repo. Eventually the two will be separated so that PCS can be deployed independently of MCS.

These services are maintained by the Wikimedia Reading Infrastructure team.

Mobile HTML
Page HTML with an optimized DOM that is purpose-built for delivering "just the content", plus additional style modifications used in modern reading experiences. It also adds content such as the page title, lead image, Wikidata description, etc.

It also removes the references section(s) from the end of articles, which can be retrieved separately using the References API.

This API is designed to be used with the JSON endpoints below to build a modern client experience.
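As a rough illustration of how a client might address the HTML and JSON endpoints for the same page, the sketch below builds page-scoped URLs. The base URL, the `/page/<endpoint>/<title>` path layout, and the helper name `pcs_url` are assumptions made for this example and are not taken from this page; adjust them to the deployment you target.

```python
from urllib.parse import quote

# Assumed base URL for illustration only; not specified on this page.
REST_BASE = "https://en.wikipedia.org/api/rest_v1"


def pcs_url(endpoint: str, title: str, base: str = REST_BASE) -> str:
    """Build a URL of the assumed form <base>/page/<endpoint>/<title>."""
    # MediaWiki titles use underscores in place of spaces.
    normalized = title.strip().replace(" ", "_")
    # Percent-encode everything, including "/", so that a title containing
    # a slash still occupies a single path segment.
    return f"{base}/page/{endpoint}/{quote(normalized, safe='')}"


if __name__ == "__main__":
    # One page, two representations: HTML for reading, JSON for previews.
    print(pcs_url("mobile-html", "Albert Einstein"))
    print(pcs_url("summary", "AC/DC"))
```

Keeping URL construction in one helper means the HTML and JSON requests for a page cannot drift apart in how they normalize and encode the title.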

Examples: Prod | Beta cluster | Labs | Local RB | Local MCS

Summary
The Summary serves two very important purposes:
 * 1) It provides the data necessary for the representation of a page within a page/link preview, search results, other lists, etc.
 * 2) It provides basic metadata necessary for clients to make business logic and navigation decisions before displaying a page.

To accomplish number 1, it contains some basic metadata: an image/thumbnail, a description, the first paragraph of the page in plain text and HTML form (`extract` and `extract_html`), and the article language and directionality (RTL or LTR). It is preferable to use `extract_html` over `extract`, since some complex formulas are better handled with HTML than plain text.

To accomplish number 2, it contains some semantic information on the page, its namespace, and various URLs so that clients can understand the content of the page before deciding how to display it.

Additionally, the Summary structure is provided in other APIs (like the feed) that return lists of pages.
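The preference for `extract_html` over `extract` can be sketched as a small helper that picks display fields for a preview. Only `extract` and `extract_html` are named on this page; the `dir` and `lang` keys, the defaults, and the function name `preview_fields` are assumptions for illustration.

```python
from html import escape


def preview_fields(summary: dict) -> dict:
    """Pick the fields a link preview needs from a Summary-like dict."""
    # Prefer the HTML extract, since it handles formulas better.
    html = summary.get("extract_html")
    if not html:
        # Fall back to the plain-text extract, escaped so that it is
        # safe to place inside an HTML paragraph.
        html = f"<p>{escape(summary.get('extract', ''))}</p>"
    return {
        "html": html,
        # Assumed key names for directionality and language.
        "dir": summary.get("dir", "ltr"),
        "lang": summary.get("lang", ""),
    }


if __name__ == "__main__":
    sample = {"extract": "a < b", "dir": "rtl", "lang": "he"}
    print(preview_fields(sample))
```

Because other APIs that return lists of pages reuse the Summary structure, a helper like this could be applied to each entry of such a list as well.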

Page_Previews/API_Specification

Examples: Prod | Beta cluster | Labs | Local RB | Local MCS

For comparison, here is the request this endpoint replaces: Prod. The current version no longer uses `TextExtract`, though. Instead, PCS gets more of the information from the respective Parsoid HTML output and applies some transformations to it.

Metadata
The Metadata API returns additional metadata needed for updating the chrome around a page, like the edit icon, and for displaying ancillary views like the table of contents and "other languages" that the page is available in.
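As an example of constructing an ancillary view such as the table of contents in client code, the sketch below nests a flat list of section entries by heading level. The entry shape (`title` and `level` keys, with levels starting at 1) and the function name `nest_toc` are hypothetical; the actual response format is not described on this page.

```python
def nest_toc(entries: list) -> list:
    """Turn flat entries with a heading level into a nested tree."""
    # A synthetic root at level 0 sits below every real heading level,
    # so it is never popped as long as real levels are >= 1.
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for entry in entries:
        node = {"title": entry["title"], "level": entry["level"], "children": []}
        # Climb back up until the top of the stack is a shallower heading.
        while stack[-1]["level"] >= node["level"]:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


if __name__ == "__main__":
    flat = [
        {"title": "History", "level": 1},
        {"title": "Early years", "level": 2},
        {"title": "Later years", "level": 2},
        {"title": "References", "level": 1},
    ]
    for top in nest_toc(flat):
        print(top["title"], [child["title"] for child in top["children"]])
```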

Examples: Prod | Beta cluster | Labs | Local RB | Local MCS

Media
Lists media items shown on a page: images, videos, and audio along with licensing information. This is useful for clients wishing to build a gallery interface for content within a page.
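A gallery client would typically bucket the returned items by kind before rendering. The sketch below assumes each item is a dict with a `type` key (for example `image`, `video`, or `audio`); that key, the `title` key in the sample data, and the function name `group_media` are assumptions for illustration, not a documented schema.

```python
from collections import defaultdict


def group_media(items: list) -> dict:
    """Group media items by their (assumed) `type` key, keeping page order."""
    groups = defaultdict(list)
    for item in items:
        # Items without a type land in an "unknown" bucket instead of
        # being dropped silently.
        groups[item.get("type", "unknown")].append(item)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"type": "image", "title": "File:A.jpg"},
        {"type": "audio", "title": "File:B.ogg"},
        {"type": "image", "title": "File:C.png"},
    ]
    for kind, members in group_media(sample).items():
        print(kind, [m["title"] for m in members])
```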

Examples: Prod | Beta cluster | Labs | Local RB | Local MCS

More details

mobile-html-offline-resources
A mobile-html-offline-resources endpoint will probably also be added, listing the files the apps would need to download for offline use. See T217349.

CSS endpoints
Starting task: phab:T188919

Examples:
 * base: Prod | Beta cluster | Labs | Local RB | Local MCS
 * site: Prod | Beta cluster | Labs | Local RB | Local MCS
 * pagelib: Prod | Beta cluster | Labs | Local RB | Local MCS

JS endpoints
Examples:
 * pagelib: Prod | Beta cluster | Labs | Local RB | Local MCS

Clients
The PCS can be used by any WMF or third-party client that wants to display page content in reading contexts.

Within the WMF, the following clients are expected to integrate the PCS in 2019:
 * Wikipedia iOS App
 * Wikipedia Android App