Core Platform Team/Initiatives/Core REST API in MediaWiki

From mediawiki.org

Initiative Description


Summary

Base functionality of MediaWiki should include:

  • Article CRUD (Create, Read, Update, Delete)
  • Article history
  • Article search
  • Format transformation (wikitext, HTML, -> PDF)
  • User settings (Read, Update)
  • User contribution history
  • Sitewide history (recent changes)
  • Media CRUD
  • Media history
  • Article metadata (for example links, language links, inbound links, references)
  • Advanced article curation features (for example protect, undelete, patrol, ...)
  • Advanced user administration features (for example block, ban, manage groups, ...)

One aspect of this project is defining the core API endpoints in an RFC for review.

Significance and Motivation

REST is currently the dominant API paradigm: many developers are used to it, and many libraries support it. Its advantages include a close fit with HTTP methods and with HTTP caching.

Outcomes
  • Core functionality of MediaWiki available through a RESTful API
Baseline Metrics

About 0% of Action API functionality is covered by a RESTful API within MediaWiki.

Target Metrics
  • 80% of the most popular Action API functionality is available through a RESTful API.
  • Sufficient features are available to build a "simple" Wiki reader/editor.
Stakeholders
  • Readers Infrastructure
  • Partnerships
  • Third-party developers
Known Dependencies/Blockers
  • REST Router infrastructure (implemented for Parsoid)
  • OAuth 2.0

Epics, User Stories, and Requirements

Personae

Note that these personae don’t map exactly to user groups or roles within MediaWiki.

  • User - Any person with a registered account in the project
  • Reader - A person reading content on the project
  • Contributor - A person who adds new information to the project
  • Curator - A person who edits and organizes existing information in the project
  • Moderator - A person who manages users, groups, and roles, and monitors bad behaviour

User stories

Epic 0.5: History for iOS Client

The iOS client is going to include some history management UI, and we'd like to support that with the REST API in MediaWiki. These history use cases, previously in Epic 3, have therefore been moved earlier to meet the iOS team's release targets.

ID Title Description Priority Notes
1 Page history As a Curator, I want to get a list of the previous versions of a page, so that I can understand how it developed over time. Must Have Paginated history, with revision summaries. iOS asks for "filtered history". See 9, 10, 11 below.
2 Read a version As a Curator, I want to get an older version of a page, so that I can see which parts of the page were added or removed. Must Have
3 Compare versions As a Curator, I want to see the difference between one version of a page and another version, so I can see when parts of the page were added or removed. Must Have
4 Edit count As a Curator, I want to get a count of all edits of a page, so that I can understand the maturity of the content. Must Have 4, 5, 6, 7, 8 should probably be bundled in a single API call, since they'll be shown on the same screen.
5 Editor count As a Curator, I want to get a count of the unique editors of a page, so that I can understand the diversity of contribution to the page. Must Have
6 Reverted edit count As a Curator, I want to get a count of reverted edits of a page, so that I can understand the amount of vandalism to the page. Must Have
7 Anonymous edit count As a Curator, I want to get a count of edits to a page by unauthenticated contributors, so I can understand the reliability of the page content. Must Have "Reliability" here is probably an unfair characterization of anonymous edits. Other ideas?
8 Bot edit count As a Curator, I want to get a count of edits to a page by bots, so I can understand the level of automated content in the page. Must Have
9 Reverted edit history As a Curator, I want to get a list of the edits to a page that have been reverted, so that I can understand what kind of vandalism has happened to the page. Must Have
10 Anonymous edit history As a Curator, I want to get a list of the anonymous edits to a page, so that I can ... Must Have ?
11 Bot edit history As a Curator, I want to get a list of the bot edits to a page, so I can understand how bots have changed the page over time. Must Have
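
The note on story 4 suggests returning the counts from stories 4 to 8 in a single API call. The following Python sketch shows one possible shape for such a bundled response. The Revision record, its field names, and the keys of the returned dictionary are all assumptions made for illustration; none of them come from MediaWiki itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """Hypothetical revision record; the field names are illustrative only."""
    rev_id: int
    editor: str
    is_anonymous: bool = False
    is_bot: bool = False
    is_reverted: bool = False


def summarize_history(revisions):
    """Bundle the counts from stories 4-8 into one JSON-ready dict."""
    revisions = list(revisions)
    return {
        "edits": len(revisions),                          # story 4
        "editors": len({r.editor for r in revisions}),    # story 5
        "reverted": sum(r.is_reverted for r in revisions),    # story 6
        "anonymous": sum(r.is_anonymous for r in revisions),  # story 7
        "bot": sum(r.is_bot for r in revisions),              # story 8
    }
```

A client showing all five figures on one screen would then need one request instead of five. A real implementation would more likely compute these counts with database queries than by loading every revision into memory.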

Epic 1: Minimal client

At the end of this epic, the API should be sufficiently functional to support the needs of a minimal Web or mobile wiki client.

ID Title Description Priority Notes
13 Read a page offline As a Reader, I want to get a page and its contents, so that I can read it whenever I want. Must have This is the most basic use case, retrieving the metadata about the page and the page text in HTML form in a single request. Replaced use case 1.
11 Read page online As a Reader, I want to get a page online, so that I can read it with my browser or HTML widget and it will load fast. Must have Downloading a large document encoded in JSON, then loading the HTML from the JSON into a browser or native HTML widget, is much less efficient than letting the browser or widget download the HTML itself. So, if the user is "online", we want to have two endpoints: one for the JSON representation of the page without HTML, and one for HTML only.
2 Create a page As a Contributor, I want to create a new page, so that I can add information to the project. Must have
12 Get page source As a Contributor, I want to get the source code for a page, so that I can edit it locally. Must have For a contributor, it's important to get the wikitext source for a page, which can be edited and then updated (user story 3). Because this representation is so similar to the representation for the Create and Update stories, it makes sense to keep this at the GET /page/{title} endpoint.
3 Update a page As a Contributor, I want to update a page, so that I can include more information or restructure the content. Must have
6 Page search As a Reader, I want to get a list of pages that match a search term, so that I can find pages about a topic I’m interested in. Must have
7 Media links As a Reader, I want to get a list of media files embedded in a page, so that I can view, read or listen to them. Must have
8 Language links As a Reader, I want to get a list of alternate language versions of a page, so that I can switch to another language version. Must have
9 Read a file As a Reader, I want to get the current version of a media file, so I can read, view or listen to it. Must have
10 OAuth 1.0 As a User, I want to provide OAuth 1.0 credentials, so that I get credit for my work. Must have
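
The notes on stories 11, 12 and 13 describe several representations of the same page: wikitext source at GET /page/{title}, a JSON representation without HTML, an HTML-only response, and a combined response for offline reading. The Python sketch below shows how such a route table could be matched. Only /page/{title} is taken from the notes above; the /bare, /html and /with_html suffixes, the handler names, and the use of PUT and POST for update and create are assumptions made for this example.

```python
import re
from urllib.parse import unquote

# Hypothetical route table. Only "/page/{title}" comes from the notes above;
# the suffixes and handler names are assumptions made for this sketch.
ROUTES = [
    ("GET", re.compile(r"/page/(?P<title>[^/]+)"), "get_page_source"),
    ("GET", re.compile(r"/page/(?P<title>[^/]+)/bare"), "get_page_metadata"),
    ("GET", re.compile(r"/page/(?P<title>[^/]+)/html"), "get_page_html"),
    ("GET", re.compile(r"/page/(?P<title>[^/]+)/with_html"), "get_page_offline"),
    ("PUT", re.compile(r"/page/(?P<title>[^/]+)"), "update_page"),
    ("POST", re.compile(r"/page"), "create_page"),
]


def match_route(method, path):
    """Return (handler_name, params) for a request, or None if nothing matches."""
    for route_method, pattern, handler in ROUTES:
        if route_method != method:
            continue
        # fullmatch ensures the whole path matches, not just a prefix.
        match = pattern.fullmatch(path)
        if match:
            # Titles are percent-decoded, so "Talk%3AEarth" becomes "Talk:Earth".
            params = {k: unquote(v) for k, v in match.groupdict().items()}
            return handler, params
    return None
```

Because the title is a single path segment in this sketch, a title containing a slash would have to be sent percent-encoded (%2F).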

Epic 1.5: Search enhancement

These are enhancements to the search endpoint requested by the Desktop Web team to support a JavaScript-based search page.

ID Title Description Priority Notes
1 Thumbnail image in search results As a Reader, I want to see a thumbnail image of each page in a search result set, so I can visually identify the topic of the page. Must have Prototypes for new desktop search include thumbnails.
2 Page description in search results As a Reader, I want to read a description of each page in a search result set, so I can quickly evaluate if the page is relevant to my search topic. Must have Prototypes for new desktop search include description.
3 Briefly cacheable search results As a Reader, I want search results to be cacheable on a very short time span, so if I make a typo and correct it my previous search results are retrieved much faster. Must have This is for typeahead search typos. So if I type "Washingt" I get search results for that word, and if I type "p" next, I'll get very few results for "Washingtp", and typing backspace will initiate a search for "Washingt", which will be cached. Apparently it is a big hassle for end users when the client does too many searches. The cache window should be somewhere between the search index window and the time to identify and correct a typo (<60s, maybe much less).
4 Search results list size As a Client Developer, I want to specify the number of pages to return in the search results, so that I have just the right number of results for my user interface. Must have Passing along a segment size parameter. I don't think we need multiple segments per search; it's also hard to manage that with relevance-based searches.
5 Detect latinized scripts As a Reader, I want to search using a Latinized transliteration of my native script, so that I don't have to swap my device's character set to search for pages. Optional On Hebrew and Russian wikis, the DWIM gadget will detect if a Latin script string has results under a given threshold, and will do a second search with transliterated characters if so. Desktop Web team wants this enabled at the API level instead of in the client, to save an HTTP hit.
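
Story 3 asks for search results that stay cached just long enough to cover a typo and its correction. The Python sketch below illustrates the idea with a small in-memory cache. The class name, the 30-second default, and the injectable clock are assumptions made for this example; in a deployed API the same effect would more likely come from HTTP cache headers than from application code.

```python
import time


class ShortLivedCache:
    """Minimal time-limited cache for search results (illustrative only)."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        # 30 seconds is an assumed value inside the "<60s" window from story 3.
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._entries = {}   # query -> (time stored, results)

    def get_or_compute(self, query, compute):
        """Return cached results for query, or call compute() and store them."""
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        results = compute()
        # Stale entries are overwritten here but never evicted otherwise;
        # a real cache would also bound its size.
        self._entries[query] = (now, results)
        return results
```

In the typeahead scenario from the notes, searching "Washingt", then "Washingtp", then "Washingt" again within the window would run the search backend twice rather than three times.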

Epic 2: Media management

At the end of this epic, the API should be sufficient to handle basic media management tasks.

ID Title Description Priority Notes
2 Create a file As a Contributor, I want to upload a new file, so I can contribute to the project. Must have
3 Update a file As a Contributor, I want to upload a new version of an existing media file, so I can improve it. Must have
6 File search As a Contributor, I want to get a list of media files that match a search term, so that I can find media to add to a page. Must have Maybe prefix-search for type-ahead, or full-text

Epic 3: History

This epic will give us the tools necessary to handle historical versions of pages and files in the wiki.

ID Title Description Priority Notes
3 Read an edit As a Curator, I want to see the difference between one version of a page and the previous version, so I can see when parts of the page were added or removed. Must Have
4 Delete a version As a Curator, I want to delete an older version of a page, so that inappropriate content isn’t available to readers. Optional
5 Rollback to a version As a Curator, I want to roll back to a previous version of a page, so that the best known version of a page is always the current one. Must Have
6 User contributions As a Moderator, I want to get a list of versions of pages a user has created or updated, to judge the user’s intentions and ability. Must Have
7 Recent changes As a Curator, I want to get a list of all changes to pages in the project, so I can review the current versions of pages in the project. Must Have Dupe of 8
8 Recent changes As a Moderator, I want to get a list of all changes to pages in the project, so I can detect if there has been any bad behaviour lately. Must Have Dupe of 7
9 File history As a Curator, I want to get a list of the previous versions of a media file, so that I can understand how it developed over time. Must Have
10 Read a file version As a Curator, I want to get an older version of a media file, so that I can see which parts of the file were added or removed. Must Have
11 Delete a file version As a Curator, I want to delete an older version of a media file, so that inappropriate content isn’t available to readers. Optional
12 Roll back a file As a Curator, I want to roll back to a previous version of a media file, so that the best known version of a file is always the current one. Must Have

Epic 4: Content management

At the end of this epic, we should have enough functionality to support the main efforts of curators.

ID Title Description Priority Notes
6 Delete a page As a Curator, I want to delete a page, to keep the project focused. Optional
7 Rename a page As a Curator, I want to rename a page, to resolve name conflicts or to make the page easier to find. Optional
1 What links here As a Curator, I want to get a list of pages that link to a page, so I can see how they refer to the page, or change their links if I am going to delete the page. Must Have
2 Protect a page As a Moderator, I want to protect a page, to keep untrusted users from modifying the page. Must Have
3 Protect a file As a Moderator, I want to protect a media file, to keep untrusted users from modifying the file. Must Have
4 Undelete a page As a Curator, I want to undelete a page, if the page was deleted by mistake. Must Have
5 Patrol a page As a Moderator, I want to mark a version of a page as patrolled, to let other Moderators know that they don’t have to review that version of the page. Must Have
8 Rename a file As a Curator, I want to rename a media file, so it is easier to find or so the name is more descriptive. Optional
9 Delete a file As a Curator, I want to delete a media file, so that the project stays focused. Optional

Epic 5: User management

At the end of this epic, we should have the basic functionality that admins use for managing contributors to a wiki project.

ID Title Description Priority Notes
1 Get user groups As a Moderator, I want to get a list of groups that a user is in, to understand their current level of access. Must Have
2 Add user to group As a Moderator, I want to add a user to a group, to give them extra access. Must Have
3 Remove user from group As a Moderator, I want to remove a user from a group, to revoke their access. Must Have
4 Block user As a Moderator, I want to block a user from making any changes to the project, so that they don’t cause any damage to the content or community. Must Have
5 Block IP As a Moderator, I want to block an IP address or IP subnet from making any changes to the project, so that they don’t cause any damage to the content or community. Must Have
9 Get user settings As a User, I want to get my user settings, so I can see how my account is configured. Optional Could be a single setting or a bundle of all settings
10 Save user settings As a User, I want to update my settings, so that I can change my experience. Optional Could be a single setting or a bundle of all settings

Non-functional requirements

  • Secure HTTP-based
  • Positional arguments (e.g. /article/12331/history)
  • OAuth 2.0 authentication with API keys
  • Cache friendly (Last-Modified, Expires, ETag, If-Modified-Since, If-None-Match)
  • Rate-limiting
    • Global unauthenticated
    • Per API key, unauthenticated
    • Per API key/authenticated user
  • Explicit versioning
  • Cross-wiki API calls


Documentation Links

Phabricator

https://phabricator.wikimedia.org/T229661

Plans/RFCs

None given
