User:DKinzler (WMF)/API Life Cycle

From mediawiki.org
This is a personal brain dump and not aligned with anyone

This document provides guidelines for the URL structure of MediaWiki and Wikimedia REST APIs, with a special focus on compatibility and life cycle management.

URL Structure

Wikimedia REST endpoints can be either per-domain or central. Central endpoints are accessed on api.wikimedia.org. All central endpoints are Wikimedia specific. Per-domain endpoints may be Wikimedia specific or portable (that is, defined by a component that can easily be installed and run by a third party).

All public REST endpoints should follow one of two URL patterns:

  • https://{gateway}.wikimedia.org/{component}/{version}/... (central)
  • https://{domain}/{gateway}/{component}/{version}/... (per domain)

[TBD: should we distinguish URLs for portable from non-portable APIs somehow?]

The {gateway} component identifies the gateway to be used to access the API. Most APIs would be exposed through all gateways, but some may only be available through specific gateways. The generic gateway name is "api", as in api.wikimedia.org and https://en.wikipedia.org/api/. There may be other gateways, e.g. an "enterprise-api" gateway reserved for high volume access requiring special API keys. Note that gateway names must end in "api", and component names must not end in "api", to avoid confusion and conflicts on the api.wikimedia.org domain.
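The two URL patterns and the naming rule for gateways and components can be sketched as follows. This is only an illustration under stated assumptions: the function names are made up for this example, and the components users and feeds are borrowed from examples used elsewhere in this document.

```python
def check_names(gateway: str, component: str) -> None:
    """Enforce the naming rule: gateway names end in "api", component names do not."""
    if not gateway.endswith("api"):
        raise ValueError(f"gateway name must end in 'api': {gateway!r}")
    if component.endswith("api"):
        raise ValueError(f"component name must not end in 'api': {component!r}")


def central_url(gateway: str, component: str, version: str, path: str = "") -> str:
    """Build a central URL: https://{gateway}.wikimedia.org/{component}/{version}/..."""
    check_names(gateway, component)
    return f"https://{gateway}.wikimedia.org/{component}/{version}/{path}"


def per_domain_url(domain: str, gateway: str, component: str,
                   version: str, path: str = "") -> str:
    """Build a per-domain URL: https://{domain}/{gateway}/{component}/{version}/..."""
    check_names(gateway, component)
    return f"https://{domain}/{gateway}/{component}/{version}/{path}"


print(central_url("api", "users", "v3", "stats"))
print(per_domain_url("en.wikipedia.org", "api", "feeds", "v2", "latest"))
```

Checking the names in one place means a component called, say, "search-api" is rejected before it can collide with a gateway name.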

The {component} part of the endpoint URL ideally identifies the domain of the API. However, it is more important that it reflects organizational reality: all endpoints under a given component should be owned by a single team, just like all code in a software component should be owned by a single team. If two teams are running endpoints that are conceptually similar, they should still use different {component} prefixes. This may even mean using the team name as the component prefix.

The reason for insisting that component prefixes align with team boundaries is that components are units of versioning, and teams must have autonomy over the life cycle of endpoints they run.

MediaWiki extensions and stand-alone backend services should generally have a separate {component} prefix. They may share a prefix with another service or extension only if both are maintained by the same team.

If we end up having many ugly sounding, non-obvious component names, this indicates that something is wrong with the team structure. In that case, both the teams and the APIs should be re-structured, using the mechanisms described in the sections on Versioning and Deprecation.

Note that central endpoints should not include a {domain} parameter in the path. If the resource accessed by an API is specific to a site, it should be accessed using the domain of that site, not using an endpoint on api.wikimedia.org. This ensures that we consistently apply per-wiki user group memberships and rate limits.

Self-Documentation

Each API should expose an OpenAPI specification at its base URL:

  • https://api.wikimedia.org/{component}/{version} (central)
  • https://{domain}/api/{component}/{version} (per domain)

TBD: what else can we say about this?

Versioning

The {version} part of the URL of a public stable API starts with a "v" followed by a single number indicating the version of the API, e.g. v2. This allows APIs to be restructured while avoiding breaking changes and confusion.

Other prefixes such as rc or x or r may be used for unstable endpoints and release preparation. All endpoints using the v prefix are subject to the deprecation policy. Endpoints using a prefix other than v are encouraged to use the same versioning mechanism, but do not have to.

The endpoints of a public stable API may evolve in a backwards compatible way without changing the version number. This includes adding endpoints. Such backwards compatible changes do not need to be tracked using a minor version number as is best practice for libraries, because APIs differ from libraries in that the code using them cannot require a specific version. The client has no power over which API version is used; it has to handle whatever is offered by the site.

However, it is still useful to track changes in the REST endpoints over software releases. For this purpose, the API version presented in the OpenAPI specification (in the info.version field) should include a minor version that is incremented if the spec changed since the last release of the software. For endpoints exposed by MediaWiki, and by MediaWiki extensions that use snapshot releases synchronized with MediaWiki releases, the minor version of the API should be derived from the MediaWiki version. For instance, an API at users/v3 released with MediaWiki 1.40.1 would have version 3.40.1 indicated in its OpenAPI spec, and the feeds/v2 API on a WMF site running the MediaWiki 1.41.0-wmf.26 branch would be version 2.41.0-wmf.26.
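One way to read the two examples above is that the leading major number of the MediaWiki version is replaced by the major version of the API. A minimal sketch of that derivation, assuming this reading is the intended one (the function name is made up for illustration):

```python
def spec_version(api_major: int, mediawiki_version: str) -> str:
    """Derive the info.version string from the API major version and the MediaWiki version.

    The leading component of the MediaWiki version (the "1" in "1.40.1") is dropped
    and replaced by the major version of the API.
    """
    _mw_major, sep, rest = mediawiki_version.partition(".")
    if not sep or not rest:
        raise ValueError(f"unexpected MediaWiki version: {mediawiki_version!r}")
    return f"{api_major}.{rest}"


print(spec_version(3, "1.40.1"))         # users/v3 on MediaWiki 1.40.1
print(spec_version(2, "1.41.0-wmf.26"))  # feeds/v2 on the 1.41.0-wmf.26 branch
```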

Versioning the API specification along with software releases provides useful information to authors of client libraries such as pywikibot that are designed to work with a variety of wiki sites running different versions of MediaWiki. It allows them to determine which endpoints they can use to be compatible with the range of MediaWiki versions that they want to target.

Deprecation

After releasing a new major version of an API, we may want to deprecate the old version, to reduce the maintenance and administration burden. There are several aspects to consider:

Clients should be notified of the deprecation, ideally in advance. This should be done using an appropriate HTTP header. [TBD: How exactly is still not quite clear. Options include]:

Deprecated endpoints should remain functional for some time after deprecation. Ideally, the old endpoint should issue a redirect (with status code 308) to the new equivalent endpoint. This is particularly useful when the endpoint was merely migrated to another component (more on that below). The redirect response must contain a header indicating deprecation, and should contain additional information about the deprecation in the response body.
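The shape of such a redirect response could look roughly like the sketch below. Since this document leaves the exact deprecation header open, the header name "Deprecation" and its value are placeholders, and the Response class is a stand-in for whatever the serving framework provides.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """Stand-in for a framework response object (hypothetical, for illustration only)."""
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


# Placeholder: which header to use for signalling deprecation is still TBD.
DEPRECATION_HEADER = "Deprecation"


def deprecation_redirect(new_url: str, note: str) -> Response:
    """Build a 308 redirect that points at the replacement and flags the deprecation."""
    return Response(
        status=308,
        headers={"Location": new_url, DEPRECATION_HEADER: "true"},
        body=note,
    )


resp = deprecation_redirect(
    "https://en.wikipedia.org/api/users/v3/stats",
    "This endpoint is deprecated; use /api/users/v3/stats instead.",
)
print(resp.status, resp.headers["Location"])
```

Status 308 is used rather than 301 because it requires the client to repeat the request with the same method and body, which matters for non-GET endpoints.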

If it is not possible to construct a redirect that will produce semantically equivalent behavior, the old handler code should be kept in place. If code is shared with the new handler, care must be taken to ensure that changes to that shared code do not impact the behavior of the old endpoint. The best way to achieve this is to have a comprehensive test suite in place that ensures that the deprecated endpoint is still behaving to spec.

A special case of deprecation is the migration of an endpoint. In that case, the deprecation is not due to the release of a new major version, but to the relocation of the endpoint to a new URL, without a change in behavior. This would be the case when re-drawing component boundaries (e.g. when teams are being restructured).

Removal

Once an endpoint has been deprecated for at least 6 months in production (and, for portable endpoints, one major release of MediaWiki), it should be removed as soon as possible. Once the endpoint has been removed, attempts to access it should result in a 404 response. The response body should contain useful information about which endpoint should be used instead.
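The removal condition can be written down as a small check. This is a sketch of one possible reading: "6 months" is taken as calendar months, and the MediaWiki release condition for portable endpoints is modeled as a count of major releases shipped since the deprecation. Both the function names and that modeling are assumptions.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def may_remove(deprecated_on: date, today: date, portable: bool,
               releases_since_deprecation: int = 0) -> bool:
    """True once the endpoint has been deprecated long enough to be removed."""
    if today < add_months(deprecated_on, 6):
        return False
    # Portable endpoints must also have shipped as deprecated in a MediaWiki release.
    if portable and releases_since_deprecation < 1:
        return False
    return True


print(may_remove(date(2023, 1, 15), date(2023, 7, 15), portable=False))
```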

However, a best effort should be made to identify and contact remaining users of the API. In order to do this, metrics need to be collected on how much the endpoint is still being used, and we should look at the User-Agent headers sent by the remaining clients to get information on who and what is still using the endpoint most, and how we might contact them.
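A minimal sketch of such a tally, assuming request logs are available as records with a "path" and a "user_agent" key (this record format is hypothetical and only serves the example):

```python
from collections import Counter
from typing import Iterable, Mapping


def top_user_agents(requests: Iterable[Mapping[str, str]],
                    deprecated_prefix: str, limit: int = 10):
    """Count requests to a deprecated path prefix, grouped by User-Agent."""
    counts = Counter(
        record.get("user_agent", "(none)")
        for record in requests
        if record.get("path", "").startswith(deprecated_prefix)
    )
    return counts.most_common(limit)


log = [
    {"path": "/api/users/v2/stats", "user_agent": "bot-a"},
    {"path": "/api/users/v2/stats", "user_agent": "bot-a"},
    {"path": "/api/users/v2/info", "user_agent": "bot-b"},
    {"path": "/api/users/v3/stats", "user_agent": "bot-c"},
]
print(top_user_agents(log, "/api/users/v2/"))
```

Clients that send no User-Agent at all are grouped under "(none)", which is itself useful to know, since those clients cannot be contacted this way.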

One approach to draw attention to the imminent removal of an endpoint is the "flickering lights" approach: If the endpoint starts returning 404 for one minute at the start of every hour, users will notice and investigate. This will hopefully lead to them discovering information about the deprecation and replacement of the endpoint.
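The "flickering lights" schedule is simple enough to sketch directly. The function names are made up, and where such a check would live (gateway, MediaWiki, or service) is left open here.

```python
from datetime import datetime


def in_flicker_window(now: datetime) -> bool:
    """True during the first minute of every hour."""
    return now.minute == 0


def status_for(now: datetime, normal_status: int = 200) -> int:
    """Return 404 during the flicker window, the normal status otherwise."""
    return 404 if in_flicker_window(now) else normal_status


print(status_for(datetime(2023, 3, 1, 14, 0, 30)))
print(status_for(datetime(2023, 3, 1, 14, 1, 0)))
```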

Communication

wikitech-l....

Unstable Endpoints

There are several reasons for which we may need public endpoints that are not considered stable to use.

Note: the terms "private" and "internal" should be avoided in this context, since they are ambiguous: both may be used to refer to endpoints that are not publicly accessible, or to endpoints that are publicly accessible but are not stable and should not be used by third parties.

For the purpose of this document, API endpoints that are not accessible from outside our network are considered local endpoints. Endpoints that are public but not stable for third party use are considered reserved.

The following cases of non-stable public endpoints need to be considered:

  • Reserved endpoints are used by client-side code controlled by Wikimedia. Such endpoints can be versioned if there is a need to retain compatibility with older versions of the client-side code, e.g. if the endpoint is used by a mobile app. Versioning of reserved endpoints should follow the same pattern as the versioning of public endpoints, except that the version number is prefixed by r rather than v. So a path for a reserved endpoint may look something like this: /mobile/r2/summary/{title}
  • We may want to offer experimental endpoints for third parties to try out and provide feedback on. Experimental endpoints would use the x prefix to the version number. The version number itself should be the same as the corresponding version of the public stable API. So an experimental API for a component currently at v2 would be x2, or it could be x3 if the experiment relates to a future v3 of the API that is currently under development.
  • Another reason to publish an unstable API is to provide pre-release access for beta testing. A release candidate for a new major version of the API can use the rc prefix, so for a component currently at v2 it would be rc3.
  • We also need to define paths for endpoints currently under development. They should not be exposed in a production environment, but would be defined locally and publicly exposed on the beta cluster and perhaps even on test.wikipedia.org. Such endpoints should use the dev prefix. If we are looking to add a stats endpoint to version 3 of the users API, it would be defined as /users/dev3/stats.
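The prefixes listed above, together with the v prefix for stable APIs, can be recognized with a small parser. The following is a sketch only: the names are made up, and it assumes that every prefix is followed by a single whole number, as described for v.

```python
import re
from typing import NamedTuple

# Meaning of each version prefix, as described in this document.
STABILITY = {
    "v": "stable",
    "r": "reserved",
    "x": "experimental",
    "rc": "release candidate",
    "dev": "development",
}

_SEGMENT = re.compile(r"(v|rc|r|x|dev)(\d+)")


class Version(NamedTuple):
    prefix: str
    major: int
    stability: str


def parse_version(segment: str) -> Version:
    """Split a {version} path segment such as "v2" or "rc3" into prefix and number."""
    match = _SEGMENT.fullmatch(segment)
    if match is None:
        raise ValueError(f"not a valid version segment: {segment!r}")
    prefix, number = match.groups()
    return Version(prefix, int(number), STABILITY[prefix])


def under_deprecation_policy(segment: str) -> bool:
    """Only endpoints using the v prefix are subject to the deprecation policy."""
    return parse_version(segment).prefix == "v"


for seg in ("v2", "r2", "x3", "rc3", "dev3"):
    print(seg, parse_version(seg).stability)
```

A segment with a minor version, such as "v2.1", is rejected, in line with the rule that the URL carries a single number only.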

Compatibility

adding parameters

adding response body fields

removing response body fields

removing parameter support

changing allowed values

adding required request body fields


bug fixes are allowed!

Migration

As of the writing of this document in early 2023, Wikimedia sites are exposing public REST endpoints in three ways:

  • as https://{domain}/w/rest.php/... for endpoints defined by MediaWiki core or extensions
  • as https://{domain}/api/rest_v1/... for endpoints served from RESTbase
  • as https://api.wikimedia.org/... for endpoints using the new API gateway

Most of these endpoints should be migrated to using the URL structure defined in this document.

Migrating these endpoints should follow the process described for deprecation in general. In particular, the old URLs should respond with redirects (status 308) to the new location of the endpoints.


Aka relocation

Routing Architecture

Routing in LB

  • handling of /rest.php
  • handling of /api/
  • redirects?
  • third party wikis?

Routing in Gateway

  • routing of per-domain apis
  • routing to rest.php?
  • routing based on major version
  • different routing for individual endpoints?
  • migration redirects?

MW

  • version routing + checking
  • migration redirects

node

  • version routing + checking
  • migration redirects