User:DKinzler (WMF)/Client Software Guidelines

This document provides guidance for client software that is accessing Wikimedia REST APIs.

General Guidance
Norms for maintainers See also:
 * How to find APIs
 * Where to find documentation and specs
 * common data types
 * error formats
 * paging
 * Who is the target audience
 * What happens if I'm not following the guidelines.
 * MUST subscribe to mailing list (api-announce?)
 * MAY make code available for code search


 * REST and HATEOAS.

MUST follow HTTP standards
Clients that interact with our APIs must follow the relevant HTTP standards, most importantly RFC 9110. For the most part, this can be achieved by using a good HTTP client library.

More resources:


 * https://developer.mozilla.org/en-US/docs/Web/HTTP/Resources_and_specifications

SHOULD be designed to be robust against changes and failures
Clients should follow the Robustness Principle: "be conservative in what you do, be liberal in what you accept from others". In practice, this means that failures of the network and of the server should be handled gracefully, and assumptions about the behavior of the server should be kept to a minimum.
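As an illustration of keeping assumptions about the server to a minimum, the following sketch reads only the fields the client needs from a decoded JSON response, ignores fields it does not know, and falls back to a default when an optional field is missing. The function name read_summary and the field names "title" and "extract" are hypothetical and not taken from any particular Wikimedia API.

```python
from typing import Any, Dict, Mapping


def read_summary(payload: Mapping[str, Any]) -> Dict[str, str]:
    """Extract the fields this client needs from a decoded JSON object.

    The field names "title" and "extract" are hypothetical examples.
    """
    title = payload.get("title")
    if not isinstance(title, str):
        # A required field is missing or has an unexpected type:
        # fail with a clear message instead of crashing later.
        raise ValueError("response is missing a usable 'title' field")

    # Optional field: tolerate absence or an unexpected type.
    extract = payload.get("extract")
    if not isinstance(extract, str):
        extract = ""

    # Unknown fields in the payload are ignored, so the client keeps
    # working when the server adds new properties.
    return {"title": title, "extract": extract}
```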

See also:

SHOULD minimize the number of requests
Client code should be designed to minimize the number of requests it sends to the server. The simplest way to achieve this is to avoid requesting data that is not actually needed. Beyond that, some APIs may support features that allow the number of requests to be reduced, such as:


 * Batch requests [TBD: reference the corresponding API design guide].
 * Property expansion [TBD: reference the corresponding API design guide].

Another way to reduce the number of requests is to avoid unnecessary redirects by ensuring that the request URL is normalized as much as possible. In particular:


 * Do not include trailing slashes or double slashes in the URL
 * Use the canonical form of resource identifiers in the URL
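The first of these points can be sketched with the Python standard library as follows. This is a minimal example, not a complete normalizer: the function name normalize_url and the example host are hypothetical, and the canonical form of resource identifiers is specific to each API and is not handled here.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Collapse repeated slashes and strip a trailing slash from the path."""
    parts = urlsplit(url)
    # Only the path is touched; the "//" after the scheme is not part of it.
    path = re.sub(r"/{2,}", "/", parts.path)
    # Remove trailing slashes, but keep the root path "/" intact.
    if len(path) > 1:
        path = path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Hypothetical URL, for illustration only.
print(normalize_url("https://api.example.org/v1//page/Earth/"))
```

Percent-encoded slashes (%2F) inside an identifier are left unchanged, because only literal slashes in the path are rewritten.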

NOTE: Please take care that you don't end up requesting a lot of unnecessary data in order to avoid requests, for instance by requesting all properties to be always expanded, even if their value is not actually needed. See.

SHOULD minimize the amount of data transferred

 * filter unneeded entities
 * exclude fields if not needed
 * avoid expansion if not needed

SHOULD support compression
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding

https://developer.mozilla.org/en-US/docs/Web/HTTP/Compression#end-to-end_compression
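A minimal sketch of compression support using only the Python standard library: the client advertises gzip in the Accept-Encoding request header and decodes the body according to the Content-Encoding response header. The function names decode_body and fetch are hypothetical, and only gzip is handled here. Many HTTP client libraries do this automatically, in which case no extra code is needed.

```python
import gzip
import urllib.request
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Decode a response body according to its Content-Encoding header."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "identity":
        return body
    # Encodings that were not advertised should not occur; fail loudly.
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


def fetch(url: str, user_agent: str) -> bytes:
    """Request a URL with gzip support and return the decoded body."""
    request = urllib.request.Request(
        url,
        headers={"Accept-Encoding": "gzip", "User-Agent": user_agent},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return decode_body(response.read(), response.headers.get("Content-Encoding"))
```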

MUST specify a meaningful User-Agent header
Client software interacting with our APIs must follow the Wikimedia User-Agent policy, which requires that clients send a User-Agent header containing information that allows us to identify the software and provides a way to contact the maintainer in case of issues. The header must have the following form:

.

Parts that are not applicable can be omitted.

In case a User-Agent header cannot be set (e.g. because the client code is executing in a web browser which sets its own User-Agent header), client code should set the Api-User-Agent header instead.

NOTE: Clients that do not follow this policy are likely to be blocked or severely rate limited without warning, because if you are not providing contact information in the User-Agent string, we don't have a way to warn you.
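The following sketch assembles a header value from a client name, a version, contact information, and an optional library identifier. The exact layout is defined by the Wikimedia User-Agent policy; the layout used here (name/version, contact information in parentheses, then the library) is an assumption for illustration, and the function name, bot name, and contact details are hypothetical.

```python
from typing import Dict, Optional


def build_user_agent(name: str, version: str, contact: str,
                     library: Optional[str] = None) -> str:
    """Build a User-Agent value; the layout is an assumed example."""
    parts = [f"{name}/{version}", f"({contact})"]
    if library:
        # Parts that are not applicable are simply left out.
        parts.append(library)
    return " ".join(parts)


# Hypothetical client name and contact details.
headers: Dict[str, str] = {
    "User-Agent": build_user_agent(
        "ExampleBot", "1.2",
        "https://example.org/bot; bot@example.org",
        "python-urllib/3",
    )
}
print(headers["User-Agent"])
```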

SHOULD follow redirects

 * ...unless...
 * use correct semantics for 301, 302, 303, and 308, etc
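To illustrate what the semantics of the redirect status codes mean for the request method, here is a small sketch based on RFC 9110: 303 turns the follow-up request into a GET (HEAD stays HEAD), 307 and 308 preserve the method, and for 301 and 302 a client may, for historical reasons, change POST to GET. The function name redirect_method is hypothetical. Most HTTP client libraries implement these rules already, so this is only needed when handling redirects manually.

```python
def redirect_method(status: int, method: str) -> str:
    """Return the method to use when following a redirect response."""
    method = method.upper()
    if status == 303:
        # "See Other": retrieve the target with GET (or HEAD).
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Historical behavior permitted by RFC 9110: POST may become GET.
        return "GET" if method == "POST" else method
    if status in (307, 308):
        # Method and body must be preserved.
        return method
    raise ValueError(f"not a redirect status handled here: {status}")
```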

MUST surface user blocks

 * ref to block info document (403 response body)

MUST surface errors and warnings
Inform the user about:


 * server errors (5xx)
 * client errors (4xx) as appropriate

See spec for error document structure!

Inform the developer (and possibly also the user) about:


 * Deprecation headers
 * Sunset headers
 * X-WMF-Warning headers
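One way to surface the headers listed above to the developer is to check every response for them and write their values to a log. The sketch below assumes the response headers are available as a mapping of names to values; the function names are hypothetical, and the header values are passed through without interpretation.

```python
import logging
from typing import List, Mapping

WARNING_HEADERS = ("Deprecation", "Sunset", "X-WMF-Warning")

logger = logging.getLogger("api-client")


def collect_api_warnings(headers: Mapping[str, str]) -> List[str]:
    """Return one message per warning-related header in the response."""
    # Header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    messages = []
    for name in WARNING_HEADERS:
        value = lowered.get(name.lower())
        if value:
            messages.append(f"{name}: {value}")
    return messages


def log_api_warnings(url: str, headers: Mapping[str, str]) -> None:
    """Log every warning header so the developer notices it."""
    for message in collect_api_warnings(headers):
        logger.warning("%s returned %s", url, message)
```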

MUST gracefully handle HTML content when receiving errors
Client code must be prepared to process HTML responses when receiving 4xx or 5xx status codes, even if the address they are requesting data from is specified to return a machine readable format such as JSON.

The reason is that, while API endpoints should be designed to return machine-readable error descriptions, intermediate layers that process the HTTP request, such as proxies and caches, will often generate HTML bodies when something goes wrong. The client should make an effort to process the HTML response in a meaningful way.
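A minimal sketch of such handling, using only the Python standard library: the client inspects the Content-Type of an error response, uses the JSON body if there is one, and otherwise reduces an HTML body to a short plain-text message. The function name describe_error and the 200-character limit are arbitrary choices for this example, and the structure of the JSON error document is not interpreted here.

```python
import json
from html.parser import HTMLParser
from typing import List, Optional


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style content."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def describe_error(status: int, content_type: Optional[str], body: str) -> str:
    """Turn an error response into a short message, whatever its format."""
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type.endswith("json"):
        try:
            return f"HTTP {status}: {json.dumps(json.loads(body), sort_keys=True)}"
        except ValueError:
            pass  # Mislabeled body: fall through to the generic handling.
    if media_type in ("text/html", "application/xhtml+xml") or body.lstrip().startswith("<"):
        parser = _TextExtractor()
        parser.feed(body)
        return f"HTTP {status}: {' '.join(parser.chunks)[:200]}"
    return f"HTTP {status}: {body[:200]}"
```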

MUST delay retries
Software that is sending requests to our APIs MAY retry requests upon receiving errors, if the nature of the error indicates that it is likely to be transient. This is typically the case for status code 503, but also plain 500, and particularly 429. Other error codes may be interpreted as transient depending on the context: for instance, attempts to access a newly created resource may temporarily result in a 404 response due to replication lag.

Software that implements retry logic MUST ensure that retries do not happen too quickly or too often. Specifically:


 * If the response contained a Retry-After header, the client MUST wait for the specified time before retrying.
 * If the response body contains more details about the applicable rate limit, the client SHOULD use that information to determine how long it should wait until it makes another request.
 * Otherwise, the client SHOULD apply an exponential backoff strategy, starting with a delay of one second for the first retry, and doubling the wait time for each subsequent retry.
 * If exponential backoff is not implemented and no Retry-After header is received, the client SHOULD wait at least five seconds before retrying.
 * Clients MUST NOT retry a request more than ten times.
 * Clients SHOULD make an effort to avoid sending retries from multiple parallel threads or processes independently.
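The rules above can be sketched as a function that computes how long to wait before a given retry. This is a minimal example under stated assumptions: the names retry_delay and parse_retry_after are hypothetical, rate limit details in the response body are not evaluated, and coordination between parallel threads or processes is left out. Retry-After is accepted both as a number of seconds and as an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_RETRIES = 10      # never retry a request more than ten times
INITIAL_DELAY = 1.0   # seconds to wait before the first retry


def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After value to seconds, or None if unparseable."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                now: Optional[datetime] = None) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (1-based).

    Returns None when the retry limit is exhausted and the client
    should give up.
    """
    if attempt > MAX_RETRIES:
        return None
    if retry_after is not None:
        seconds = parse_retry_after(retry_after, now)
        if seconds is not None:
            return seconds
    # Exponential backoff: 1, 2, 4, 8, ... seconds.
    return INITIAL_DELAY * 2 ** (attempt - 1)
```

A caller would sleep for the returned number of seconds before sending the request again, and stop retrying once the function returns None.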

MUST NOT use APIs marked as restricted

 * (or "private", "internal")

MAY authenticate

 * authentication methods
 * csrf tokens
 * SHOULD use OAuth when acting on behalf of others
 * SHOULD auth when editing?

MAY request localization

 * content, errors