API:Etiquette

From mediawiki.org

This page contains the best practices that should be followed when using the API.

Behavior

Request limit

There is no hard speed limit on read requests, but be considerate and try not to take a site down. Most system administrators reserve the right to unceremoniously block you if you do endanger the stability of their site.

Making your requests in series rather than in parallel, by waiting for one request to finish before sending the next, should result in a safe request rate. It is also recommended that you reduce the number and size of your requests by:

  • Using the pipe character (|) whenever possible, e.g. titles=PageA|PageB|PageC, instead of making a new request for each title.
  • Using a generator instead of making a request for each result from another request.
  • Using GZip compression when making API calls, by setting Accept-Encoding: gzip, to reduce bandwidth usage.
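As a rough illustration of the batching and compression advice above, the following Python sketch joins titles with the pipe character and sets Accept-Encoding: gzip on the request. The endpoint URL, the batch size of 50, and the helper names batch_titles and build_query_request are assumptions made for this example, not part of the API.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder endpoint; substitute the api.php URL of the wiki you query.
API_URL = "https://example.org/w/api.php"
# Assumption: a conservative number of titles per request; the real limit
# depends on the wiki and on your user rights.
BATCH_SIZE = 50


def batch_titles(titles, size=BATCH_SIZE):
    """Split a list of titles into consecutive batches of at most `size`."""
    return [titles[i:i + size] for i in range(0, len(titles), size)]


def build_query_request(titles):
    """Build one GET request that asks for several titles at once."""
    params = {
        "action": "query",
        "format": "json",
        "titles": "|".join(titles),  # e.g. PageA|PageB|PageC
    }
    request = Request(API_URL + "?" + urlencode(params))
    # Ask the server to compress the response body.
    request.add_header("Accept-Encoding", "gzip")
    return request
```

The requests built this way would then be sent one after another, not in parallel. Note that urllib does not decompress gzip responses automatically, so a caller would need to check the Content-Encoding response header and use gzip.decompress on the body.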

Requests that make edits, modify state, or are otherwise not read-only are subject to rate limiting. The exact rate limit applied may depend on the type of action, your user rights, and the configuration of the website you are making the request to. You can determine the limits that apply to you with an action=query&meta=userinfo&uiprop=ratelimits request.
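A minimal Python sketch of building that rate limit query follows. The endpoint URL and the function name ratelimits_url are placeholders chosen for this example, and format=json is added on the assumption that a JSON response is wanted.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint; substitute the api.php URL of the wiki you query.
API_URL = "https://example.org/w/api.php"


def ratelimits_url():
    """Return the URL of a query asking which rate limits apply to the current user."""
    params = {
        "action": "query",
        "meta": "userinfo",
        "uiprop": "ratelimits",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)
```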



When you hit the request rate limit, you will receive an API error response with the error code ratelimited. When you encounter this error, you may retry the request, but you should increase the time between subsequent requests. A common strategy for this is exponential backoff.
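One way to apply exponential backoff is sketched below in Python. It assumes a hypothetical zero-argument callable, send, that performs the request and returns the decoded JSON response as a dict; the retry count and starting delay are arbitrary illustrative values.

```python
import time


def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` and retry with doubling pauses while the API reports `ratelimited`.

    `send` is a hypothetical callable that performs the request and returns
    the decoded JSON response as a dict.
    """
    delay = base_delay
    for attempt in range(max_retries + 1):
        response = send()
        error_code = response.get("error", {}).get("code")
        if error_code != "ratelimited":
            return response  # success, or an error that retrying will not fix
        if attempt == max_retries:
            break
        sleep(delay)
        delay *= 2  # double the wait before the next attempt
    raise RuntimeError("still rate limited after %d retries" % max_retries)
```

The sleep parameter exists only so the waiting can be replaced in tests; in normal use the default time.sleep is kept.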

Parsing of revisions

While it is possible to query for results from a specific revision number using the revid parameter, this is an expensive operation for the servers. To retrieve a specific revision use the oldid parameter. For example:
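A minimal Python sketch of a request that uses oldid, assuming the passage refers to the action=parse module. The endpoint URL, the function name parse_revision_url, and the revision ID passed in are placeholders.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint; substitute the api.php URL of the wiki you query.
API_URL = "https://example.org/w/api.php"


def parse_revision_url(revision_id):
    """Return a URL that asks action=parse for one specific revision via oldid."""
    params = {
        "action": "parse",
        "oldid": revision_id,  # the revision to retrieve
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)
```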


The maxlag parameter

If your task is not interactive, i.e. a user is not waiting for the result, you should use the maxlag parameter. The value of the maxlag parameter should be an integer number of seconds. For example:


Setting maxlag prevents your task from running when the load on the servers is high. Higher values mean more aggressive behavior; lower values are nicer.
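The following Python sketch shows one way to attach maxlag to a set of request parameters and to recognise the resulting error. The default of 5 seconds and the helper names with_maxlag and is_lag_error are assumptions for this example; the check relies on the API reporting the condition with the error code maxlag.

```python
def with_maxlag(params, seconds=5):
    """Return a copy of `params` with the maxlag parameter added.

    Assumption: 5 seconds is used here as a non-aggressive default.
    """
    if not isinstance(seconds, int) or seconds < 0:
        raise ValueError("maxlag must be a non-negative integer number of seconds")
    merged = dict(params)
    merged["maxlag"] = seconds
    return merged


def is_lag_error(response):
    """Return True if a decoded JSON response reports that the servers are lagged."""
    return response.get("error", {}).get("code") == "maxlag"
```

A non-interactive task would typically pause and retry later when is_lag_error returns True, instead of retrying immediately.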

See Manual:Maxlag parameter for more details.

The User-Agent header

It is best practice to set a descriptive User-Agent header. To do so, use the format User-Agent: clientname/version (contact information, e.g. username, email) framework/version... For example, in PHP:

ini_set('user_agent', 'MyCoolTool/1.1 (https://example.org/MyCoolTool/; MyCoolTool@example.org) UsedBaseLibrary/1.4');

Do not simply copy the user agent of a popular web browser. A descriptive User-Agent ensures that if a problem does arise, it is easy to track down where it originates.
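For clients written in Python, a comparable sketch using only the standard library might look like this. The helper names make_user_agent and build_request are hypothetical, and the tool name and contact details are the same placeholders used in the PHP example.

```python
from urllib.request import Request


def make_user_agent(client, version, contact, library=None):
    """Assemble a string of the form clientname/version (contact) framework/version."""
    user_agent = "%s/%s (%s)" % (client, version, contact)
    if library:
        user_agent += " " + library
    return user_agent


def build_request(url, user_agent):
    """Create a request object that carries the descriptive User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


# Placeholder values mirroring the PHP example above.
USER_AGENT = make_user_agent(
    "MyCoolTool",
    "1.1",
    "https://example.org/MyCoolTool/; MyCoolTool@example.org",
    "UsedBaseLibrary/1.4",
)
```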

If you are calling the API from browser-based JavaScript, you may not be able to influence the User-Agent header, depending on the browser. To work around this, use the Api-User-Agent header.

See m:User-Agent_policy for more details.

Data formats

All new API users should use JSON. See API:Data formats for more details.

Performance

Downloading data in bulk through the Action API is not always efficient. On Wikimedia wikis, there are faster ways to get data in bulk; see m:Research:Data and wikitech:Portal:Data Services for more details.

Other notes

If your requests obtain data that can be cached for a while, you should take steps to cache it, so you don't request the same data over and over again. Some clients may be able to cache data themselves, but for others (particularly JavaScript clients), this is not possible.

Whenever you're reading data from the web service API, you should try to use GET requests if possible, not POST, as the latter are not cacheable.
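A client-side cache for read requests can be sketched in Python as follows. The class name CachedClient, the five-minute lifetime, the endpoint URL, and the fetch callable (a hypothetical function that performs a GET request for a URL and returns the decoded response) are all assumptions for this example.

```python
import time
from urllib.parse import urlencode

# Assumption: placeholder endpoint; substitute the api.php URL of the wiki you query.
API_URL = "https://example.org/w/api.php"


class CachedClient:
    """Keep responses to GET requests in memory for a limited time."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: url -> decoded response
        self._ttl = ttl_seconds  # how long a cached response stays valid
        self._clock = clock
        self._cache = {}         # url -> (expiry time, response)

    def get(self, params):
        # Sort the parameters so equivalent requests map to the same URL.
        url = API_URL + "?" + urlencode(sorted(params.items()))
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no request is sent
        response = self._fetch(url)
        self._cache[url] = (now + self._ttl, response)
        return response
```

Keying the cache on the full GET URL means that two requests with the same parameters in a different order share one cache entry.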

See also