Requests for comment/Minimalistic MW API Client Lib Specification

From MediaWiki.org
Request for comment (RFC)
Minimalistic MW API Client Lib Specification
Component: General
Creation date: 2015-06-02
Author(s): Yurik
Document status: in draft

Intro

We have had a number of good discussions about the "minimal API lib", especially in relation to refactoring the pywikibot framework. Let me propose a minimalistic interface for discussion. It should be useful by itself to those who wish to call the API directly, while also being convenient for bigger frameworks such as pywikibot. This is by no means final; it simply establishes a base for discussion. Most of this spec is modelled on the API client lib I wrote for the Zero team.

Expectations

  • The lib will be used as the communication layer with all MediaWiki installations
  • The lib can be easily reviewed by the server-side API developers
  • The lib will allow all complex decisions (e.g. whether to repeat a request) to be made by overridable callbacks
  • The lib will encourage efficient API usage
  • The lib must be minimalistic, relying on as few third-party components as possible, so that it is easily vetted and usable in a secure environment
  • Python lib: Support for Python 2 & 3

InfoObject

The InfoObject is used to report any errors and warnings returned by the API. The API lib may also generate these objects for debugging. It contains:

  • level - a comparable type (enum) with fatal/error/warn/info/debug.
  • __str__ - a way to convert the object to string
  • code - error code string as given by MW API
  • message - raw string as given by MW API
  • params - (might be None) an array of values in case API error/warning is formatted on the client (Future)
  • lang - (might be None) language code of the message (Future)
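
The fields above could be sketched roughly as follows. This is only an illustration: the `Level` enum name, its numeric values, and the `__str__` output format are assumptions, not part of the spec. The `enum` module is in the standard library from Python 3.4 on; Python 2 would need a backport.

```python
import enum


class Level(enum.IntEnum):
    """Severity levels; integer values (assumed) make them comparable."""
    DEBUG = 10
    INFO = 20
    WARN = 30
    ERROR = 40
    FATAL = 50


class InfoObject(object):
    """Carries one error, warning, or debug message."""

    def __init__(self, level, code, message, params=None, lang=None):
        self.level = level      # a Level member
        self.code = code        # error code string as given by the MW API
        self.message = message  # raw string as given by the MW API
        self.params = params    # may be None (Future)
        self.lang = lang        # may be None (Future)

    def __str__(self):
        # The exact string format is an assumption of this sketch.
        return "[{0}] {1}: {2}".format(self.level.name, self.code, self.message)
```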

Logging

A generic logging interface, e.g.

    def __call__(self, infoObject): ...

The lib will provide a simple ConsoleLog() object to output everything to console.
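
A minimal sketch of what ConsoleLog() might look like, assuming it simply relies on the object's string conversion. The optional `stream` argument is an assumption added here to make the class easy to test; the spec only says that everything goes to the console.

```python
import sys


class ConsoleLog(object):
    """Writes every reported object to a stream (stderr by default)."""

    def __init__(self, stream=None):
        # 'stream' is a hypothetical parameter, not part of the spec.
        self.stream = stream if stream is not None else sys.stderr

    def __call__(self, info_object):
        # Uses the object's __str__ conversion and appends a newline.
        self.stream.write(str(info_object) + "\n")
```

Any callable with the same signature could be substituted, for example one that forwards to the standard `logging` module.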

Errors

  • All api-result errors are raised as ApiError objects that contain all InfoObject fields. Exceptions from the lower levels, such as IO errors, are not wrapped, and will need to be handled separately.
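
As a rough sketch, an ApiError could be an ordinary exception carrying the same fields. The helper `raise_on_error` is a hypothetical name; it assumes the usual API response shape in which an `error` member holds `code` and `info`. The `level` is kept as a plain string here to keep the sketch self-contained.

```python
class ApiError(Exception):
    """Raised when the parsed API response contains an error."""

    def __init__(self, code, message, params=None, lang=None):
        super(ApiError, self).__init__("{0}: {1}".format(code, message))
        self.level = "error"  # a plain string in this sketch
        self.code = code
        self.message = message
        self.params = params
        self.lang = lang


def raise_on_error(data):
    """Return data unchanged, or raise ApiError if it has an 'error' member."""
    error = data.get("error")
    if error is not None:
        raise ApiError(error.get("code"), error.get("info"))
    return data
```

IO errors and other lower-level exceptions pass through untouched, so callers handle them separately from ApiError.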

Warnings

All warnings will be reported to the logging interface as InfoObjects.

Site

An object that represents one api.php endpoint, wrapping a `requests` library session, with these properties:

  • url: Full url to site's api.php
  • session: current requests.Session object (allows session sharing between objects)
  • log: an object that will be used for logging. ConsoleLog is created by default
  • user-agent string
  • extra headers
  • tokens
  • callbacks

request(method, forceSSL, headers, **request_kw)

  • Low-level request method
  • Injects required headers
  • Throws low-level call errors (e.g. a non-200 response)

call(action, **kwargs)

  • The main api-calling function
  • Converts string[] => "str|str|str", bool => 1|0, DateTime => "timestamp"
  • Auto-adds format=json, formatversion=2
  • Auto-uses POST if action=login|edit
  • Auto-uses SSL if action=login
  • Raises ApiError if error in data
  • Logs warnings
  • Returns parsed data (simple JSON conversion, not intelligent field parsing like timestamps->DateTime)
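
The parameter handling described above could look roughly like this. The function names are hypothetical, and the timestamp format is an assumption (ISO 8601 in UTC, for a naive `datetime` already in UTC), since the spec only says "timestamp".

```python
from datetime import datetime


def encode_value(value):
    """Convert one Python value to its API string form."""
    # bool must be checked before anything else: it is a subclass of int.
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, datetime):
        # Assumed format; the value is treated as UTC.
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, (list, tuple)):
        return "|".join(encode_value(v) for v in value)
    return str(value)


def build_params(action, **kwargs):
    """Build the request parameters, adding format=json and formatversion=2."""
    params = {"action": action, "format": "json", "formatversion": "2"}
    for key, value in kwargs.items():
        params[key] = encode_value(value)
    return params


def http_method(action):
    """POST for login and edit, GET otherwise."""
    return "POST" if action in ("login", "edit") else "GET"
```

For example, `build_params("query", titles=["A", "B"], redirects=True)` produces `titles="A|B"` and `redirects="1"` alongside the automatically added format parameters.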

login(user, password)

  • Gets session cookies for the given user
  • Throws error on failure
  • Auto-gets token if needed

query(**kwargs)

  • Forces 'continue' param (error if rawcontinue is used)
  • Yields the result['query']
  • Performs proper continuation
  • TBD: on an error or warning, query() calls an overridable callback that decides whether the query should be continued. If not, the error or warning is raised as an ApiError instead of yielding any results. The callback should be able to override parameters of the initial request, perform additional requests (e.g. to log in or get cookies), and pause execution (by simply not returning right away); what else it should be able to do is still open. The default behaviour on an error is to log in or get cookies if needed, and never to repeat the request. On a warning, the default is to log it. It is also TBD whether this should be integrated into call() instead of query().
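
The continuation part of query() can be sketched as a generator. Here `call` is passed in as a plain callable standing in for the Site's call() method, which is an assumption made to keep the example self-contained. The error and warning callback is left out because it is still TBD.

```python
def query(call, **kwargs):
    """Yield result['query'] for each response, following continuation."""
    if "rawcontinue" in kwargs:
        raise ValueError("rawcontinue is not supported; use 'continue'")
    request = dict(kwargs)
    # An empty 'continue' parameter asks the API for the new continuation style.
    request.setdefault("continue", "")
    while True:
        result = call("query", **request)
        if "query" in result:
            yield result["query"]
        if "continue" not in result:
            break
        # Restart from the original parameters and merge in the values
        # the API returned under 'continue'.
        request = dict(kwargs)
        request.update(result["continue"])
```

Because this is a generator, no request is sent until the caller starts iterating, and iteration can be abandoned at any point without fetching the remaining batches.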

queryPages(**kwargs)

  • Uses query()
  • Assembles all requested props for each page
  • Yields one page object at a time - a dict with filled-out properties
  • Attempts to detect "page has changed since the beginning of the iteration" condition, and throws an error once the iteration is complete
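
The property-assembly step could be sketched as below. This is a deliberately simplified illustration: it takes an iterable of batches (such as those yielded by query()), assumes the formatversion=2 shape where `pages` is a list of dicts, keys pages by `pageid` (falling back to `title`), and only yields pages after all batches have been read. It does not attempt the "page has changed" detection. The names `merge_page` and `query_pages` are hypothetical.

```python
def merge_page(target, update):
    """Merge one partial page dict into the accumulated page dict."""
    for key, value in update.items():
        existing = target.get(key)
        if isinstance(value, list) and isinstance(existing, list):
            existing.extend(value)     # e.g. more links for the same page
        elif isinstance(value, list):
            target[key] = list(value)  # copy, so the input is not mutated
        else:
            target[key] = value


def query_pages(batches):
    """Yield one assembled page dict at a time, in first-seen order."""
    pages = {}
    order = []
    for batch in batches:
        for page in batch.get("pages", []):
            key = page.get("pageid", page.get("title"))
            if key not in pages:
                pages[key] = {}
                order.append(key)
            merge_page(pages[key], page)
    for key in order:
        yield pages[key]
```

A real implementation would likely yield each page as soon as it is known to be complete rather than buffering the whole result set.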

token(type)

  • Returns a token of a given type (csrf by default). Could be cached.
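
One possible shape for a cached token lookup, assuming tokens are fetched with action=query, meta=tokens and returned under `query.tokens` as `<type>token`. The `TokenCache` class and its `call` argument (a stand-in for the Site's call() method) are hypothetical names used only for this sketch.

```python
class TokenCache(object):
    """Fetches tokens on demand and remembers them per type."""

    def __init__(self, call):
        self._call = call
        self._tokens = {}

    def token(self, type="csrf"):
        if type not in self._tokens:
            result = self._call("query", meta="tokens", type=type)
            self._tokens[type] = result["query"]["tokens"][type + "token"]
        return self._tokens[type]

    def invalidate(self, type="csrf"):
        """Forget a cached token, e.g. after the API reports it as bad."""
        self._tokens.pop(type, None)
```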

Utilities