API:Query

The Query API module allows applications to retrieve specific pieces of data from the MediaWiki databases, and is loosely based on the Query API interface currently available on all MediaWiki servers. Any data modification must first acquire a token (prop=info&intoken=..., or prop=revisions&rvtoken=...) to prevent abuse by malicious sites.

The API understands several types of queries, all of which may be combined in one request:
 * Meta: information about the whole site and the logged-in user.
 * Revisions: data about page content and revision history.
 * Page information: any data contained in the page text itself, such as lists of links, templates, categories, and interwiki links.
 * Listings: pages that match the request parameters, such as all pages or pages in a category.

Title Normalization (done)

 * Converts improper page titles to their proper form: capitalizes the first character, replaces '_' with ' ', changes canonical namespace names to their localized alternatives, etc.
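The first two rules above can be sketched in a few lines of Python. This is only an illustration, not the server's code: the function name `normalize_title` is hypothetical, and namespace localization and the other rules covered by "etc." are deliberately left out.

```python
def normalize_title(title):
    """Simplified normalization: '_' becomes ' ', first character is capitalized.

    Assumption: namespace prefixes are NOT handled here; the real
    normalization also rewrites canonical namespace names.
    """
    text = title.replace("_", " ").strip()
    return text[:1].upper() + text[1:]


print(normalize_title("page1"))
print(normalize_title("main_page"))
```

With these two rules alone, 'page1' becomes 'Page1' and 'main_page' becomes 'Main page'.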

Redirects (done)

 * Redirects can be resolved by the server, so that the target of the redirect is returned instead of the given title. Using redirects without an additional prop=... element is not very useful, but it shows how the redirect function works. The 'redirects' section will contain the target of the redirect and its namespace code, if non-zero. Both normalization and redirection may take place in the same request. For a redirect to a redirect, all redirects are resolved; for a circular redirect, there might not be a page in the 'pages' section.

Circular Redirects (done)

 * Assume Page1 → Page2 → Page3 → Page1 (circular redirect). Also, in this example a non-normalized name 'page1' is used.

Request:
 api.php ? action=query & titles=page1 & redirects
Result:
 api:
   query:
     redirects:
       Page1: Page2     Redirects are present, but not the 'pages' element.
       Page2: Page3
       Page3: Page1
     normalized:
       page1: Page1
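One way to picture how a circular chain ends up with redirects but no page is the sketch below. It assumes the redirect table is available as a plain dictionary; the function name `resolve_redirects` and the data layout are hypothetical, not part of the API.

```python
def resolve_redirects(title, redirects):
    """Follow redirects starting at title.

    Returns (target, followed). target is None when the chain is circular,
    which mirrors a response with a 'redirects' section but no 'pages' entry.
    """
    followed = {}
    current = title
    while current in redirects:
        if current in followed:          # already visited: the chain loops
            return None, followed
        followed[current] = redirects[current]
        current = redirects[current]
    return current, followed


# The circular chain from the example: Page1 -> Page2 -> Page3 -> Page1
circular = {"Page1": "Page2", "Page2": "Page3", "Page3": "Page1"}
target, followed = resolve_redirects("Page1", circular)
print(target, followed)
```

For the circular chain, `target` is `None` while `followed` still lists all three redirects. A non-circular chain returns the final page instead.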

Limits

 * To prevent server overload, each query imposes a limit on how many items it can process. Anonymous and logged-in users share one limit, while bots and sysops have a considerably higher limit because they are trusted by the community. At present, each query simply lists the maximum request size it allows. For example, the allpages list allows aplimit= to be set no higher than 500, or no higher than 5000 for a bot or a sysop.
 * Drawbacks: Currently all limits are additive, so a user who requests both allpages and backlinks will get 500 of each. This is undesirable, because the more items are combined into one request, the heavier the load on the server. Instead, some sort of weighted mechanism could be developed, where each request item has a certain "cost" associated with it and each user is allocated a fixed allowance per request. The more information the user requests, the lower the limit becomes for that request. Unfortunately, that makes it very hard to figure out the maximum limits before executing the query, so it might not be a workable solution.
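The two schemes discussed above can be contrasted in a small sketch. The 500 and 5000 values come from the allpages example; everything else is an assumption made for illustration: the function names, the choice to clamp an oversized request rather than reject it, and the cost and allowance numbers in the weighted variant.

```python
STANDARD_LIMIT = 500    # anonymous and logged-in users (allpages example)
ELEVATED_LIMIT = 5000   # bots and sysops (allpages example)


def effective_limit(requested, is_bot_or_sysop):
    """Per-query limit as described above. Clamping is an assumption;
    a server could equally reject a request that exceeds the maximum."""
    maximum = ELEVATED_LIMIT if is_bot_or_sysop else STANDARD_LIMIT
    return max(1, min(requested, maximum))


def weighted_limits(costs, allowance):
    """Hypothetical weighted scheme: split a fixed allowance evenly across
    the requested items, then scale each share down by the item's cost."""
    share = allowance // len(costs)
    return {name: share // cost for name, cost in costs.items()}


# Additive limits: each list gets its own full limit.
print(effective_limit(1000, False) + effective_limit(1000, False))

# Weighted limits: invented costs, invented allowance of 1000 per request.
print(weighted_limits({"allpages": 1}, 1000))
print(weighted_limits({"allpages": 1, "backlinks": 2}, 1000))
```

In the weighted variant, adding backlinks to the request lowers the allpages limit as well, which is exactly what makes the maximum hard to predict before the query runs.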

Query - Generators (done)
A generator is a way to use one of the queries above instead of the titles= parameter. The output of the generator must be a list of pages, whose titles are automatically used in place of the titles=/revids=/pageids= parameters. Other queries, such as content or revisions, treat those pages as if the user had provided them in the titles= parameter. Only one generator is allowed, and while it is possible to have both generator= and list= parameters in the same call, they may not contain the same values.
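The two constraints in the last sentence can be expressed as a small check. This is a client-side sketch only; the function name `check_generator_params` and the wording of the messages are hypothetical, and the server's own validation may differ.

```python
def check_generator_params(params):
    """Return a list of problems with generator=/list=; empty means OK."""
    problems = []
    generators = [g for g in params.get("generator", "").split("|") if g]
    lists = [name for name in params.get("list", "").split("|") if name]
    if len(generators) > 1:
        problems.append("only one generator is allowed")
    overlap = sorted(set(generators) & set(lists))
    if overlap:
        problems.append("generator= and list= both contain: " + ", ".join(overlap))
    return problems


print(check_generator_params({"generator": "allpages", "list": "backlinks"}))
print(check_generator_params({"generator": "allpages", "list": "allpages|backlinks"}))
```

The first call passes because the generator and the list differ; the second reports allpages as used in both parameters.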

Using allpages as generator
Use the allpages list as a generator to get the links and categories for all titles returned by allpages.

Request:
 api.php ? action=query & generator=allpages & gapnamespace=10 & gaplimit=10 & gapfrom=A & prop=links|categories
Result:
 api:
   query:
     pages:
       Template:A-Article:
         id: 12341
         ns: 10
         links:
           Linked Article1:           Linked Article1 is in the main namespace
           Talk:Linked Article2:      For non-main ns, list it as a sub-element
             ns: 1
           ...
         categories:
           Category:Cat1:
           Category:Cat2:
           ...
       Template:B-Article: ...
       Template:C-Article: ...
   query-status:
     allpages:
       continue: gapfrom=D-Article     The next item in this list would have been Template:D-Article.

Generators and redirects
Here, we use the "links" page property as a generator. This query gets all the links from all the pages that are linked from Title. For this example, assume that Title links to TitleA and TitleB, and that TitleB is a redirect to TitleC. TitleA links to TitleA1, TitleA2, and TitleA3; TitleC links to TitleC1 and TitleC2. The redirect is resolved because of the "redirects" parameter.


The query will execute the following steps:
 * Resolve redirects for the titles parameter.
 * For all pages specified in the titles=...|... parameter, get all links, and replace the original titles=...|... parameter with the resulting list of titles.
 * Resolve redirects for the new titles list.
 * Execute a regular prop=links query using the internally created list of titles.

Request:
 api.php ? action=query & generator=links & titles=Title & prop=links & redirects
Result:
 api:
   query:
     pages:
       TitleA:
         links:
           TitleA1:
           TitleA2:
           TitleA3:
       TitleC:
         links:
           TitleC1:
           TitleC2:
     redirects:
       TitleB: TitleC
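The four steps can be traced with an in-memory model of the example wiki. This is a sketch only: the `LINKS` and `REDIRECTS` tables, the function names, and the shape of the returned dictionary are all hypothetical stand-ins for the server's database and output format.

```python
# Example data from the text: Title links to TitleA and TitleB,
# and TitleB is a redirect to TitleC.
LINKS = {
    "Title": ["TitleA", "TitleB"],
    "TitleA": ["TitleA1", "TitleA2", "TitleA3"],
    "TitleC": ["TitleC1", "TitleC2"],
}
REDIRECTS = {"TitleB": "TitleC"}


def resolve(titles, used):
    """Replace each redirect by its final target, recording redirects in used.
    A circular chain yields no page at all."""
    resolved = []
    for title in titles:
        seen = set()
        while title in REDIRECTS:
            if title in seen:            # circular chain: drop the title
                title = None
                break
            seen.add(title)
            used[title] = REDIRECTS[title]
            title = REDIRECTS[title]
        if title is not None and title not in resolved:
            resolved.append(title)
    return resolved


def links_generator_query(titles):
    used = {}
    start = resolve(titles, used)                                        # step 1
    generated = [link for t in start for link in LINKS.get(t, [])]       # step 2
    pages = resolve(generated, used)                                     # step 3
    return {                                                             # step 4
        "pages": {p: {"links": LINKS.get(p, [])} for p in pages},
        "redirects": used,
    }


print(links_generator_query(["Title"]))
```

For titles=Title this yields pages TitleA and TitleC with their links, plus the TitleB to TitleC redirect, matching the result shown above.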

Examples

 * Show info about 4 pages starting at the letter "T"
 * http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=4&gapfrom=T&prop=info


 * Show content of the first 2 non-redirect pages beginning at "Re"
 * http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=2&gapfilterredir=nonredirects&gapfrom=Re&prop=revisions&rvprop=content
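
The example URLs above can also be assembled programmatically. The sketch below uses only the Python standard library and makes no network request; the helper name `build_query_url` is hypothetical, and the endpoint is the one used in the examples.

```python
from urllib.parse import urlencode

API_URL = "http://en.wikipedia.org/w/api.php"


def build_query_url(**params):
    """Build an action=query URL; parameters keep the order they are given in."""
    return API_URL + "?" + urlencode({"action": "query", **params})


# Info about 4 pages starting at the letter "T"
print(build_query_url(generator="allpages", gaplimit=4, gapfrom="T", prop="info"))

# Content of the first 2 non-redirect pages beginning at "Re"
print(build_query_url(generator="allpages", gaplimit=2,
                      gapfilterredir="nonredirects", gapfrom="Re",
                      prop="revisions", rvprop="content"))
```

Values containing '|' are percent-encoded by `urlencode`, which the API accepts in the same way as the literal character.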