API:Query

Overview
The Query API module allows applications to get the pieces of data they need from the MediaWiki databases. It is loosely based on the Query API interface currently available on all MediaWiki servers. All data modifications will first have to use query to acquire a token, which prevents abuse from malicious sites.

Title Normalization (done)

 * Converts improper page titles to their proper form. Capitalizes first character, replaces '_' with ' ', changes canonical namespace names to their localized alternatives, etc.

Request (note that articleA's first letter is not capitalized):
 api.php ? action=query & titles=Project:articleA|ArticleB
Result:
 api:
   query:
     pages:
       Wikipedia:ArticleA:       Project: is converted to Wikipedia: when running on en-wiki.
         ns: 4                   Show title's namespace except when ns=0
       ArticleB:
     normalized:                 Any requested titles not in the "proper" form will be here
       Project:articleA: Wikipedia:ArticleA


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&titles=Project:articleA|ArticleB
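The normalization rules above can be approximated on the client side. The following is only a minimal sketch: the `LOCAL_NAMESPACES` mapping and both function names are hypothetical, and the server applies more rules (the "etc." above) than the three shown here.

```python
# Hypothetical mapping from canonical to localized namespace names
# (assumption: an en-wiki style site where "Project" becomes "Wikipedia").
LOCAL_NAMESPACES = {"Project": "Wikipedia"}


def _ucfirst(text):
    """Capitalize only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def normalize_title(title):
    """Approximate the normalization described above for one title."""
    title = title.replace("_", " ").strip()
    prefix, sep, rest = title.partition(":")
    if sep and _ucfirst(prefix) in LOCAL_NAMESPACES:
        # Known canonical namespace: swap in the localized name.
        return LOCAL_NAMESPACES[_ucfirst(prefix)] + ":" + _ucfirst(rest.strip())
    # Unknown prefixes are treated as part of the page name in this sketch.
    return _ucfirst(title)


def normalized_map(titles):
    """Build a {requested: normalized} mapping like the 'normalized' element."""
    return {t: normalize_title(t) for t in titles if normalize_title(t) != t}
```

Calling `normalized_map(["Project:articleA", "ArticleB"])` yields an entry only for `Project:articleA`, mirroring the 'normalized' element in the result above.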

Redirects (done)

 * Redirects can be resolved by the server, so that the target of the redirect is returned instead of the given title. This example is not very useful without an additional prop=... element, but it shows how the redirects parameter is used. The 'redirects' section will contain the target of each redirect, along with its namespace code when that code is non-zero. Both normalization and redirection may take place. In the case of a redirect to a redirect, all redirects will be resolved; in the case of a circular redirect, there might not be a page in the 'pages' section.

Request:
 api.php ? action=query & titles=Main page & redirects
Result:
 api:
   query:
     pages:
       Main Page:
     redirects:
       Main page: Main Page
 * The same request without the "redirects" parameter would treat "Main page" as a regular page, so revisions and other information may be obtained. In order to see that it is a redirect, the basic page info must be requested using prop=info.

Request:
 api.php ? action=query & titles=Main page & prop=info
Result:
 api:
   query:
     pages:
       Main page:
         id: 12342
         redirect:


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&titles=Main%20page&redirects
 * http://en.wikipedia.org/w/api.php?action=query&titles=Main%20page

Circular Redirects (done)

 * Assume Page1 → Page2 → Page3 → Page1 (circular redirect). Also, in this example a non-normalized name 'page1' is used.

Request:
 api.php ? action=query & titles=page1 & redirects
Result:
 api:
   query:
     redirects:
       Page1: Page2            Redirects are present, but not the 'pages' element.
       Page2: Page3
       Page3: Page1
     normalized:
       page1: Page1
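To illustrate how a redirect chain can end either at a real page or in a loop, here is a small sketch that follows redirects through an in-memory table. The function name and the `redirects` dictionary are assumptions made for illustration; the server's actual implementation is not described in this document.

```python
def resolve_redirects(title, redirects):
    """Follow redirects starting at title.

    Returns (target, hops). target is the final page title, or None when
    the chain loops back on itself. hops maps every redirect that was
    followed to its target, like the 'redirects' element in the result.
    """
    hops = {}
    current = title
    while current in redirects:
        if current in hops:
            # We have already followed this redirect: circular chain.
            return None, hops
        hops[current] = redirects[current]
        current = redirects[current]
    return current, hops


# The circular example above: no page survives, but all hops are reported.
circular = {"Page1": "Page2", "Page2": "Page3", "Page3": "Page1"}
target, hops = resolve_redirects("Page1", circular)
```

For the circular table, `target` is `None` and `hops` lists all three redirects, which matches a result that has a 'redirects' element but no 'pages' element.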

Limits

 * To prevent server overloads, each query imposes a limit on how many items it can process. Anonymous and logged-in users have one limit, while bots and sysops have a considerably higher limit because they are trusted by the community. At present, each query simply lists the maximum request size it allows. For example, the allpages list will allow aplimit= to be set no higher than 500, or no higher than 5000 for a bot or a sysop.
 * Drawbacks: Currently all limits are additive, so if the user requests allpages and backlinks, the user will get 500 of each. This is not ideal, because the more items are combined into one request, the heavier the load on the server. Instead, some sort of weighted mechanism should be developed, where each request item has a certain "cost" associated with it, and each user is allocated a fixed allowance per request. The more information the user requests, the lower the limit becomes for that request. Unfortunately, that makes it very hard to figure out the maximum limits before executing the query, so it might not be a workable solution.
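A client can keep its requests within the current per-query limits before sending them. The sketch below uses the 500/5000 figures quoted above as defaults; the function name is hypothetical, and whether the server clamps or rejects an oversized limit is not stated here, so this clamps on the client side as a precaution.

```python
def effective_limit(requested, is_bot_or_sysop, normal_max=500, high_max=5000):
    """Clamp a requested limit to the ceiling for the user's class."""
    if requested < 1:
        raise ValueError("limit must be at least 1")
    ceiling = high_max if is_bot_or_sysop else normal_max
    return min(requested, ceiling)
```

Lists with different maximums (for example 200/1000) can pass their own `normal_max` and `high_max` values.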

Query - Meta-Information
Meta queries allow clients to retrieve data about the MediaWiki installation itself, such as its settings.

To get meta information, clients will use meta= parameter: api.php ? action=query & meta=siteinfo|userinfo & ...
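Multi-valued parameters such as meta= are joined with '|'. As a rough sketch of how a client might assemble such a request, the helper below builds the URL with the standard library; `build_query_url` is a hypothetical name, and it only constructs the string without contacting any server.

```python
from urllib.parse import urlencode


def build_query_url(base_url, **params):
    """Build an action=query URL, joining list values with '|'."""
    query = {"action": "query"}
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            value = "|".join(str(item) for item in value)
        query[name] = value
    # Keep '|' and ':' readable instead of percent-encoding them.
    return base_url + "?" + urlencode(query, safe="|:")


url = build_query_url(
    "http://en.wikipedia.org/w/api.php",
    meta=["siteinfo", "userinfo"],
)
```

Here `url` ends in `action=query&meta=siteinfo|userinfo`. Spaces in values are encoded as '+', so a title such as "Main page" becomes `Main+page`.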

siteinfo / si (done)

 * Returns overall site information.
 * Parameters: siprop=general|namespaces|interwikimap, sifilteriw=local|!local


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo


 * Future
 * * Possible future addition: include the license that's being selected for the wiki's content (e.g. GFDL, CC-SA, no license specified, etc.).

userinfo / ui

 * Returns information about the current user. This will be implemented similar to the method used by query.php ([/w/query.php?what=userinfo&uiisblocked&uihasmsg&uiextended query.php example]).
 * Parameters: uiprop=isblocked|hasmsg|rights|groups, uioptions= |...


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&meta=userinfo

Query - Page Information
Page information items are used to get various data about a list of pages specified with the titles=, pageids=, or revids= parameters, or by using a generator. Content, links, interwiki links, and other information may be obtained.

info / in (done except tokens)

 * Gets the basic page information such as pageid, last revid, redirect, last touched, etc. Limit: 500/5000.
 * Parameters: intokens=edit|rollback|delete|protect|move
 * Issues: Should tokens for rollback/delete/protect/move be available in this way, as opposed to having an action= for each task? There is a potential for abuse: someone might put a link to the wiki on their website, and that link would contain a "delete" action. If a logged-in admin clicks on that link, the api will recognize them because of their cookie and will allow the deletion.

Request:
 api.php ? action=query & prop=info & titles=TitleA
Result:
 api:
   query:
     pages:
       TitleA:
         id: 12341
         lastrev: 23456
         touched: 20060908025739

categories / cl (done)

 * Gets a list of all categories used on the provided pages.
 * Parameters: clprop=sortkey (optional)

Request:
 api.php ? action=query & prop=categories & titles=TitleA
Result:
 api:
   query:
     pages:
       TitleA:
         categories:
           Category:Cat1:
           Category:Cat2:

Content (done)

 * Requesting content should be done by requesting the last revision with content property.

api.php ? action=query & prop=revisions & rvprop=content & titles=ArticleA|ArticleB

imageinfo / ii

 * Gets image information for any titles in the image namespace (#6).
 * Parameters: iiprop=url|history|comment|stats|user|timestamp, iisource=local/shared/all (dflt=local)
 * url - path to the image; history - include all old image versions; stats - image size/type; user - uploader; iisource - look at the local or shared (commons) image repository, or both.
 * Example: Get comments for all image uploads, both local and in the commons repository. Here, ImageA was uploaded 3 times to the local wiki, and 2 times to the shared (commons) repository.

Request:
 api.php ? action=query & prop=imageinfo & titles=Image:ImageA & iiprop=comment|history & iisource=all
Result:
 api:
   query:
     pages:
       Image:ImageA:
         ns: 6
         imageinfo:
           local:
             comment: last update comment
           localhistory:               history is an unordered list of items
             - comment: some update
             - comment: another update
           shared:
             comment: last update on commons
           sharedhistory:
             - comment: some update on commons

langlinks / ll (done)

 * Gets a list of all language links (interwikis) from the provided pages to other languages. Limit: 200/1000.

links / pl (done)

 * Gets a list of all links from the provided pages. Limit: 200/1000.
 * Parameters: plnamespace (flt).

templates / tl (done)

 * Gets a list of all templates used on the provided pages. Limit: 200/1000.

images / im (done)
''In the Query API interface, this command found pages that embedded the given image. That function has been renamed to imageusage.''
 * Gets a list of all images used on the provided pages. Limit: 200/1000.

extlinks / el (done)

 * Gets a list of all external urls on the page(s). Limit: 200/1000.

Query - Revisions (done)
Returns revisions for a given article based on the selection criteria. Revisions may be used with multiple titles only when working with the latest revision. When using rvlimit, rvdir=newer, rvstart, or rvend parameters, titles= must have only one title listed. By default, revisions shows only the id of the last revision.

Request:
 api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp|user|comment|content
Result:
 api:
   query:
     pages:
       ArticleA:
         id: 12345
         lastrev: 67890
         revisions:
           67890:
             timestamp: 20060908025739
             user: UserX
             comment: ...change comment...
             content: ...raw revision content...
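Once a result of this shape has been decoded, pulling out the latest content is a matter of walking the nested elements. The sketch below assumes the response has already been parsed into nested Python dictionaries that mirror the layout in the Result above; the actual wire format and the function name are assumptions.

```python
def latest_revision_content(result, title):
    """Return the content of the last revision of title, or None."""
    page = result["api"]["query"]["pages"][title]
    revision = page["revisions"][page["lastrev"]]
    # 'content' is present only when rvprop included content.
    return revision.get("content")


# Sample data copied from the Result layout above.
sample = {
    "api": {"query": {"pages": {"ArticleA": {
        "id": 12345,
        "lastrev": 67890,
        "revisions": {67890: {
            "timestamp": "20060908025739",
            "user": "UserX",
            "comment": "...change comment...",
            "content": "...raw revision content...",
        }},
    }}}}
}
text = latest_revision_content(sample, "ArticleA")
```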


 * Additional 'revisions' samples
 * Get the timestamps of up to 10 revisions, beginning at 2006-09-01 and moving forward in time.

api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp & rvlimit=10 & rvdir=newer & rvstart=20060901000000
 * Get the timestamps of all revisions for the entire month of September 2006. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain  'continue':'rvstart=20060920122343'  with the timestamp to continue from.

api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp & rvstart=20060901000000 & rvend=20061001000000
 * Get the timestamps of up to 10 revisions, beginning at 12345 and moving back in time. If more than 10 revisions are available, 'revisions' element will contain  'continue':'revids=23512' , where revid is the next revision id in order.

api.php ? action=query & prop=revisions & revids=12345 & rvprop=timestamp & rvlimit=10 & rvdir=older
 * Get the timestamps of all revisions between two given revision IDs. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain  'continue':'rvstartid=23512'  with the revid to continue from. Both rvstartid & rvendid must belong to the same title. The titles= parameter is not required, but if given, it must be set to the same title as revision IDs.

api.php ? action=query & prop=revisions & rvprop=timestamp & rvstartid=12345 & rvendid=67890
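The rvstart and rvend values in these samples are 14-digit timestamps of the form YYYYMMDDHHMMSS. As an illustration, the sketch below formats a `datetime` in that form and builds the parameters for a whole-month request like the September 2006 sample; both function names are hypothetical.

```python
from datetime import datetime


def to_api_timestamp(moment):
    """Format a datetime as the 14-digit timestamp used in the samples."""
    return moment.strftime("%Y%m%d%H%M%S")


def month_range_params(title, year, month):
    """Parameters for all revision timestamps of title in one month."""
    start = datetime(year, month, 1)
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp",
        "rvstart": to_api_timestamp(start),
        "rvend": to_api_timestamp(end),
    }


params = month_range_params("ArticleA", 2006, 9)
```

The resulting `rvstart` and `rvend` values are the same as in the September 2006 request above.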

Examples

 * Get data with content for the last revision of titles "API" and "Main Page":
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API|Main%20Page&rvprop=timestamp|user|comment|content


 * Get last 5 revisions of the "Main Page":
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment


 * Get first 5 revisions of the "Main Page":
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment&rvdir=newer


 * Get first 5 revisions of the "Main Page" made after 2006-05-01:
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment&rvdir=newer&rvstart=20060501000000

Query - Lists
Lists differ from other properties in two respects. First, instead of appending data to the elements under the 'pages' element, each list has its own separate branch under the 'query' element. Second, list output is limited by the number of items and may be continued using a "paging" technique. Even when no limit is provided, the query will return only a set number of items, and will also provide a string marking the point from which to continue paging. See the allpages list for an example.

allpages / ap (done)

 * Returns a list of pages in a given namespace starting at apfrom, ordered by page title.
 * Parameters: apfrom (paging), apnamespace (dflt=0), apredirect (flt), aplimit (dflt=10, max=500/5000)


 * Example: Request a list of 3 pages from namespace 10 (templates) beginning at the first available page.

Request:
 api.php ? action=query & list=allpages & apnamespace=10 & aplimit=3
Result:
 api:
   query:
     allpages:
       Template:A-Article:
         id: 12341
         ns: 10
       Template:B-Article:
         id: 12342
         ns: 10
       Template:C-Article:
         id: 12343
         ns: 10
   query-status:
     allpages:
       continue: apfrom=D-Article   The next item in this list would have been Template:D-Article.
 * The client may now make another request using the continue value as a parameter:

api.php ? action=query & list=allpages & apnamespace=10 & aplimit=3 & apfrom=D-Article
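The paging technique above amounts to a loop: request a batch, read the continue value, merge it into the parameters, and repeat until no continue value is returned. The following sketch shows that loop; it assumes the response is decoded into nested dictionaries with 'query-status' as a sibling of 'query', and `fetch` is a hypothetical caller-supplied function that performs the request.

```python
def iterate_list(fetch, params, list_name):
    """Yield (title, data) pairs from a list query, following continues.

    fetch is any callable that takes a parameter dict and returns the
    decoded result (hypothetical; supply your own transport).
    """
    params = dict(params)  # do not modify the caller's dict
    while True:
        result = fetch(params)
        yield from result["api"]["query"][list_name].items()
        status = result["api"].get("query-status", {})
        cont = status.get(list_name, {}).get("continue")
        if not cont:
            break
        # A continue value looks like "apfrom=D-Article".
        key, _, value = cont.partition("=")
        params[key] = value
```

Because the transport is passed in, the same loop works with any list whose continue value has the `name=value` form shown above.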

backlinks / bl (done without redirects)

 * Lists pages that link to the given page. Ordered by linking page title.
 * Parameters: bltitle, blfrom (paging), blnamespace (flt), blredirect (flt), bllimit (dflt=10, max=500/5000)

api.php ? action=query & list=backlinks & bltitle=ArticleA

categorymembers / cm (done)

 * List of pages that belong to a given category, ordered by page sort title.
 * Parameters: cmcategory, cmprop (properties), cmcontinue (paging), cmnamespace (flt), cmlimit (dflt=10, max=500/5000)

api.php ? action=query & list=categorymembers & cmcategory=CatName

embeddedin / ei (done without redirects)

 * Lists the pages that include (transclude) the given page, for example template:title, as a template. Ordered by including page title.
 * Parameters: eititle, eifrom (paging), einamespace (flt), eiredirect (flt), eilimit (dflt=10, max=500/5000)

api.php ? action=query & list=embeddedin & eititle=template:title

extlinksusage / eu

 * What pages contain a given URL (or its part)
 * Parameters: euurl, eufrom (paging), eunamespace (flt), eulimit (dflt=10, max=500/5000)
 * euurl must begin with one of the supported protocols (http, https, mailto, ...). The server name may begin with a '*.' in front of the server name, and the path may end with another '*'. See Special:LinkSearch for similar functionality.
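A client may want to check a euurl value before sending it. The sketch below is deliberately strict and rests on assumptions: the protocol list contains only the three protocols named above (the full list is not given here), and wildcards are accepted only in the two positions described. The function name is hypothetical.

```python
# Assumption: only the protocols named in the text; the real list is longer.
SUPPORTED_PROTOCOLS = ("http://", "https://", "mailto:")


def is_plausible_euurl(pattern):
    """Return True if pattern has the shape described above."""
    for protocol in SUPPORTED_PROTOCOLS:
        if pattern.startswith(protocol):
            rest = pattern[len(protocol):]
            break
    else:
        return False  # no supported protocol prefix
    server, _, path = rest.partition("/")
    # In the server part, '*' is accepted only as a leading '*.'.
    if "*" in server and not (server.startswith("*.") and "*" not in server[2:]):
        return False
    # In the path, '*' is accepted only as the final character.
    if "*" in path[:-1]:
        return False
    # Something must remain of the server name besides the wildcard.
    return bool(server.replace("*.", "", 1))
```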

imageusage / iu (done)
''This was renamed from imagelinks in query.php, and from imgebeddedin in the earlier API version to avoid confusion. images will now be used to get all images used on a given page''.
 * List of pages that include a given image. Ordered by page title.
 * Parameters: ietitle (if image title is in NS 0, treats it as an image NS), iefrom (paging), ienamespace (flt), ielimit (dflt=10, max=500/5000)

api.php ? action=query & list=imageusage & ietitle=image:title

logevents / le (semi-complete)

 * List log events, filtered by time range, event type, user type, or the page it applies to. Ordered by event timestamp.
 * Parameters: letype (flt), lefrom (paging timestamp), leto (flt), ledirection (dflt=older), leuser (flt), letitle (flt), lelimit (dflt=10, max=500/5000)

api.php ? action=query & list=logevents     - List last 10 events of any type

recentchanges / rc (done)

 * Gets a list of pages recently changed, ordered by modification timestamp.
 * Parameters: rcfrom (paging timestamp), rcto (flt), rcnamespace (flt), rcminor (flt), rcusertype (dflt=not|bot), rcdirection (dflt=older), rclimit (dflt=10, max=500/5000)

api.php ? action=query & list=recentchanges - List last 10 changes

usercontribs / uc (semi-complete, needs parameter revision)

 * Gets a list of pages modified by a given user, ordered by modification time.
 * Parameters: ucuser, ucfrom (paging timestamp), ucto (flt), ucnamespace (flt), ucminor (flt), uctop (flt), ucdirection (dflt=older), uclimit (dflt=10, max=500/5000)

api.php ? action=query & list=usercontribs & ucuser=UserA  - List last 10 changes made by userA

users / us

 * Gets a list of registered users, ordered by user name.
 * Parameters: usfrom (paging), uslimit (dflt=10, max=500/5000)

watchlist / wl (done)

 * Get a list of pages on the user's watchlist but only if they were changed within the given time period. Ordered by time of the last change of the watched page.
 * Parameters: wlfrom (paging timestamp), wlto (flt), wlnamespace (flt), wldirection (dflt=older), wllimit (dflt=10, max=500/5000)

Query - Generators (done)
A generator is a way to use the output of one of the above lists instead of the titles= parameter. The output of the list must be a list of pages, whose titles are automatically used in place of the titles=/revids=/pageids= parameters. Other queries such as content, revisions, etc., will treat those pages as if they had been provided by the user in the titles= parameter. Only one generator is allowed, and while it is possible to have both generator= and list= parameters in the same call, they may not contain the same values.

Using allpages as generator
Use the allpages list as a generator, to get the links and categories for all titles returned by allpages.

Request:
 api.php ? action=query & generator=allpages & apnamespace=10 & aplimit=10 & apfrom=A & prop=links|categories
Result:
 api:
   query:
     pages:
       Template:A-Article:
         id: 12341
         ns: 10
         links:
           Linked Article1:           Linked Article1 is in the main namespace
           Talk:Linked Article2:      For non-main ns, list it as a sub-element
             ns: 1
           ...
         categories:
           Category:Cat1:
           Category:Cat2:
           ...
       Template:B-Article:
         ...
       Template:C-Article:
         ...
   query-status:
     allpages:
       continue: apfrom=D-Article     The next item in this list would have been Template:D-Article.

Generators and redirects
Here, we use the "links" page property as a generator. This query will get all the links from all the pages that are linked from Title. For this example, assume that Title has links to TitleA and TitleB. TitleB is a redirect to TitleC. TitleA links to TitleA1, TitleA2, and TitleA3; TitleC links to TitleC1 and TitleC2. The redirect is resolved because of the "redirects" parameter.


 * The query will execute the following steps:
 * Resolve titles parameter for redirects
 * For all pages specified in the titles=...|... parameter, get all links, and replace the original titles=...|... parameter with the resulting list of linked titles.
 * Resolve new titles list for redirects
 * Execute regular prop=links query using the internally created list of titles.

Request:
 api.php ? action=query & generator=links & titles=Title & prop=links & redirects
Result:
 api:
   query:
     pages:
       TitleA:
         links:
           TitleA1:
           TitleA2:
           TitleA3:
       TitleC:
         links:
           TitleC1:
           TitleC2:
     redirects:
       TitleB: TitleC
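The four steps listed above can be simulated with in-memory tables to show how the result comes about. This is only an illustrative sketch: the function name and the `links` and `redirects` dictionaries are assumptions, and it says nothing about how the server stores or queries this data.

```python
def links_via_generator(titles, links, redirects):
    """Simulate generator=links with prop=links and redirects."""
    followed = {}

    def resolve(title):
        # Follow redirects; return None for a circular chain.
        seen = set()
        while title in redirects:
            if title in seen:
                return None
            seen.add(title)
            followed[title] = redirects[title]
            title = redirects[title]
        return title

    def unique(items):
        # Drop None and duplicates while keeping the original order.
        return list(dict.fromkeys(i for i in items if i is not None))

    # Step 1: resolve the titles parameter for redirects.
    start = unique(resolve(t) for t in titles)
    # Step 2: replace the titles with everything they link to.
    generated = unique(target for t in start for target in links.get(t, []))
    # Step 3: resolve the new titles list for redirects.
    final = unique(resolve(t) for t in generated)
    # Step 4: run the regular prop=links query on the final titles.
    pages = {t: list(links.get(t, [])) for t in final}
    return {"pages": pages, "redirects": followed}


links = {
    "Title": ["TitleA", "TitleB"],
    "TitleA": ["TitleA1", "TitleA2", "TitleA3"],
    "TitleC": ["TitleC1", "TitleC2"],
}
result = links_via_generator(["Title"], links, {"TitleB": "TitleC"})
```

With the example data, `result` has TitleA and TitleC under 'pages' and a single TitleB to TitleC entry under 'redirects', the same as the Result above.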

Examples

 * Show info about 4 pages starting at the letter "T"
 * http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=4&gapfrom=T&prop=info


 * Show content of the first 2 non-redirect pages beginning at "Re"
 * http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=2&gapfilterredir=nonredirects&gapfrom=Re&prop=revisions&rvprop=content
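In the example URLs above, the parameters of a list used as a generator carry an extra "g" in front of the usual prefix (gaplimit, gapfrom). The sketch below builds such a parameter set and also checks the rule that generator= and list= may not name the same list; the function name is hypothetical and the "g" convention is inferred only from those example URLs.

```python
def as_generator(list_name, prefix, list_params, lists=()):
    """Build query parameters that use list_name as a generator.

    prefix is the list's short name (for example "ap" for allpages) and
    list_params maps unprefixed names such as "limit" to their values.
    """
    if list_name in lists:
        raise ValueError("generator= and list= may not contain the same value")
    params = {"action": "query", "generator": list_name}
    if lists:
        params["list"] = "|".join(lists)
    for name, value in list_params.items():
        params["g" + prefix + name] = str(value)
    return params


params = as_generator("allpages", "ap", {"limit": 4, "from": "T"})
params["prop"] = "info"
```

This produces the same parameters as the first example URL above: generator=allpages, gaplimit=4, gapfrom=T, and prop=info.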