API:Main page


This page describes the ongoing efforts to provide an external API to the MediaWiki servers.

MediaWiki at present has four interfaces:
 * MediaWiki API - the new, partially implemented API described on this page.
 * Query API - older API for retrieving data (will be obsolete upon new API completion).
 * Special:Export feature (bulk export of xml formatted data)
 * Regular Web-based interface

The goal of this API is to provide direct, high-level access to the data contained in the MediaWiki databases. Client programs should be able to use the API to login, get data, and post changes. The API must support thin web-based JavaScript clients, such as Navigation popups or LiveRC, end-user applications (such as vandal fighter), or be accessed by another web site (tool server's utilities).

All output will be available in a structured tree format such as XML, JSON, YAML, WDDX, or PHP serialized. A strongly typed RSS or WSDL-style format might also be implemented using wrappers.

Each API module uses a set of parameters. To prevent name collisions, each module has a two-letter abbreviation, and each parameter name begins with those two letters. For example, action=login uses the prefix lg for all of its parameters: "lgname" and "lgpassword".
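The prefixing rule above can be sketched in a few lines of Python. The helper name prefix_params is hypothetical and only illustrates how a client might build prefixed parameter names; it is not part of the API itself.

```python
def prefix_params(prefix: str, params: dict) -> dict:
    """Return a copy of params with every key prefixed by the module abbreviation."""
    return {f"{prefix}{name}": value for name, value in params.items()}


# Build the parameters for action=login, whose abbreviation is "lg".
login_params = {
    "action": "login",
    **prefix_params("lg", {"name": "Yurik", "password": "12345"}),
}
print(login_params)
```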

Using API internally by other code (done)
Sometimes other parts of the code may wish to use the data access and aggregation functionality of the API. Here are the steps needed to accomplish such usage:

1) Prepare request parameters using FauxRequest class. All parameters are the same as if making the request over the web.

2) Create and execute ApiMain instance. Because the parameter is an instance of a FauxRequest object, ApiMain will not execute any formatting printers, nor will it handle any errors. A parameter error or any other internal error will cause an exception that may be caught in the calling code.

3) Get the resulting data array.

Login / lg (done)
Login returns several tokens that the server needs in order to recognize a logged-in user. In every call to api.php, the three values (lgtoken, lgusername, and lguserid) must be passed either as additional parameters or as cookies within the request header. If any of the login values are given as part of the request, all cookie values are ignored. Please note that the user name is passed in as lgname, but returned in normalized form as lgusername. The first is used for authentication, whereas the second may be passed together with lgtoken and lguserid as tokens when making calls to other modules.

Note: In this and other examples, all parameters are passed in a GET request just for the sake of simplicity. In your application, make sure all large and/or security sensitive parameters are given as part of the POST request.

Request:
 api.php ? action=login & lgname=Yurik & lgpassword=12345 [& lgdomain=wikipedia.org]
Result:
 api:
   login:
     result: Success        Other values: NoName, Illegal, WrongPluginPass, NotExists, WrongPass, EmptyPass
     lgtoken: 123ABC        Also returned as a cookie (e.g. enwikiToken)
     lgusername: Yurik      Normalized lgname, also returned as a cookie (e.g. enwikiUserName)
     lguserid: 12345        Also returned as a cookie (e.g. enwikiUserID)

To use the above values, pass them without alteration to any api.php call in addition to the other parameters. Here, a rollback token is acquired for the Main Page (a restricted operation):
 api.php ? action=query & lgtoken=123ABC & lgusername=Yurik & lguserid=12345 & prop=info & intokens=rollback & titles=Main Page
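As a rough client-side sketch of the flow described above: the login credentials are encoded as a POST body, and the three returned values are then copied unchanged into a later call. The function names and the login_result dictionary are illustrative assumptions; no request is actually sent, and the real response format depends on the output format chosen.

```python
from urllib.parse import urlencode

# The three values every later call must carry (as parameters or cookies).
LOGIN_KEYS = ("lgtoken", "lgusername", "lguserid")


def login_post_body(name: str, password: str) -> bytes:
    """Encode the login parameters as a POST body rather than a GET query."""
    fields = {"action": "login", "lgname": name, "lgpassword": password}
    return urlencode(fields).encode("ascii")


def with_login(params: dict, login_result: dict) -> dict:
    """Copy the login values, unaltered, into another call's parameters."""
    merged = dict(params)
    for key in LOGIN_KEYS:
        merged[key] = login_result[key]
    return merged


# Hypothetical values, mirroring the sample result above.
login_result = {"result": "Success", "lgtoken": "123ABC",
                "lgusername": "Yurik", "lguserid": "12345"}

query = with_login(
    {"action": "query", "prop": "info", "intokens": "rollback", "titles": "Main Page"},
    login_result,
)
url = "api.php?" + urlencode(query)
print(url)
```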


 * Example
 * http://en.wikipedia.org/w/api.php?action=login&lgname=user&lgpassword=password


 * Important
 * For security reasons, a throttle has been implemented for this method. A failed log-in attempt will require that you either authenticate through the standard Special:Userlogin or wait 60 seconds before you can attempt to log-in through this module again. This throttle is only enabled on servers that support memcaching.

OpenSearch support (done)
This module gives web browsers (Firefox 2.0 at this time) auto-suggest functionality in the search box. The module needs to be extremely fast and provide simple JSON-formatted output.

Since the server might be hit on every user keystroke, the load may become heavy enough to require moving this feature to separate server(s).

WatchList RSS/ATOM feeds (done)
This module returns watchlist data in a feed format. The potential performance impact is still being evaluated.

Overview
The query module allows applications to get the pieces of data they need from the MediaWiki databases, and is loosely based on the Query API interface currently available on all MediaWiki servers. All data modifications will first have to use query to acquire a token, to prevent abuse from malicious sites.

Title Normalization (done)

 * Converts improper page titles to their proper form. Capitalizes first character, replaces '_' with ' ', changes canonical namespace names to their localized alternatives, etc.

Request: (Note: articleA's first letter is not capitalized)
 api.php ? action=query & titles=Project:articleA|ArticleB
Result:
 api:
   query:
     pages:
       Wikipedia:ArticleA:         Project: is converted to Wikipedia: when running on en-wiki.
         ns: 4                     Show title's namespace except when ns=0
       ArticleB:
     normalized:                   Any requested titles not in the "proper" form will be here
       Project:articleA: Wikipedia:ArticleA
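The normalization rules can be approximated on the client side as follows. This is a simplified sketch, not the server's algorithm: the NAMESPACE_ALIASES table is a hypothetical stand-in for the wiki's real namespace configuration, and any text before the first colon is treated as a namespace.

```python
# Hypothetical mapping of canonical namespace names to localized ones (en-wiki).
NAMESPACE_ALIASES = {"project": "Wikipedia"}


def normalize_title(title: str) -> str:
    """Replace '_' with ' ', localize the namespace, capitalize the first letter."""
    title = title.replace("_", " ").strip()
    namespace, sep, name = title.partition(":")
    if not sep:
        namespace, name = "", title
    else:
        namespace = NAMESPACE_ALIASES.get(namespace.strip().lower(), namespace.strip())
        name = name.strip()
    name = name[:1].upper() + name[1:]
    return f"{namespace}:{name}" if namespace else name


def normalize_all(titles):
    """Return the normalized titles plus a 'normalized' map of changed titles."""
    pages, normalized = [], {}
    for title in titles:
        proper = normalize_title(title)
        pages.append(proper)
        if proper != title:
            normalized[title] = proper
    return pages, normalized


print(normalize_all(["Project:articleA", "ArticleB"]))
```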


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&titles=Project:articleA|ArticleB

Redirects (done)

 * Redirects can be resolved by the server, so that the target of the redirect is returned instead of the given title. This example is not very useful without an additional prop=... element, but it shows the usage of the redirect function. The 'redirects' section will contain the target of the redirect and its non-zero namespace code. Both normalization and redirection may take place. In the case of a redirect to a redirect, all redirections will be resolved, and in the case of a circular redirection, there might not be a page in the 'pages' section.

Request:
 api.php ? action=query & titles=Main page & redirects
Result:
 api:
   query:
     pages:
       Main Page:
     redirects:
       Main page: Main Page
 * Same request without the "redirects" parameter would treat "Main page" as a regular page, so revisions and other information may be obtained. In order to see that it is a redirect, the basic page info must be requested using prop=info.

Request:
 api.php ? action=query & titles=Main page & prop=info
Result:
 api:
   query:
     pages:
       Main page:
         id: 12342
         redirect:


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&titles=Main%20page&redirects
 * http://en.wikipedia.org/w/api.php?action=query&titles=Main%20page

Circular Redirects (done)

 * Assume Page1 → Page2 → Page3 → Page1 (circular redirect). Also, in this example a non-normalized name 'page1' is used.

Request:
 api.php ? action=query & titles=page1 & redirects
Result:
 api:
   query:
     redirects:
       Page1: Page2     Redirects are present, but not the 'pages' element.
       Page2: Page3
       Page3: Page1
     normalized:
       page1: Page1
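One way to picture the behaviour above is a loop that follows redirects and stops when it revisits a title. The function below is an illustrative sketch over an in-memory redirect table, not the server's actual implementation; it returns no final page when the chain is circular, matching the missing 'pages' element.

```python
def resolve_redirects(title, redirect_table):
    """Follow redirects from title.

    Returns (final_title, followed), where followed maps each redirect to its
    target. final_title is None when the chain is circular.
    """
    followed = {}
    current = title
    while current in redirect_table:
        if current in followed:
            return None, followed  # already visited: circular redirect
        target = redirect_table[current]
        followed[current] = target
        current = target
    return current, followed


circular = {"Page1": "Page2", "Page2": "Page3", "Page3": "Page1"}
print(resolve_redirects("Page1", circular))
print(resolve_redirects("Main page", {"Main page": "Main Page"}))
```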

Limits

 * To prevent server overloads, each query imposes a limit on how many items it can process. Anonymous and logged-in users share one limit, while bots and sysops have a considerably higher limit, as they are trusted by the community. At present, each query simply lists the maximum request size it allows. For example, the allpages list allows aplimit= to be set no higher than 500, or no higher than 5000 for a bot or a sysop.
 * Drawbacks: Currently all limits are additive, so a user who requests both allpages and backlinks gets 500 of each. This is not ideal, because the more items are combined into one request, the heavier the load on the server. Instead, some sort of weighted mechanism could be developed, where each request item has a certain "cost" associated with it, and each user is allocated a fixed allowance per request. The more information the user requests, the lower the limit becomes for that request. Unfortunately, that makes it very hard to determine the maximum limits before executing the query, so it might not be a workable solution.
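The current per-group ceilings can be sketched as a simple check. The function name and the group names "bot" and "sysop" as strings are assumptions for illustration, and whether the server caps an oversized limit or rejects the request is not specified here; this sketch caps it.

```python
TRUSTED_GROUPS = {"bot", "sysop"}


def effective_limit(requested: int, user_groups, normal_max: int = 500,
                    trusted_max: int = 5000) -> int:
    """Cap a requested limit at the ceiling for the user's groups."""
    if requested < 1:
        raise ValueError("limit must be at least 1")
    ceiling = trusted_max if TRUSTED_GROUPS & set(user_groups) else normal_max
    return min(requested, ceiling)


print(effective_limit(1000, []))          # regular user
print(effective_limit(1000, ["bot"]))     # trusted user
```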

Query - Meta-Information
Meta queries allow clients to retrieve data about the MediaWiki installation itself, such as site settings and the current user.

To get meta information, clients will use meta= parameter: api.php ? action=query & meta=siteinfo|userinfo & ...

siteinfo / si (done)

 * Returns overall site information.
 * Parameters: siprop=general|namespaces|interwikimap, sifilteriw=local|!local


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo


 * Future
 * * Possible future addition: include the license that's being selected for the wiki's content (e.g. GFDL, CC-SA, no license specified, etc.).

userinfo / ui

 * Returns information about the current user. This will be implemented similar to the method used by query.php ([/w/query.php?what=userinfo&uiisblocked&uihasmsg&uiextended query.php example]).
 * Parameters: uiprop=isblocked|hasmsg|rights|groups, uioptions= |...


 * Example
 * http://en.wikipedia.org/w/api.php?action=query&meta=userinfo

Query - Page Information
Page information items are used to get various data about a list of pages specified with either the titles=, pageids=, or revids= parameters, or by using a generator. Content, links, interwiki links, and other information may be obtained.

info / in (done except tokens)

 * Gets the basic page information such as pageid, last revid, redirect, last touched, etc. Limit: 500/5000.
 * Parameters: intokens=edit|rollback|delete|protect|move
 * Issues: Should tokens for rollback/delete/protect/move be available in this way, as opposed to having an action= for each task? There is a potential for abuse: someone might place a link to the wiki on their website, and that link could contain a "delete" action. If a logged-in admin clicks on that link, the API will recognize them because of their cookie and will allow the deletion.

Request:
 api.php ? action=query & prop=info & titles=TitleA
Result:
 api:
   query:
     pages:
       TitleA:
         id: 12341
         lastrev: 23456
         touched: 20060908025739

categories / cl (done)

 * Gets a list of all categories used on the provided pages.
 * Parameters: clprop=sortkey (optional)

Request:
 api.php ? action=query & prop=categories & titles=TitleA
Result:
 api:
   query:
     pages:
       TitleA:
         categories:
           Category:Cat1:
           Category:Cat2:

Content (done)

 * Requesting content should be done by requesting the last revision with content property.

api.php ? action=query & prop=revisions & rvprop=content & titles=ArticleA|ArticleB
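A client might assemble such a request as shown below. The query_url helper and the relative base path are illustrative assumptions; the only API-specific detail used is that multiple values are joined with '|'.

```python
from urllib.parse import urlencode


def query_url(base: str, props, titles, **extra) -> str:
    """Build an action=query URL, joining multiple values with '|'."""
    params = {
        "action": "query",
        "prop": "|".join(props),
        "titles": "|".join(titles),
        **extra,
    }
    # Keep '|' readable instead of percent-encoding it.
    return base + "?" + urlencode(params, safe="|")


content_url = query_url("api.php", ["revisions"], ["ArticleA", "ArticleB"], rvprop="content")
print(content_url)
```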

imageinfo / ii

 * Gets image information for any titles in the image namespace (#6).
 * Parameters: iiprop=url|history|comment|stats|user|timestamp, iisource=local/shared/all (dflt=local)
 * url - path to the image, history - include all old versions of the image, stats - image size/type, user - uploader, iisource - look at the local or shared (commons) image repository, or both.
 * Example: Get comments for all image uploads, both local and in the commons repository. Here, ImageA was uploaded 3 times to the local wiki, and 2 times to the shared (commons) repository.

Request:
 api.php ? action=query & prop=imageinfo & titles=Image:ImageA & iiprop=comment|history & iisource=all
Result:
 api:
   query:
     pages:
       Image:ImageA:
         ns: 6
         imageinfo:
           local:
             comment: last update comment
           localhistory:
             -                             history is an unordered list of items
               comment: some update
             -
               comment: another update
           shared:
             comment: last update on commons
           sharedhistory:
             -
               comment: some update on commons

langlinks / ll (done)

 * Gets a list of all language links (interwikis) from the provided pages to other languages. Limit: 200/1000.

links / pl (done)

 * Gets a list of all links from the provided pages. Limit: 200/1000.
 * Parameters: plnamespace (flt).

templates / tl (done)

 * Gets a list of all templates used on the provided pages. Limit: 200/1000.

images / im (done)
''In the Query API interface, this command found pages that embedded the given image. It has been renamed to imageusage.''
 * Gets a list of all images used on the provided pages. Limit: 200/1000.

extlinks / el (done)

 * Gets a list of all external urls on the page(s). Limit: 200/1000.

Query - Revisions (done)
Returns revisions for a given article based on the selection criteria. Revisions may be used with multiple titles only when working with the latest revision. When using rvlimit, rvdir=newer, rvstart, or rvend parameters, titles= must have only one title listed. By default, revisions shows only the id of the last revision.

Request:
 api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp|user|comment|content
Result:
 api:
   query:
     pages:
       ArticleA:
         id: 12345
         lastrev: 67890
         revisions:
           67890:
             timestamp: 20060908025739
             user: UserX
             comment: ...change comment...
             content: ...raw revision content...


 * Additional 'revisions' samples
 * Get the timestamps of up to 10 revisions, beginning at 2006-09-01 and moving forward in time.

api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp & rvlimit=10 & rvdir=newer & rvstart=20060901000000
 * Get the timestamps of all revisions for the entire month of September 2006. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain  'continue':'rvstart=20060920122343'  with the timestamp to continue from.

api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp & rvstart=20060901000000 & rvend=20061001000000
 * Get the timestamps of up to 10 revisions, beginning at 12345 and moving back in time. If more than 10 revisions are available, 'revisions' element will contain  'continue':'revids=23512' , where revid is the next revision id in order.

api.php ? action=query & prop=revisions & revids=12345 & rvprop=timestamp & rvlimit=10 & rvdir=older
 * Get the timestamps of all revisions between two given revision IDs. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain  'continue':'rvstartid=23512'  with the revid to continue from. Both rvstartid & rvendid must belong to the same title. The titles= parameter is not required, but if given, it must be set to the same title as revision IDs.

api.php ? action=query & prop=revisions & rvprop=timestamp & rvstartid=12345 & rvendid=67890
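The timestamp-based samples above use 14-digit values of the form YYYYMMDDHHMMSS. A small sketch of producing and parsing such values on the client side follows; the helper names are hypothetical, and time zone handling is left out.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # e.g. 20060908025739


def to_api_timestamp(moment: datetime) -> str:
    """Format a datetime as the 14-digit value used by rvstart and rvend."""
    return moment.strftime(TIMESTAMP_FORMAT)


def from_api_timestamp(value: str) -> datetime:
    """Parse a 14-digit timestamp returned by the API."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def month_range(year: int, month: int):
    """Return (rvstart, rvend) covering one calendar month."""
    start = datetime(year, month, 1)
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return to_api_timestamp(start), to_api_timestamp(end)


rvstart, rvend = month_range(2006, 9)
print(rvstart, rvend)
```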

Examples

 * Get data with content for the last revision of titles "API" and "Main Page":
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API|Main%20Page&rvprop=timestamp|user|comment|content


 * Get last 5 revisions of the "Main Page":
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment


 * Get first 5 revisions of the "Main Page":
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment&rvdir=newer


 * Get first 5 revisions of the "Main Page" made after 2006-05-01:
 * http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment&rvdir=newer&rvstart=20060501000000

Query - Lists
Lists differ from other properties in two respects. First, instead of appending data to the elements under the 'pages' element, each list has its own separate branch under the 'query' element. Second, list output is limited by the number of items, and may be continued using a "paging" technique. Even when no limit is provided, the query will only return a set number of items, and will also provide a string marking the point from which to continue paging. See the allpages list for an example.

allpages / ap (done)

 * Returns a list of pages in a given namespace starting at from, ordered by page title.
 * Parameters: apfrom (paging), apnamespace (dflt=0), apredirect (flt), aplimit (dflt=10, max=500/5000)


 * Example: Request a list of 3 pages from namespace 10 (templates) beginning at the first available page.

Request:
 api.php ? action=query & list=allpages & apnamespace=10 & aplimit=3
Result:
 api:
   query:
     allpages:
       Template:A-Article:
         id: 12341
         ns: 10
       Template:B-Article:
         id: 12342
         ns: 10
       Template:C-Article:
         id: 12343
         ns: 10
   query-status:
     allpages:
       continue: apfrom=D-Article   The next item in this list would have been Template:D-Article.
 * The client may now make another request using the continue value as a parameter:

api.php ? action=query & list=allpages & apnamespace=10 & aplimit=3 & apfrom=D-Article
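The paging technique can be sketched as a loop that feeds each continue value back into the next request. In this sketch fetch_page is a local stand-in for the real HTTP call and works on an in-memory list of titles; only the shape of the continue value (name=value) is taken from the example above.

```python
def fetch_page(params, all_titles):
    """Stand-in for an api.php call: return one batch and a continue value."""
    start = params.get("apfrom", "")
    limit = int(params.get("aplimit", 10))
    remaining = [title for title in sorted(all_titles) if title >= start]
    batch, rest = remaining[:limit], remaining[limit:]
    continue_value = f"apfrom={rest[0]}" if rest else None
    return batch, continue_value


def iterate_allpages(base_params, all_titles):
    """Yield every title, following continue values until none is returned."""
    params = dict(base_params)
    while True:
        batch, continue_value = fetch_page(params, all_titles)
        yield from batch
        if continue_value is None:
            break
        name, _, value = continue_value.partition("=")
        params[name] = value  # e.g. apfrom=D-Article


titles = ["A-Article", "B-Article", "C-Article", "D-Article", "E-Article"]
print(list(iterate_allpages({"aplimit": "3"}, titles)))
```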

backlinks / bl (done without redirects)

 * Lists pages that link to the given page. Ordered by linking page title.
 * Parameters: bltitle, blfrom (paging), blnamespace (flt), blredirect (flt), bllimit (dflt=10, max=500/5000)

api.php ? action=query & list=backlinks & bltitle=ArticleA

categorymembers / cm

 * List of pages that belong to a given category, ordered by page title.
 * Parameters: cmtitle (if title is in NS 0, treats it as category NS), cmfrom (paging), cmnamespace (flt), cmlimit (dflt=10, max=500/5000)

api.php ? action=query & list=categorymembers & cmtitle=category:title

embeddedin / ei (done without redirects)

 * Lists pages that include the given page (for example, template:title) as a template. Ordered by including page title.
 * Parameters: eititle, eifrom (paging), einamespace (flt), eiredirect (flt), eilimit (dflt=10, max=500/5000)

api.php ? action=query & list=embeddedin & eititle=template:title

extlinksusage / eu

 * What pages contain a given URL (or its part)
 * Parameters: euurl, eufrom (paging), eunamespace (flt), eulimit (dflt=10, max=500/5000)
 * euurl must begin with one of the supported protocols (http, https, mailto, ...). The server name may begin with a '*.' in front of the server name, and the path may end with another '*'. See Special:LinkSearch for similar functionality.

imageusage / iu (done)
''This was renamed from imagelinks in query.php, and from imgebeddedin in the earlier API version, to avoid confusion. images will now be used to get all images used on a given page.''
 * List of pages that include a given image. Ordered by page title.
 * Parameters: iutitle (if image title is in NS 0, treats it as an image NS), iufrom (paging), iunamespace (flt), iulimit (dflt=10, max=500/5000)

api.php ? action=query & list=imageusage & iutitle=image:title

logevents / le (semi-complete)

 * List log events, filtered by time range, event type, user type, or the page it applies to. Ordered by event timestamp.
 * Parameters: letype (flt), lefrom (paging timestamp), leto (flt), ledirection (dflt=older), leuser (flt), letitle (flt), lelimit (dflt=10, max=500/5000)

api.php ? action=query & list=logevents     - List last 10 events of any type

recentchanges / rc (done)

 * Gets a list of pages recently changed, ordered by modification timestamp.
 * Parameters: rcfrom (paging timestamp), rcto (flt), rcnamespace (flt), rcminor (flt), rcusertype (dflt=not|bot), rcdirection (dflt=older), rclimit (dflt=10, max=500/5000)

api.php ? action=query & list=recentchanges - List last 10 changes

usercontribs / uc (semi-complete, needs parameter revision)

 * Gets a list of pages modified by a given user, ordered by modification time.
 * Parameters: ucuser, ucfrom (paging timestamp), ucto (flt), ucnamespace (flt), ucminor (flt), uctop (flt), ucdirection (dflt=older), uclimit (dflt=10, max=500/5000)

api.php ? action=query & list=usercontribs & ucuser=UserA  - List last 10 changes made by userA

users / us

 * Gets a list of registered users, ordered by user name.
 * Parameters: usfrom (paging), uslimit (dflt=10, max=500/5000)

watchlist / wl (done)

 * Get a list of pages on the user's watchlist but only if they were changed within the given time period. Ordered by time of the last change of the watched page.
 * Parameters: wlfrom (paging timestamp), wlto (flt), wlnamespace (flt), wldirection (dflt=older), wllimit (dflt=10, max=500/5000)

Query - Generators (done)
A generator is a way to use one of the above lists instead of the titles= parameter. The output of the list must be a list of pages, whose titles are automatically used instead of the titles=/revids=/pageids= parameters. Other queries such as content, revisions, etc., will treat those pages as if they were provided by the user in the titles= parameter. Only one generator is allowed, and while it is possible to have both generator= and list= parameters in the same call, they may not contain the same values.

Using allpages as generator
Use the allpages list as a generator to get the links and categories for all titles returned by allpages.

Request:
 api.php ? action=query & generator=allpages & apnamespace=10 & aplimit=10 & apfrom=A & prop=links|categories
Result:
 api:
   query:
     pages:
       Template:A-Article:
         id: 12341
         ns: 10
         links:
           Linked Article1:           Linked Article1 is in the main namespace
           Talk:Linked Article2:      For non-main ns, list it as a sub-element
             ns: 1
           ...
         categories:
           Category:Cat1:
           Category:Cat2:
           ...
       Template:B-Article:
         ...
       Template:C-Article:
         ...
   query-status:
     allpages:
       continue: apfrom=D-Article     The next item in this list would have been Template:D-Article.

Generators and redirects
Here, we use the "links" page property as a generator. This query will get all the links from all the pages that are linked from Title. For this example, assume that Title has links to TitleA and TitleB. TitleB is a redirect to TitleC. TitleA links to TitleA1, TitleA2, and TitleA3; TitleC links to TitleC1 and TitleC2. The redirect is resolved because of the "redirects" parameter.


 * The query will execute the following steps:
 * Resolve titles parameter for redirects
 * For all pages specified in titles=...|... parameter, get all links, and substitute original with the new titles=...|... parameter.
 * Resolve new titles list for redirects
 * Execute regular prop=links query using the internally created list of titles.

Request:
 api.php ? action=query & generator=links & titles=Title & prop=links & redirects
Result:
 api:
   query:
     pages:
       TitleA:
         links:
           TitleA1:
           TitleA2:
           TitleA3:
       TitleC:
         links:
           TitleC1:
           TitleC2:
     redirects:
       TitleB: TitleC
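The four steps listed above can be simulated over in-memory data that mirrors this example. LINKS and REDIRECTS are hypothetical tables standing in for the database, and the function names are illustrative; this is a sketch of the described flow rather than the server's code.

```python
# Hypothetical link and redirect tables mirroring the example above.
LINKS = {
    "Title": ["TitleA", "TitleB"],
    "TitleA": ["TitleA1", "TitleA2", "TitleA3"],
    "TitleC": ["TitleC1", "TitleC2"],
}
REDIRECTS = {"TitleB": "TitleC"}


def resolve(titles):
    """Replace redirects with their targets; return (titles, redirects used)."""
    resolved, used = [], {}
    for title in titles:
        seen = set()
        while title in REDIRECTS and title not in seen:  # guard against loops
            seen.add(title)
            used[title] = REDIRECTS[title]
            title = REDIRECTS[title]
        if title not in resolved:
            resolved.append(title)
    return resolved, used


def generator_links_query(titles):
    redirects = {}
    start, used = resolve(titles)                       # step 1
    redirects.update(used)
    generated = [link for title in start for link in LINKS.get(title, [])]  # step 2
    page_titles, used = resolve(generated)              # step 3
    redirects.update(used)
    pages = {title: {"links": LINKS.get(title, [])} for title in page_titles}  # step 4
    return {"pages": pages, "redirects": redirects}


result = generator_links_query(["Title"])
print(result)
```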

Examples

 * Show info about 4 pages starting at the letter "T"
 * http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=4&gapfrom=T&prop=info


 * Show content of the first 2 non-redirect pages beginning at "Re"
 * http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=2&gapfilterredir=nonredirects&gapfrom=Re&prop=revisions&rvprop=content

Posting Data / needs major editPage.php rewrite
 Need Help At present, user interface code is tightly woven with the database access code, making it unusable for the API. These two must be separated from one another – we need a clean data access layer without any UI logic. If you want to contribute with rewriting EditPage.php, and if you know PHP and MediaWiki, or you think you can learn it, please give us a hand at making this possible. --Yurik 15:07, 17 October 2006 (UTC)

action=submit allows data to be posted back to the MediaWiki servers. For this to work, the client must first obtain an edittoken by using a prop=info & intokens=edit query call. Both the lastrev and the token have to be sent to the server, together with the title of the page, its content, and the summary comment. The disablemerge parameter stops the save operation if the article has been modified after the query call. The testrun parameter attempts the save operation by merging the content with the newer changes (if needed) and returns what the page would look like if it were saved, but without actually changing any data.

Note: The parameters should be modified to allow for the controlled merge. For example: rev #1 is received, an attempt is made to save changes to it, but rev #2 has been created in the meantime. The client decides to allow merge with rev #2, but while the decision is made, the rev #3 has been published. The client should have the option to only allow merging with rev #2 which was verified, not with rev#3 that it has not yet seen.

Request:
 api.php ? action=submit & title=Project:articleA & edittoken=abc123 & revid=12345 & summary=edit_comment & content=wikitext [& minorEdit] [& disablemerge] [& testrun]
Result:
 api:
   save:
     status: Success             Other values: 'Prohibited', 'Conflict', 'DbLocked', 'BadToken', 'MergeRequired' (for the testrun: 'CanMerge', 'CanSaveAsIs')
     title: Wikipedia:ArticleA   Always returns normalized title
     ns: 4                       Show title's namespace except when ns=0
     id: 12345                   On success, the ID of the page
     revid: 67891                On success, the new latest revision id
     redirect:                   On success, when saved page is now treated as a redirect
     content: wiki content       When used with testrun, this field will be set to the merge result
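Since page content can be large, a client would send this request as a POST. The sketch below only encodes the body from the parameters shown in the request above; the function name is hypothetical, the optional switches are assumed to be sent as empty-valued fields, and nothing is transmitted. As this module is still a proposal, the parameter names may change.

```python
from urllib.parse import urlencode


def build_submit_body(title, edittoken, revid, summary, content,
                      minor=False, disablemerge=False, testrun=False) -> bytes:
    """Encode an action=submit request as a POST body."""
    fields = [
        ("action", "submit"),
        ("title", title),
        ("edittoken", edittoken),
        ("revid", str(revid)),
        ("summary", summary),
        ("content", content),
    ]
    # Optional switches: included (with an empty value) only when enabled.
    for name, enabled in (("minorEdit", minor),
                          ("disablemerge", disablemerge),
                          ("testrun", testrun)):
        if enabled:
            fields.append((name, ""))
    return urlencode(fields).encode("utf-8")


body = build_submit_body("Project:articleA", "abc123", 12345,
                         "edit_comment", "wikitext", testrun=True)
print(body)
```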

Moving/Renaming Pages
Request:
 api.php ? action=move & mvfrom=OldTitle & mvto=NewTitle & mvtoken=123ABC [& mvoverride]

Implementation Strategy
See /Implementation Strategy.

Wikimania 2006 API discussion
See /Wikimania 2006 API discussion.

Useful Links

 * API source code in SVN
 * Proposed Database Schema Changes
 * Database layout
 * The current DB schema in SVN