API:Implementation Strategy

This explains the implementation of the MediaWiki API machinery in core. If you want to provide an API in your code for clients to consume, read .

File/Module Structure

 *   is the entry point, located in the wiki root. See API:Main page#The endpoint.
 *   will contain all files related to the API, but none of them will be allowed as entry points.
 * All API classes are derived from a common abstract class  . The base class provides common functionality such as parameter parsing, profiling, and error handling.
 *   is the main class instantiated by  . It determines which module to execute based on the   parameter.   </> also creates an instance of the <tvar|2> </> class, which contains the output data array and related helper functions.  Lastly, <tvar|1> </> instantiates the formatting class that will output the data from <tvar|2> </> in XML/JSON/PHP or other format to the client.
 * Any module derived from <tvar|1> </> will receive a reference to an instance of the <tvar|2> </> during instantiation, so that during execution the module may get shared resources such as the result object.
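
The structure above can be sketched in a few lines. This is a minimal illustration in Python rather than the core code: the class names (`Main`, `BaseModule`, `Result`, `HelloModule`), the `action` and `format` parameter names, and the module registry are all hypothetical and chosen only for this example.

```python
import json
from abc import ABC, abstractmethod


class Result:
    """Holds the shared output data that modules fill in."""

    def __init__(self):
        self.data = {}


class BaseModule(ABC):
    """Common base class; each module receives a reference to the main object."""

    def __init__(self, main):
        self.main = main

    @property
    def result(self):
        # Shared resources are reached through the main object.
        return self.main.result

    @abstractmethod
    def execute(self):
        ...


class HelloModule(BaseModule):
    """Hypothetical example module."""

    def execute(self):
        name = self.main.params.get("name", "world")
        self.result.data["hello"] = {"to": name}


class Main:
    """Picks a module from the request, runs it, and formats the result."""

    MODULES = {"hello": HelloModule}  # hypothetical registry
    FORMATTERS = {"json": lambda data: json.dumps(data, sort_keys=True)}

    def __init__(self, params):
        self.params = params
        self.result = Result()

    def run(self):
        # "action" and "format" are assumed parameter names for this sketch.
        module_cls = self.MODULES[self.params["action"]]
        module_cls(self).execute()
        formatter = self.FORMATTERS[self.params.get("format", "json")]
        return formatter(self.result.data)


output = Main({"action": "hello", "name": "wiki"}).run()
print(output)
```

Passing the main object to each module, instead of passing the result object directly, lets modules reach any shared resource without changing their constructor signature.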

Query modules

 * <tvar|1> </> behaves similarly to <tvar|2> </> in that it executes submodules. Each submodule derives from <tvar|1> </> (except <tvar|2> </> itself, which is a top-level module). During instantiation, submodules receive a reference to the ApiQuery instance.
 * All extension query modules should use a prefix of three or more letters. Core modules use two-letter prefixes.
 * <tvar|1> </> execution plan:
   * Get shared query parameters <tvar|1> </> to determine needed submodules.
   * Create an <tvar|1> </> object and populate it from the <tvar|2> </> parameters. The <tvar|1> </> object contains the list of pages or revisions that query modules will work with.
   * If requested, a generator module is executed to create another <tvar|1> </>. This is similar to piping streams in UNIX: the given pages are the input to the generator, which produces another set of pages for all other modules to work on.
 * Requirements for query continuation:
   * The SQL query must be totally ordered. In other words, the query must use all columns of some unique key, either as constants in the <tvar|1> </> clause or in the <tvar|2> </> clause.
     * In MySQL, this is an exclusive or: querying Foo and Bar must order by title but not namespace (namespace is constant 0), querying Foo and Talk:Foo must order by namespace but not title (title is constant "Foo"), and querying Foo and Talk:Bar must order by both namespace and title.
   * The SQL query must not filesort.
   * The value given to <tvar|1> </> must include all the columns in the <tvar|2> </> clause.
   * When continuing, a single compound condition should be added to the <tvar|1> </> clause. If the query has <tvar|1> </>, this condition should look something like this:

(column_0 > value_0 OR (column_0 = value_0 AND
 (column_1 > value_1 OR (column_1 = value_1 AND
  (column_2 >= value_2)
 ))
))

Of course, swap ">" for "<" if your <tvar|1> </> columns are using <tvar|2> </>. Be sure to avoid SQL injection in the values.
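
As an illustration of the compound condition, the following sketch builds it for any number of ordering columns and binds the continuation values as parameters, so they are never concatenated into the SQL text. It uses Python with an in-memory SQLite database only to keep the example self-contained; the function name `continuation_condition` and the `page` table with its `ns` and `title` columns are assumptions made for this example, not the core implementation.

```python
import sqlite3


def continuation_condition(columns, values, descending=False):
    """Return (sql, params) for the nested continuation condition.

    columns: the ordering columns, in order. They must come from trusted
    code, never from the request. values: one continuation value per
    column; they are bound as parameters to avoid SQL injection.
    """
    if not columns or len(columns) != len(values):
        raise ValueError("need one value per ordering column")
    op = "<" if descending else ">"
    # The innermost term includes the boundary row itself (>= or <=).
    sql = f"({columns[-1]} {op}= ?)"
    params = [values[-1]]
    # Wrap the condition outwards, one column at a time.
    for col, val in zip(reversed(columns[:-1]), reversed(values[:-1])):
        sql = f"({col} {op} ? OR ({col} = ? AND {sql}))"
        params = [val, val] + params
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (ns INTEGER, title TEXT, PRIMARY KEY (ns, title))")
conn.executemany(
    "INSERT INTO page VALUES (?, ?)",
    [(0, "Bar"), (0, "Foo"), (1, "Bar"), (1, "Foo")],
)

# Continue from the row (ns=0, title="Foo"), ordering by the full unique key.
where, params = continuation_condition(["ns", "title"], [0, "Foo"])
rows = conn.execute(
    f"SELECT ns, title FROM page WHERE {where} ORDER BY ns, title", params
).fetchall()
print(where)
print(rows)
```

Because the ordering covers every column of the unique key, the continuation point identifies exactly one row, and no row is skipped or repeated between batches.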

Internal data structures

 * The Query API has had a very successful design in which one global nested <tvar|1> </> structure is passed around. Various modules add pieces of data at many different points of that array until it is finally rendered for the client by one of the printers (output modules). For the API, we suggest wrapping this array in a class with helper functions to append individual leaf nodes.
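
A minimal sketch of such a wrapper, written in Python for illustration, might look like the following. The class name `ResultData` and the method names `add_value` and `get_data` are hypothetical and are not taken from the core code.

```python
class ResultData:
    """Wraps one nested dict and offers a helper to append leaf nodes."""

    def __init__(self):
        self._data = {}

    def add_value(self, path, name, value):
        """Set data[path[0]][path[1]]...[name] = value, creating parents."""
        node = self._data
        for key in path:
            node = node.setdefault(key, {})
            if not isinstance(node, dict):
                raise TypeError(f"{key!r} is a leaf, not a container")
        node[name] = value

    def get_data(self):
        """Return the nested structure for a printer (output module)."""
        return self._data


result = ResultData()
# Different modules can add data at different points of the same tree.
result.add_value(["query", "pages", "1"], "title", "Foo")
result.add_value(["query", "pages", "1"], "ns", 0)
result.add_value(["query"], "normalized", False)
print(result.get_data())
```

Keeping the raw structure behind helper methods means modules do not need to know whether intermediate nodes already exist, and the printer still receives one plain nested structure.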

Error/status reporting
For now, we have decided to include error information inside the same structured output as the normal result (option #2 below).

For the result, we may either use the standard HTTP error codes or always return properly formatted data:

 * Using HTTP code

void header( string reason_phrase [, bool replace [, int http_response_code]] )

The <tvar|1> </> can be used to set the return status of the operation. We can define all possible values of the <tvar|1> </>, so for a failed login we may return <tvar|2> </> and <tvar|3> </>, whereas for any success we would simply return the response without altering the header.

Pros: It's a standard. The client always has to deal with HTTP errors, so using an HTTP code for the result would remove any separate error handling the client would have to perform. Since the client may request data in multiple formats, an invalid format parameter would still be handled properly, as it would simply be another HTTP error code.

Cons: ...

 * Include error information inside a proper response

This method would always return a properly formatted response object, but the error status/description will be the only values inside that object. This is similar to the way the current Query API returns status codes.

Pros: HTTP error codes are used only for networking issues, not for data (logical) errors. We are not tied to the existing HTTP error codes.

Cons: If the data format parameter is not properly specified, what is the format of the output data? The application has to parse the object to learn of an error (perf?). Error-checking code will have to exist at both the connection level and the data-parsing level.
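
The second option can be sketched as follows, in Python for illustration. Everything named here is hypothetical: the `ApiError` class, the error codes, the `handle_request` function, and the `login` action. Falling back to a default format when the requested one is unknown is only one possible answer to the question raised in the cons, shown here as an assumption rather than a decision recorded in this document.

```python
import json

FORMATTERS = {"json": json.dumps}  # hypothetical: a single format for brevity
DEFAULT_FORMAT = "json"


class ApiError(Exception):
    """A logical (data-level) error, reported inside the response body."""

    def __init__(self, code, info):
        super().__init__(info)
        self.code = code
        self.info = info


def handle_request(params, action):
    """Run the action and always return a properly formatted response."""
    requested = params.get("format", DEFAULT_FORMAT)
    fmt = requested if requested in FORMATTERS else DEFAULT_FORMAT
    try:
        if requested not in FORMATTERS:
            raise ApiError("unknown_format", f"Unrecognized format: {requested}")
        body = action(params)
    except ApiError as err:
        # The error is the only content of the response object.
        body = {"error": {"code": err.code, "info": err.info}}
    return FORMATTERS[fmt](body)


def login(params):
    """Hypothetical action used to exercise the error path."""
    if params.get("password") != "secret":
        raise ApiError("bad_password", "Incorrect password")
    return {"login": {"result": "Success"}}


ok = json.loads(handle_request({"password": "secret"}, login))
failed = json.loads(handle_request({"password": "wrong"}, login))
bad_format = json.loads(handle_request({"format": "yaml"}, login))
print(ok, failed, bad_format)
```

A client of this sketch checks for an `error` key after parsing, which is the extra data-level check mentioned in the cons above.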