From mediawiki.org
This page is a translated version of the page API:Implementation Strategy and the translation is 20% complete.

This page explains how the MediaWiki API machinery is implemented in core.



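  • api.php is the entry point, located in the wiki root. See API:Main page#The endpoint.
  • includes/api will contain all files related to the API, but none of them will be allowed as entry points.
  • All API classes derive from a common abstract class, ApiBase. The base class provides common functionality such as parameter parsing, profiling, and error handling.
  • ApiMain is the main class, instantiated by api.php. It determines which module to execute based on the action=XXX parameter. ApiMain also creates an instance of the ApiResult class, which contains the output data array and related helper functions. Finally, ApiMain instantiates the formatting class that outputs the data from ApiResult to the client in XML/JSON/PHP or another format.
  • Any module derived from ApiBase receives a reference to the ApiMain instance during instantiation, so that during execution the module can access shared resources such as the result object.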
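
The dispatch flow described above can be sketched in a few lines. This is a minimal illustration, not the MediaWiki code: SketchModule, SketchEcho and SketchMain are hypothetical stand-ins for ApiBase, a concrete module and ApiMain, and json_encode() stands in for the formatter class.

```php
<?php
// Hypothetical stand-ins for ApiBase/ApiMain; these are not the MediaWiki classes.
abstract class SketchModule {
	protected $main;

	public function __construct( SketchMain $main ) {
		// Every module keeps a reference to the main object it was created by.
		$this->main = $main;
	}

	abstract public function execute( array $params );
}

class SketchEcho extends SketchModule {
	public function execute( array $params ) {
		// Write into the result array shared through the main object.
		$this->main->result['echo'] = isset( $params['text'] ) ? $params['text'] : '';
	}
}

class SketchMain {
	public $result = array();
	private $modules = array( 'echo' => 'SketchEcho' );

	public function run( array $params ) {
		// Pick the module from the action parameter.
		$action = isset( $params['action'] ) ? $params['action'] : '';
		if ( !isset( $this->modules[$action] ) ) {
			$this->result['error'] = array( 'code' => 'unknown_action' );
		} else {
			$class = $this->modules[$action];
			$module = new $class( $this );
			$module->execute( $params );
		}
		// Stand-in for the formatter (printer) step.
		return json_encode( $this->result );
	}
}

$main = new SketchMain();
$output = $main->run( array( 'action' => 'echo', 'text' => 'hi' ) );
```

The point of passing the main object to each module is that modules never build their own output; they only add data to the shared result, and a single formatter decides how it is rendered.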

Query modules

  • ApiQuery behaves similarly to ApiMain in that it executes submodules. Each submodule derives from ApiQueryBase (ApiQuery itself does not, as it is a top-level module). During instantiation, submodules receive a reference to the ApiQuery instance.
  • All extension query modules should use a prefix of three or more letters. The core modules use two-letter prefixes.
  • ApiQuery execution plan:
    1. Get shared query parameters list/prop/meta to determine needed submodules.
    2. Create an ApiPageSet object and populate it from the titles/pageids/revids parameters. The pageset object contains the list of pages or revisions that query modules will work with.
    3. If requested, a generator module is executed to create another PageSet. This is similar to piping streams in UNIX: the given pages are the input to the generator, which produces another set of pages for all other modules to work on.
  • Requirements for query continuation:
    • The SQL query must be totally ordered. In other words, the query must be using all columns of some unique key either as constants in the WHERE clause or in the ORDER BY clauses.
      • In MySQL, this is an exclusive or: a query for Foo and Bar must order by title but not namespace (namespace is the constant 0), a query for Foo and Talk:Foo must order by namespace but not title (title is the constant "Foo"), and a query for Foo and Talk:Bar must order by both namespace and title.
    • The SQL query must not filesort.
    • The value given to setContinueEnumParameter() must include all the columns in the ORDER BY clause.
    • When continuing, a single compound condition should be added to the WHERE clause. If the query has ORDER BY column_0, column_1, column_2, this condition should look something like this:
(column_0 > value_0 OR (column_0 = value_0 AND
 (column_1 > value_1 OR (column_1 = value_1 AND
  (column_2 >= value_2)
 ))
))

Of course, swap ">" for "<" if your ORDER BY columns are using DESC. Be sure to avoid SQL injection in the values.
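
The nested condition can be generated for any number of ORDER BY columns. The following is a minimal sketch, assuming a hypothetical helper named buildContinueCondition() (it is not part of MediaWiki). It emits "?" placeholders and returns the values separately so they can be bound by the database layer instead of being concatenated into the SQL. Column names are interpolated directly, so they must come from a fixed list in the module, never from user input.

```php
<?php
/**
 * Build a continuation condition for a query ordered by $columns.
 * Hypothetical helper for illustration only.
 *
 * @param string[] $columns ORDER BY columns, in order (trusted names only)
 * @param array $values Continue-from value for each column
 * @param bool $descending True if the ORDER BY uses DESC
 * @return array Two elements: SQL with "?" placeholders, values to bind
 */
function buildContinueCondition( array $columns, array $values, $descending = false ) {
	if ( count( $columns ) === 0 || count( $columns ) !== count( $values ) ) {
		throw new InvalidArgumentException( 'Need one value per column' );
	}
	$columns = array_values( $columns );
	$values = array_values( $values );
	$op = $descending ? '<' : '>';
	$last = count( $columns ) - 1;

	// Innermost term: the last column uses an inclusive comparison.
	$sql = "({$columns[$last]} {$op}= ?)";
	$bind = array( $values[$last] );

	// Wrap outwards, one column at a time.
	for ( $i = $last - 1; $i >= 0; $i-- ) {
		$col = $columns[$i];
		$sql = "({$col} {$op} ? OR ({$col} = ? AND {$sql}))";
		// Each outer column appears twice, ahead of the inner placeholders.
		array_unshift( $bind, $values[$i], $values[$i] );
	}
	return array( $sql, $bind );
}

list( $sql, $bind ) = buildContinueCondition(
	array( 'column_0', 'column_1', 'column_2' ),
	array( 10, 20, 30 )
);
```

Here $sql has the same shape as the condition shown above, on a single line, and $bind lists the values in placeholder order.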

Internal data structures

  • The Query API has had a very successful structure: one global nested array() that is passed around. Various modules add pieces of data at many different points of that array until, finally, it is rendered for the client by one of the printers (output modules). For the API, we suggest wrapping this array in a class with helper functions to append individual leaf nodes.
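
A wrapper of this kind can be very small. The sketch below is an assumption about what such a class might look like, not the actual ApiResult: the class name SketchResult and the addValue()/getData() signatures are invented for illustration.

```php
<?php
// Hypothetical wrapper around one nested result array; not the real ApiResult.
class SketchResult {
	private $data = array();

	/**
	 * Set a leaf value below $path, creating intermediate arrays as needed.
	 *
	 * @param array $path List of keys leading to the parent node
	 * @param string $name Key of the leaf
	 * @param mixed $value Value of the leaf
	 */
	public function addValue( array $path, $name, $value ) {
		$node = &$this->data;
		foreach ( $path as $key ) {
			if ( !isset( $node[$key] ) || !is_array( $node[$key] ) ) {
				$node[$key] = array();
			}
			// Descend by reference so the write below lands in $this->data.
			$node = &$node[$key];
		}
		$node[$name] = $value;
	}

	public function getData() {
		return $this->data;
	}
}

$result = new SketchResult();
// Two different "modules" adding leaves under the same page node.
$result->addValue( array( 'query', 'pages', 'Foo' ), 'ns', 0 );
$result->addValue( array( 'query', 'pages', 'Foo' ), 'length', 1234 );
```

Modules then only need to know the path they write to, and the printer receives the whole array from getData().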

Error/status reporting

For now, we have decided to include error information inside the same structured output as the normal result (the second option described below).

For the result, we may either use the standard HTTP error codes or always return properly formatted data:

Using HTTP code
void header( string reason_phrase [, bool replace [, int http_response_code]] )

header() can be used to set the return status of the operation. We can define all possible values of reason_phrase, so for a failed login we may return code=403 and phrase="BadPassword", whereas for any success we would simply return the response without altering the header.

Pros: It's a standard. The client always has to deal with HTTP errors, so using HTTP codes for the result would remove any separate error handling the client would have to perform. Since the client may request data in multiple formats, an invalid format parameter would still be handled properly, as it would simply be another HTTP error code.

Cons: ...

Include error information inside a proper response

This method would always return a properly formatted response object, but the error status/description would be the only values inside that object. This is similar to the way the current Query API returns status codes.

Pros: HTTP error codes are used only for networking issues, not for data (logical) errors. We are not tied to the existing HTTP error codes.

Cons: If the data format parameter is not properly specified, what is the format of the output data? The application has to parse the object to learn of an error (perf?). Error-checking code will have to exist at both the connection and data-parsing levels.
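
To make this second option concrete, here is a minimal sketch of a response builder and the matching client-side check. The function names and the 'error', 'code' and 'info' keys are assumptions chosen for illustration, and JSON stands in for whichever output format was requested.

```php
<?php
/**
 * Wrap either a result or an error in the same response structure.
 * The 'error', 'code' and 'info' keys are assumptions for this sketch.
 */
function buildResponse( array $data, $errorCode = null, $errorInfo = '' ) {
	if ( $errorCode !== null ) {
		// On failure the error object is the only content of the response.
		return json_encode( array(
			'error' => array( 'code' => $errorCode, 'info' => $errorInfo ),
		) );
	}
	return json_encode( $data );
}

/**
 * Client side: an error can only be detected after parsing the body.
 */
function isErrorResponse( $body ) {
	$decoded = json_decode( $body, true );
	return is_array( $decoded ) && isset( $decoded['error'] );
}

$ok = buildResponse( array( 'login' => array( 'result' => 'Success' ) ) );
$failed = buildResponse( array(), 'BadPassword', 'The password is incorrect' );
```

The sketch also shows the cost listed under "Cons": isErrorResponse() must decode the whole body before the client knows whether the request failed.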

Boilerplate code

Simple API module

class Api<module name> extends ApiBase {
	public function __construct( $main, $action ) {
		parent::__construct( $main, $action );
	}

	public function execute() {
		// Module logic goes here.
	}

	// Parameters accepted by this module and their allowed values.
	public function getAllowedParams() {
		return array(
			'<parameter name>' => array(
				ApiBase::PARAM_TYPE => array( 'foo', 'bar', 'baz' ),
			),
		);
	}

	public function getParamDescription() {
		return array(
			'<parameter name>' => '<parameter description>',
		);
	}

	public function getDescription() {
		return '<Module description here>';
	}

	public function getExamples() {
		return array(
			'api.php?action=<module name>&<parameter name>=foo'
		);
	}

	public function getHelpUrls() {
		return '';
	}
}