Help:Extension:Translate/API

WORK IN PROGRESS

Introduction
The Translate extension is complex. Its major functional areas are:


 * creation, modification and deletion of different kinds of message groups
 * parsing and generating various file formats
 * loading a collection of messages from the database
 * translation aids like translation memory
 * translation editor
 * message mass processing capabilities (import, export, fuzzying)
 * statistics collection and presentation
 * page translation (splitting and generating pages, creating message groups)

The functions are accessed either directly via PHP code (Internal-API) or via the web-facing action API framework (Action-API). There are also other web-facing entry points that do not use the Action-API framework (Special-API), and command line scripts (Cli-API). It is in principle possible to modify the database tables or files directly, but that is neither recommended nor intended to be supported.

Users access the functionality through a dedicated interface in MediaWiki itself. It consists of special pages and of interface elements that are added to, removed from or modified on other pages via the various hooks that MediaWiki provides. Some of the interface code calls the Action-API to execute its actions. This provides an ideal separation of concerns, with the downside that using the Action-API from the interface always requires JavaScript support in the browser (other software can use it without this issue). That necessitates a Special-API kind of alternative for users without JavaScript. Some of the newest features do not fall back gracefully when JavaScript is not available.

The order of importance to third party developers is assumed to be Special-API, Cli-API, Internal-API and Action-API. I will go through these layers and describe what is already available, how flexible each one is and how well it is documented. After that I will switch to a functional inspection, which includes features that are not yet available at all and the layers on which they should be provided.

Cli-API
Command line scripts are suitable for executing time-consuming operations, like exporting translations into files or bootstrapping the translation memory database. They are not suitable for small actions because (1) passing parameters is difficult, (2) there is no standard way of reporting errors and (3) the startup overhead becomes significant when performing thousands of actions. Cli-API is primarily intended for system administrators, advanced translation administrators and automated tasks.

Each command line script has a --help switch that describes what the command does and which parameters it accepts. General help on how to run command line scripts exists somewhere, but there is no help specific to the Translate extension. Running the Translate scripts does not differ from running the scripts of MediaWiki itself, but finding that information might be a problem for new users.
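As a small illustration of the --help convention, the following Python sketch builds the command line that asks a script for its built-in help. The php binary name and the extensions/Translate/scripts directory are assumptions about a typical installation, not something this page specifies; adjust them to the local setup.

```python
import shlex


def help_command(script, php="php", scripts_dir="extensions/Translate/scripts"):
    """Return the argument list that asks a Translate script for its help text.

    `php` and `scripts_dir` are assumed defaults for illustration only.
    """
    return [php, f"{scripts_dir}/{script}", "--help"]


cmd = help_command("export.php")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
# To actually run it from the MediaWiki root directory:
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than as one string avoids quoting problems if a path contains spaces.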

There is no detailed overview of the command line scripts in the Translate extension. The following list can be used as a starting point.

Maintenance scripts


 * createCheckIndex.php - Runs the message checks on each message and updates the database state
 * createMessageIndex.php - Regenerates the message index, which Translate needs internally and which is not always regenerated automatically
 * messageDust.php - Finds unused messages or messages in the wrong place
 * populateFuzzy.php - Updates the fuzzy tag status in the database for each message
 * ttmserver-export.php - Fills the translation memory with current translations

Message group related


 * export.php
 * fuzzy.php
 * magic-export.php
 * processMessageChanges.php
 * sync-group.php

Sysadmin stuff


 * logfilter.php - for sysadmins
 * migrate-schema2.php - database schema upgrade
 * pagetranslation-check-database.php - database consistency check script
 * list-mwext-i18n-files.php - needed for MediaWiki i18n files for some time yet

Stats


 * languageeditstats.php
 * groupStatistics.php

Tests (should be moved to unit tests)


 * pagetranslation-test-parser.php
 * yaml-tests.php

Other


 * plural-comparison.php
 * cldr-plural-to-yaml.php

Only a very limited set of functionality is accessible via command line scripts.

Special-API
Almost all of the user-facing functionality is primarily exposed through special pages, with some extensions implemented as command line scripts due to the restrictions of the web environment. Special page interfaces do not really constitute an API that third party users can rely on. Using web pages programmatically is called screen scraping, and it is highly discouraged. Good API coverage that is well documented and advertised helps keep others from relying on screen scraping, which might cause compatibility problems when the special pages are refactored.

Very high level overview of the special pages:


 * SpecialAggregateGroups.php - Message group management
 * SpecialImportTranslations.php - Message group management
 * SpecialManageGroups.php - Message group management
 * SpecialMagic.php - Very little relevant for 3rd party use
 * SpecialMessageGroupStats.php - Statistics
 * SpecialSupportedLanguages.php - Statistics
 * SpecialTranslations.php - Statistics kind of
 * SpecialLanguageStats.php - Statistics
 * SpecialTranslationStats.php - Statistics
 * SpecialMyLanguage.php - Page translation related
 * SpecialPageTranslation.php - Page translation related
 * SpecialTranslate.php - The main translation interface, including the editor
 * SpecialFirstSteps.php - Translatewiki.net specific signup page

Special pages expose very little information and functionality, very often not enough to build alternative implementations. They have quite good user documentation, scattered around the different help pages of the Translate extension on mediawiki.org. Their calling conventions are not documented and will not be documented. One specific case is the translation editors (there is more than one), which Special:Translate provides as HTML code in a way that is tightly coupled to the rest of the Translate extension.

Internal-API
The Internal-API offers access to basically all the functionality the Translate extension has, and I'm not going to repeat that here. There are, however, certain ways and patterns in which the Internal-API is made available and intended to be used.

Hooks. Hooks are a way for programmers to execute their own code in predefined places chosen by the Translate extension authors. Hooks provide well-defined injection points, at least in principle. In practice the relevant data is often not passed to the user's code, or the parameters are closely tied to the current implementation and might break when the implementation is refactored. The Translate extension uses many MediaWiki hooks, but provides only a few new hooks itself, and those are currently used only internally by the extension. Because the extension code itself can access everything it needs, hooks are usually not added except in obvious cases or when somebody requests them. The hooks in the Translate extension are not documented anywhere. The most important hook that currently exists is the one that can be used to register new message groups.
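The hook pattern described above can be sketched in a language-neutral way. The following Python example is only an analogy, not Translate's PHP code: the register and run functions and the hook name ExampleRegisterMessageGroups are hypothetical and were invented for this illustration.

```python
from collections import defaultdict

# Hook name -> list of handler functions, in registration order.
_hooks = defaultdict(list)


def register(name, handler):
    """Attach a handler to a named injection point."""
    _hooks[name].append(handler)


def run(name, *args):
    """Called by the host code at the predefined place; runs every handler."""
    for handler in _hooks[name]:
        handler(*args)


# Third party code: add a message group when the (hypothetical) hook fires.
def add_my_group(groups):
    groups["my-group"] = {"label": "My group"}


register("ExampleRegisterMessageGroups", add_my_group)

# Host code: collect the groups, letting handlers extend the list.
groups = {}
run("ExampleRegisterMessageGroups", groups)
print(sorted(groups))
```

The sketch also shows the weakness mentioned above: the handler can only work with whatever the host code chooses to pass in, here the groups dictionary.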

Interfaces and subclasses. Another way to extend the Translate extension code is to write classes that implement some predefined interface (in the general sense). The Translate extension currently has four interfaces (in the specific sense):


 * iTTMServer
 * StringMangler (making sure the message keys conform to limitations of wiki titles)
 * MessageGroup
 * FFS (file format support)

An interface defines which methods the user's code must implement, which parameters they receive and what they return (the latter is not enforceable in PHP). The most useful interface currently is the FFS interface, which defines how a file format parser and generator is called. The MessageGroup interface is also in common use.

But providing an interface is not all we can do. The Translate extension comes with many classes that implement those interfaces, like SimpleFFS, FileBasedMessageGroup, AggregateMessageGroup and WikiMessageGroup. User code can extend these classes, overriding only specific parts to implement the behaviour the user wants. This sometimes also works for classes that do not implement any predefined interface (in the specific sense). Note that for this to work, the Translate extension must be coded to expect different classes: it has to provide a way to register custom classes and a way to indicate in configuration which class to use. The YAML-based message group configuration allows registering new classes and defining which class to use for MessageGroup, FFS and StringMangler.
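The combination of interface, subclass and configuration-driven class selection can be illustrated with a short sketch. It is written in Python as an analogy only; the class names FileFormat, KeyValueFormat and ColonFormat, their methods and the configuration layout are all hypothetical and do not reproduce the actual FFS interface or the YAML schema.

```python
from abc import ABC, abstractmethod


class FileFormat(ABC):
    """Hypothetical stand-in for a file format interface."""

    @abstractmethod
    def read(self, text):
        """Parse file contents into a dict of message key -> message text."""

    @abstractmethod
    def write(self, messages):
        """Serialize a dict of messages back into file contents."""


class KeyValueFormat(FileFormat):
    """Bundled implementation: one 'key=value' pair per line."""

    separator = "="

    def read(self, text):
        messages = {}
        for line in text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            key, _, value = line.partition(self.separator)
            messages[key.strip()] = value.strip()
        return messages

    def write(self, messages):
        return "\n".join(f"{k}{self.separator}{v}" for k, v in messages.items())


class ColonFormat(KeyValueFormat):
    """User subclass: overrides only the part that differs."""

    separator = ":"


# Registry of known classes; configuration names the class to use.
REGISTRY = {"KeyValue": KeyValueFormat, "Colon": ColonFormat}


def format_from_config(config):
    return REGISTRY[config["class"]]()


ffs = format_from_config({"class": "Colon"})
parsed = ffs.read("hello: Hello\nbye: Goodbye")
print(parsed)
print(ffs.write(parsed))
```

The host code only ever calls read and write, so it does not need to know which concrete class the configuration selected.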

In theory one could also replace one of the core classes with a custom class, but this is neither recommended nor supported. The goal should be that the core classes are extensible enough through hooks, and that things like FFS use interfaces and subclassing.

The interfaces and classes are reasonably well documented via code documentation. An overview of the interfaces and tutorials for them could be added.

Action-API
Since the Translate extension is often provided as a service and acts as the interface (in the general sense) for the data it contains, being able to call that service over the web (in this case over http(s)) is of utmost importance.

Here's a list of currently available API modules:


 * ApiAggregateGroups.php - Creation and deletion of aggregate groups and defining their subgroups (write only)
 * ApiGroupReview.php - Changing of message group states (write only)
 * ApiHardMessages.php
 * ApiQueryLanguageStats.php
 * ApiQueryMessageCollection.php - Querying the messages of a group, with filtering options (read only)
 * ApiQueryMessageGroups.php - Information about message groups (read only)
 * ApiQueryMessageGroupStats.php - Statistics of message groups (read only)
 * ApiQueryTranslationAids.php
 * ApiQueryMessageTranslations.php - Special:Translations equivalent (read only)
 * ApiStatsQuery.php
 * ApiTranslateSandbox.php
 * ApiTranslateUser.php
 * ApiTranslationStash.php
 * ApiTranslationReview.php - Reviewing of translations (write only)
 * ApiTTMServer.php - Querying of translation memories (read only)

These are somewhat documented at http://translatewiki.net/w/api.php, for example under the modules query+messagegroups, query+messagegroupstats, query+messagetranslations, translationreview, ttmserver and groupreview, though we understand it is hard to get a high-level picture of what each module does.
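To look up the parameters of several modules at once, the generated documentation can also be requested in machine-readable form. The following Python sketch builds such a request URL with MediaWiki's action=paraminfo module. It assumes a reasonably recent MediaWiki that accepts module paths like query+messagegroups in the modules parameter; the api_url helper is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as given on this page.
API = "http://translatewiki.net/w/api.php"


def api_url(params, endpoint=API):
    """Build an Action-API URL; list values are joined with '|'."""
    query = {
        key: "|".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    query.setdefault("format", "json")
    return endpoint + "?" + urlencode(query)


paraminfo_url = api_url({
    "action": "paraminfo",
    "modules": [
        "query+messagegroups",
        "query+messagegroupstats",
        "query+messagetranslations",
        "translationreview",
        "ttmserver",
        "groupreview",
    ],
})
print(paraminfo_url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; urlencode takes care of escaping the + and | characters in the module list.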