Requests for comment/Localisation format

This is an RfC on a new localisation format and structure for MediaWiki core and extensions.

Rationale
The current…

The representation of i18n files in ResourceLoader is out of scope for this RfC. See …

Requirements
Requirements are at least the following:
 * 1) Have the messages in a non-executable UTF-8 file format.
 * 2) The scope is limited to   for now.
 * 3) Core and extensions should be able to have multiple groups of messages, located in a standard structure (for example,  ).
 * 4) All current MediaWiki i18n features must keep working.
 * 5) The current extension localisation format should keep working  for backward compatibility purposes.
 * 6) The new format will be mandatory for all Wikimedia-deployed code.
 * 7) Conversion scripts should be made available, and the conversion should be carried out for all Wikimedia-deployed extensions.

Proposal
The proposal is as follows:
 * We will use UTF-8 JSON files with the same syntax as that used by jQuery.i18n, https://github.com/wikimedia/jquery.i18n. This provides complete support for  and basic authoring information, but does not cover all the edge cases of the MessagesEn.php files, which will remain (replacing them is out of scope for this RfC).
 * TODO: The JSON translation files will be stored with actual human-readable Unicode characters rather than escaped character numbers (such as \uXXXX). This assumes that doing so causes no problems for loading in JavaScript, parsing in different browsers, etc.
 * Extension messages files will be expected in general to be in a directory called  in the root of the extension's repo, split by language (,  ,  , etc.), with support for multiple sub-directories for splitting up into keyed groups, auto-loaded.
 * Groups' keys are locally scoped, allowing only, and auto-generated from the name of the sub-directory (if appropriate); the primary group will be automatically named the same as the extension. If you want to refer to the default group distinctly from the extension's whole set of modules, refer to it outright (named other than  ).
 * We will introduce a new configuration variable,,  which allows the directory/directories for the extension to be loaded.   will continue to work but will be ignored where both are specified.
 * Implementation will be completed in advance of the release of MediaWiki 1.23, running in dual-support mode, with support for the old format being retired in MediaWiki 1.24. As MW 1.23 will be an LTS release, this will ensure that users of existing extensions will be able to use them unaltered for several years.
 * Write a conversion script to convert from PHP arrays to JSON files, and run it on all Wikimedia-deployed extensions. Complex splits (e.g. for MediaWiki core and VisualEditor) will be done manually by the Language Engineering and VisualEditor teams.
 * TODO: What exactly is the proposed new structure for MediaWiki core's messages?
 * TODO: Will we get rid of messages.inc of core?
 * TODO: The jquery.i18n JSON format does not currently support fuzzy tags. Is it acceptable to lose them?
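
To make the file-format points above concrete, here is a minimal sketch of the output side of a conversion: it takes messages already held in memory per language and writes one UTF-8 JSON file per language, with readable characters instead of \uXXXX escapes. The "@metadata" key follows the jquery.i18n convention for authoring information. The message key, the sample strings, the <language code>.json file naming and the function name are all assumptions made for illustration, not part of the proposal, and reading the existing PHP arrays is not covered.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical in-memory messages, keyed by language code, standing in for
# the per-language PHP arrays an extension ships today.
MESSAGES = {
    "en": {"example-greeting": "Hello, $1!"},
    "fi": {"example-greeting": "Hyvää päivää, $1!"},
}


def write_language_files(messages, target_dir, authors=()):
    """Write one UTF-8 JSON file per language code into target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for lang, strings in sorted(messages.items()):
        # "@metadata" is the jquery.i18n convention for authoring information.
        data = {"@metadata": {"authors": list(authors)}}
        data.update(strings)
        # Assumed naming scheme: <language code>.json
        path = target / f"{lang}.json"
        # ensure_ascii=False keeps characters such as "ä" readable instead of
        # writing them as \u00e4 escapes.
        text = json.dumps(data, ensure_ascii=False, indent=4)
        path.write_text(text + "\n", encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_language_files(MESSAGES, tmp, authors=["Example"]):
            print(path.name)
```

Both forms parse to the same strings in any conforming JSON parser; the unescaped form is simply easier to read and review in diffs.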

Examples
An extension with multiple groups of messages might choose to split its files up as follows:
 * This will be loaded as wikimediamessages
 * This will be loaded as wikimediamessages/cclicensetexts
 * This will be loaded as wikimediamessages/wikimediatemporarymessages
 * This will be loaded as wikimediamessages/cclicensetexts
 * This will be loaded as wikimediamessages/wikimediatemporarymessages
 * This will be loaded as wikimediamessages/wikimediatemporarymessages

The extension would load its values using. This is short for, and would load all message files in the above example automatically, including the namespacing.
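
The naming rule in this example can be sketched as a small function: files directly in the message directory belong to the primary group, named after the extension, and files in a sub-directory get the extension name followed by the sub-directory name. This is only an illustration of the rule as described here, not the loader itself. The directory name "msgs", the file name "en.json", the extension name casing and the lowercasing step are assumptions inferred from the group names listed above.

```python
from pathlib import PurePosixPath


def group_key(extension_name, message_dir, file_path):
    """Derive a message group key from where a message file lives.

    Files directly in message_dir map to the primary group (the extension
    name); files in a sub-directory map to "<extension>/<sub-directory>".
    Lowercasing is an assumption based on the example group names.
    """
    relative = PurePosixPath(file_path).relative_to(message_dir)
    # Keep the directory components only; the last part is the file name.
    parts = [extension_name] + list(relative.parts[:-1])
    return "/".join(part.lower() for part in parts)


if __name__ == "__main__":
    # Hypothetical layout, used only to exercise the function.
    print(group_key("WikimediaMessages", "msgs", "msgs/en.json"))
    print(group_key("WikimediaMessages", "msgs", "msgs/CCLicenseTexts/en.json"))
```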

A more complex use case is for extensions that use libraries, where moving the internationalisation files into the root of the extension would split the import and make things more complicated. A (slightly artificial) example might be:

For this example, the messages files would be loaded using: