Topic on Talk:Requests for comment/Localisation format

Omission of non-message keys

Tim Starling (talkcontribs)

What is the reason for the omission of keys other than $messages from the new format? We already need to deliver non-message keys to JavaScript code; for example, we have mw.language.getData() and callers such as convertNumber(). Presumably these needs for non-message data will continue to increase. The fact that you're following an example from jQuery.i18n does not seem like a good excuse, given the much larger scope of this proposal.

How will existing non-message keys in extension message files be handled if $wgExtensionMessageFiles is deprecated?

Siebrand (talkcontribs)

On $wgExtensionMessageFiles: We chose a limited scope ($messages only) to keep things manageable and to improve the chances of implementing this within a limited amount of time. Increasing the scope would lengthen the discussion and increase the chance of small peripheral issues holding up the main change. I acknowledge that we need to find a solution for special page aliases and namespace names later. This might be done by extending the JSON format to support (associative) arrays, in a similar fashion to what is currently done for the @metadata key.
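To make the idea of extending the JSON format concrete, here is a minimal Python sketch of a reader that treats every "@"-prefixed key as non-message data and everything else as a message. The function name split_localisation, the "@specialPageAliases" key, and the sample values are assumptions for illustration only; nothing here is part of the RFC.

```python
import json


def split_localisation(raw: str):
    """Split a localisation JSON document into (messages, non_message_data).

    Hypothetical convention: keys starting with "@" carry non-message
    data (as "@metadata" does today); all other keys are messages.
    """
    data = json.loads(raw)
    messages = {}
    extra = {}
    for key, value in data.items():
        if key.startswith("@"):
            # Strip the "@" so callers see the plain variable name.
            extra[key[1:]] = value
        else:
            messages[key] = value
    return messages, extra


# Illustrative sample data, not taken from a real message file.
sample = """
{
  "@metadata": { "authors": ["Example"] },
  "@specialPageAliases": { "Activeusers": ["Utilisateurs_actifs"] },
  "mainpage": "Accueil"
}
"""

messages, extra = split_localisation(sample)
```

With a convention like this, message handling can keep ignoring anything that starts with "@", which fits the goal of changing messages first and dealing with the other keys later.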

On one $messages for core and language classes: Where possible, we would like to replace our currently "manually maintained" i18n-related language configurations with data-driven implementations based on collections maintained by third parties. The plural rules are an example of that. Future opportunities include date/timezone information, number formatting, etc. Our plan is to address these things one feature at a time, rather than create a big plan that has a decreased chance of being implemented because of complexity or a lack of resources.

Tim Starling (talkcontribs)

I'm not really buying that. To a large extent, how you do messages determines how you do non-message data, so you should write both into the RFC so that we can discuss them both with clarity. This RFC has clear implications for non-message keys, and I'm not sure if I support those implications. It's not necessary to implement the whole RFC in one go, so it only extends the scope of the design work, not the implementation work.

Nikerabbit (talkcontribs)

I'm not convinced that we need to increase the scope of this RFC. I see no issues with keeping non-message data as it currently is. For now it is actually a benefit, since we are separating the messages, which are (mostly) machine maintained, from the non-messages, which are maintained by hand. This alone simplifies message handling in core and at translatewiki.net.

In my opinion LocalisationCache is currently nicely abstracting away where and how we store the data, although it probably wasn't the issue you had in mind when writing LC.

As James proposes below, there is a pretty straightforward mapping from PHP to JSON for non-messages, if we want to do it. For now there is no compelling reason to change how we do it. We would have to consider the implications for future decisions on the extent to which we are going to use third-party language data. And if we go into motivations like using a common format for backend and frontend, then we are again in the scope of another RFC in progress about frontend i18n.


How will existing non-message keys in extension message files be handled if $wgExtensionMessageFiles is deprecated?

On this, I think we should clarify in the RFC (assuming there is agreement) that we would only be deprecating that variable for messages for now. Magic words and aliases, which are already in separate i18n files for the majority of extensions because translatewiki.net requires that, would keep working until we decide to do something about them.

Jdforrester (WMF) (talkcontribs)

In general, we should probably just use @-prefixed strings, objects or arrays as necessary – specifically[1]:

$namespaceNames
"@namespaceNames": { "NS_MEDIA": "Média", "NS_SPECIAL": "Spécial", … }
$namespaceAliases
"@namespaceAliases": { "Discuter": "NS_TALK", "Discussion_Utilisateur": "NS_USER_TALK", … }
$specialPageAliases
"@specialPageAliases": { "Activeusers": [ "Utilisateurs_actifs", "UtilisateursActifs" ], "Allmessages": [ "Messages_système", "Messages_systeme", "Messagessystème", "Messagessysteme" ], … }
$magicWords
"@magicWords": { "redirect": [ "0", "#REDIRECTION", "#REDIRECT" ], "notoc": [ "0", "__AUCUNSOMMAIRE__", "__AUCUNETDM__", "" ], … }
$bookstoreList
"@bookstoreList": { "Amazon.fr": "http://www.amazon.fr/exec/obidos/ISBN=$1", "alapage.fr": "http://www.alapage.com/mx/?tp=F&type=101&l_isbn=$1&donnee_appel=ALASQ&devise=&", … }
$linkTrail
"@linkTrail": "/^([a-zàâçéèêîôûäëïöüùÇÉÂÊÎÔÛÄËÏÖÜÀÈÙ]+)(.*)$/sDu"[2]
$dateFormats
"@dateFormats": { "mdy time": "H:i", "mdy date": "F j, Y", … }
$defaultDateFormat
"@defaultDateFormat": "zh"
$datePreferences
"@datePreferences": [ "default", "ISO 8601" ]
$separatorTransformTable
"@separatorTransformTable": { ",": "\xc2\xa0", ".": "," }
$linkTrail
"@linkTrail": "/^([a-zʻʼ“»]+)(.*)$/sDu"
$linkPrefixCharset
"@linkPrefixCharset": "a-zA-Z\\x80-\\xffʻʼ«„"
$linkPrefixExtension
"@linkPrefixExtension": "true"
$fallback
"@fallback": "zh-hans"
$fallback8bitEncoding
"@fallback8bitEncoding": "windows-1252"
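
One detail worth checking in a mapping like this: PHP's double-quoted "\xc2\xa0" is the two UTF-8 bytes of a no-break space, and JSON has no \x escape, so a converter would need to emit \u00a0 (or the literal character) instead. Below is a minimal Python sketch of such a conversion, assuming the PHP variables have already been read into a plain dictionary. The function name to_at_prefixed is hypothetical, and the sample values are combined from different examples in the list purely for illustration.

```python
import json


def to_at_prefixed(variables: dict) -> str:
    """Serialise non-message variables as JSON with "@"-prefixed keys.

    Hypothetical helper: "namespaceNames" becomes "@namespaceNames", etc.
    """
    prefixed = {"@" + name: value for name, value in variables.items()}
    # ensure_ascii=True writes non-ASCII characters as \uXXXX escapes,
    # which is valid JSON (unlike PHP-style \x.. byte escapes).
    return json.dumps(prefixed, ensure_ascii=True, indent=2, sort_keys=True)


# PHP "\xc2\xa0" is the UTF-8 encoding of U+00A0 (no-break space).
nbsp = b"\xc2\xa0".decode("utf-8")

variables = {
    "separatorTransformTable": {",": nbsp, ".": ","},
    "fallback": "zh-hans",
    "datePreferences": ["default", "ISO 8601"],
}

encoded = to_at_prefixed(variables)
decoded = json.loads(encoded)  # round-trips through any standard JSON parser
```

Strings, lists and associative arrays map directly; values such as the regular expression in $linkTrail would still be PHP-specific after conversion, which is the open question in the second footnote.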

I've probably made this too simple to work, on reflexion – have I missed something?

  1. Examples from MessagesFr.php, MessagesUz.php, and MessagesZh-hans.php.
  2. Or do we want to convert this from PHP format somehow?