Topic on Talk:Requests for comment/Localisation format

A quite popular case...

MaxSem (talkcontribs)

Consider an extension with no UI, and thus no messages except for descriptionmsg: that means 350 files with one message each. Much the same applies to extensions with very few messages.

Jdforrester (WMF) (talkcontribs)

… so? inodes are cheap.

Parent5446 (talkcontribs)

Well, if we are following the jQuery.i18n specification, then putting all messages in one file should be supported (just like it's done now, except in JSON). See the bottom of this section for more info.

NEverett (WMF) (talkcontribs)

I believe the proposal is for the contents of the files to follow jQuery.i18n but to require one file per language. The "350 files with one message each" argument isn't enough to convince me that supporting the all-messages-in-one-file format is required. I actually think a strong argument might be "let's just copy jQuery.i18n 100% so we don't have to tell people it is just like it _but_." Even with that, I still favor one file per language from a performance and simplicity perspective.

TheDJ (talkcontribs)

True, but one language per file could be a convention rather than a technical rule in our logic for reading the data file, of course. Or you could add another convention and say that people should use qall.json if they are putting all languages in a single file, if you still want to enable this without allowing multiple languages in any other file.
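A minimal sketch of how a reader could honour both conventions at once, written in Python purely for illustration. The function name `load_messages`, the directory layout, and the treatment of `qall.json` are assumptions drawn from this thread, not an existing MediaWiki or jQuery.i18n API. It assumes per-language files hold a flat key-to-text object and that `qall.json` uses language codes as its top-level keys.

```python
import json
from pathlib import Path


def load_messages(i18n_dir):
    """Return {language_code: {message_key: text}} for one extension.

    Assumed layout (hypothetical, for illustration only):
      - <i18n_dir>/<lang>.json  one language per file, flat key -> text
      - <i18n_dir>/qall.json    optional, top-level keys are language codes
    Keys starting with "@" (such as "@metadata") are skipped.
    """
    messages = {}
    for path in sorted(Path(i18n_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        if path.stem == "qall":
            # Single-file convention: {"en": {...}, "fr": {...}}
            per_language = {
                lang: table
                for lang, table in data.items()
                if not lang.startswith("@")
            }
        else:
            # Per-language convention: the file name is the language code.
            per_language = {path.stem: data}
        for lang, table in per_language.items():
            if not isinstance(table, dict):
                # Anything that is not a message table is ignored here.
                continue
            bucket = messages.setdefault(lang, {})
            for key, text in table.items():
                if not key.startswith("@"):
                    bucket[key] = text
    return messages
```

With this shape, the one-file-per-language rule is only a naming convention: the reader's logic stays the same size whether an extension ships 350 small files or a single `qall.json`.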

Siebrand (talkcontribs)

I think we should choose to go for putting each locale in a separate file. This reduces code complexity.

Seb35 (talkcontribs)

I like the JSON format, but I am a bit skeptical about the per-language files, although I also understand that large files could be impractical.

As a translator, I sometimes open the i18n.php file with `vi' to search for all translations of a message or to navigate between the original English and other languages; if they are in separate files, it’s still possible but less practical (e.g. using `grep -R' or `cat *.json | vi -').
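For the per-language layout, the "all translations of one message" lookup could be sketched as a small helper. This is only an illustration: `translations_of` is a hypothetical name, and it assumes one JSON file per language, named after the language code, each holding a flat key-to-text object.

```python
import json
from pathlib import Path


def translations_of(i18n_dir, message_key):
    """Map language code -> text for one message key.

    Assumes one JSON file per language, named <lang>.json, holding a flat
    key -> text object (a hypothetical layout matching this discussion).
    Languages that do not define the key are left out of the result.
    """
    found = {}
    for path in sorted(Path(i18n_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        if message_key in data:
            found[path.stem] = data[message_key]
    return found
```

Printing the returned mapping one language per line gives roughly the same view as searching for a key in a single combined file.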

Another minor point is that maintaining all languages could be more difficult. I am thinking of the multiple lines in Gerrit in the localisation updates (example), or mass operations such as removing a message.

These are minor points, but they also affect usability for developers and other people looking at the code.

Jdforrester (WMF) (talkcontribs)

On your points:

Using .i18n.php directly for translation
The main workflow for translators is translatewiki.net, where the English original (and qqq message) are clearly shown alongside the input box – I don't know if many people are manually paging through the i18n.php file for the English original. However, I agree that being able to see how a message was translated into a related language (e.g. French and Spanish, or German and Swiss German) for consistency is probably useful; perhaps we should suggest it to TWN as a translation tool feature?
Multiple files touched in one git commit
I see this as a feature, not a bug – as someone who manually reviews every single VisualEditor git commit from the localisation update, it's now much, much easier for me to see at a glance which languages got new or updated messages, and particularly when a language is newly added.
Removing a message across several files
The official guidance on this is that only the -en message should be removed by the developers, and the system will remove the non-en messages, so I don't think this is a problem here.

Seb35 (talkcontribs)

Thanks for your replies. You convinced me on the last two items. For the first, I still prefer navigating a single file when reading the translations on my computer, but I understand developers could prefer splitting them. As for an interface for comparing languages, one exists, but I’m rather a command-line addict when possible.
