Talk:Translatable modules

Feedback from Base

It seems that what Yurik has presented might as well be the solution. Just have those .tab JSON files in MediaWiki format instead, with one file per language rather than one file with all the languages. Then that file can be tagged for translation, and the module can try to load the resulting /ru, /es, /fr pages before it falls back to /en. Now the only question is whether <translate> tags would work in pages with the JSON content model. Or perhaps Translate can be fed those automatically somehow. There is also the issue of escaping to be solved, lest people break the files with a stray quote mark, but that is still better than having to remember the comma. --Base (talk) 14:43, 2 September 2020 (UTC)

Just to be clear, I mean creating something like Template:Graphs/i18n.tab with content like

"en": "<translate>Translation table for the corresponding template</translate>",

Now, that probably won't work out of the box: even if Translate picks it up, it will attempt to create Template:Graphs/i18n.tab/en and so forth, which does not end with .tab and so won't work even if it gets created. But I think it should be an easy fix to make. --Base (talk) 14:47, 2 September 2020 (UTC)
I think the best way would be to add support to translatewiki to directly understand the Data:I18n/*.tab pages. Most templates have very few parameters, so creating a gazillion pages is not really necessary. The extra tags in the raw data may make the system far more complex and less understandable. Note that it should be fairly straightforward to fully automate this process. --Yurik (talk)
Well, the messages themselves will end up as individual pages in the Translations namespace one way or the other, be it on Commons or on Translatewiki. So while we can save some pages, the biggest contribution to that number stays, which makes this saving questionable. But I guess, yeah, if the Translatewiki process is automated, then it should be a good way to go. I am only a bit concerned about whether it would be clear to people that they have to go to Translatewiki to do the translations (which is a separate non-SUL wiki too). --Base (talk) 15:07, 2 September 2020 (UTC)
Fair point. I guess more important than pages is that Module:TNT has been around for a long time and many people are familiar with it, so keeping that approach might make sense. Putting a single translation in a data subpage would require a new module, and will make things a bit slower (it would need to load the primary page and a fallback, resulting in a slowdown). It does fit better with the current translatewiki model though. --Yurik (talk) 15:15, 2 September 2020 (UTC)
Translate tags only work on wikitext pages, and besides, it would be bad usability when we can easily do better and support tab-json pages directly. My main question would be what would be the best workflow: some kind of automatic processing or manual processing similar to translatable pages where you have to (re-)mark pages to 1) register in the system and 2) enable change tracking and fuzzying. Nikerabbit (talk) 08:19, 3 September 2020 (UTC)
Nikerabbit, workflow-wise, I think **any** page with the i18n/ or templatedata/ prefix is OK to add to the translation system automatically. In theory you could say that any page at all with a multilingual string could be added, but at least these two pseudo-namespaces have a well-established format. Moreover, ideally the translation should happen on Commons itself, i.e. any edit would go directly into the data page as a regular edit, rather than through intermediary storage, but with all the benefits of the great translatewiki interface (suggestions, autotranslations, viewing all data pages on a single page for a single language, etc.). If this is not possible (?), the next best thing would be to auto-sync any changes on either side as fast as possible. --Yurik (talk) 22:34, 3 September 2020 (UTC)
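
The per-language subpage idea from this thread, together with the cost Yurik mentions (loading the primary page and a fallback), could look roughly like the sketch below. It is written in Python for illustration only, not as Lua module code. The in-memory `PAGES` store, the `load_page` helper, and the `caption` key are hypothetical stand-ins; the page titles follow the Template:Graphs/i18n.tab/en pattern mentioned above.

```python
from typing import Optional

# Hypothetical stand-in for wiki pages; a real module would read them from the wiki.
PAGES = {
    "Template:Graphs/i18n.tab/en": {
        "description": "Translation table for the corresponding template",
        "caption": "Caption",  # hypothetical second message
    },
    "Template:Graphs/i18n.tab/ru": {
        # Partially translated page: "caption" is missing here.
        "description": "Таблица переводов для соответствующего шаблона",
    },
}


def load_page(title: str) -> Optional[dict]:
    """Return the parsed content of a page, or None if it does not exist."""
    return PAGES.get(title)


def load_messages(base_title: str, lang: str, fallback: str = "en") -> dict:
    """Load the fallback subpage, then overlay the requested language subpage.

    This costs up to two page loads per call, which is the slowdown
    discussed in the thread.
    """
    messages = dict(load_page(f"{base_title}/{fallback}") or {})
    if lang != fallback:
        messages.update(load_page(f"{base_title}/{lang}") or {})
    return messages
```

With this shape, a language that has no subpage at all (for example /es here) simply yields the /en messages, and a partially translated page falls back per message rather than per page.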

Data: on Commons

In the Data namespace on Commons, there are JSON files with translatable strings, for example c:Data:COVID-19 hospitalizations in Denmark.tab. It would be great if these also became easier to translate, since that would make it much easier to maintain and share data in tables across language versions. Ainali (talk) 09:56, 4 September 2020 (UTC)

Ainali, thanks for the comment!
This may be possible, but I have a few comments and questions:
  • I first need to emphasize that storing the translatable strings in JSON .tab files in the Data namespace on Commons is just one of the proposed solutions, and it's not necessarily the one that will be chosen.
  • Which strings exactly in these files should be translatable? "COVID-19 hospitalizations in Denmark, including numbers for patients in intensive care units and in critical state."? "Patients in ICU"? "Aggregated from various media source"?
  • Where will they be used?
I can certainly imagine a general use case of JSON files stored as wiki pages, and usable in modules, in templates, and in other places, and translatable in the Translate interface. It must be well-defined how it's done, however. For example, how will the Translate extension know which files to load for translation? It's doable, but needs specification. "Aggregated from various media source" in your example above doesn't appear to be translatable, not even manually, although I might be missing something.
The scope of this project is only modules, but if it can cover other things without a lot of extra effort, then it's conceivable. --Amir E. Aharoni (WMF) (talk) 13:06, 6 September 2020 (UTC)
If you enter edit mode of the file you can see that there are already English, Danish and Swedish translations in it. And Commons actually displays the strings in my language already. They can be used in articles thanks to templates like w:sv:Mall:Json2table. Ainali (talk) 17:11, 6 September 2020 (UTC)
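
To make the question of which strings are translatable more concrete, here is a rough Python sketch that lists the localized strings in a tabular data page and reports which ones lack a given language. It assumes the layout used by Data:*.tab pages, with a localized `description` object and a localized `title` on each entry of `schema.fields`. The sample content is made up for illustration and is not copied from the Denmark page, and the message key format is hypothetical.

```python
import json

# Made-up sample in the assumed .tab layout; not the real Commons page.
SAMPLE_TAB = json.loads("""
{
  "license": "CC0-1.0",
  "description": {"en": "Hospitalizations", "da": "Indlæggelser"},
  "schema": {"fields": [
    {"name": "date", "type": "string",
     "title": {"en": "Date", "da": "Dato", "sv": "Datum"}},
    {"name": "icu", "type": "number",
     "title": {"en": "Patients in ICU"}}
  ]},
  "data": [["2020-09-01", 4]]
}
""")


def translatable_units(tab):
    """Yield (message_key, {lang: text}) for each localized string found."""
    if isinstance(tab.get("description"), dict):
        yield "description", tab["description"]
    for field in tab.get("schema", {}).get("fields", []):
        if isinstance(field.get("title"), dict):
            # Hypothetical key format: one message per column title.
            yield f"field.{field['name']}.title", field["title"]


def missing(tab, lang):
    """Return the keys of all units that have no text in `lang`."""
    return [key for key, texts in translatable_units(tab) if lang not in texts]
```

A translation tool would need some rule like `translatable_units` to decide which parts of a file are message groups; the data rows themselves are left untouched here.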

JSON @ commons:Data seems to be the best approach seen yet

  • Every Lua module can access them.
  • Every template can access them.
  • All TemplateData implementations can use such strings, if built by template and parser functions rather than an explicit extension element.
  • All JavaScript gadgets can access them easily and incorporate the object.
    • It is no big deal for a gadget to retrieve the contents of a JSON page, turn them into a JSON object, and then use whatever is useful.

Feeding the JSON entries might be supported by the translatewiki community.

  • Every week, on a regular dissemination schedule, translations available at translatewiki might be imported into the commons:Data JSON pages, adding new strings.
  • Some mechanisms or rules should be found to avoid conflicts between manual page editing and translatewiki import, e.g. a dispute over a particular wording.
  • The related commons:Data pages should be protected from normal editing (which also protects against worldwide vandalism), and may be modified by the dissemination agent only. Meta information about the most recent update might be included. For such modules, changing a translation is possible via translatewiki only, and the trusted translators and the approval and review procedures at translatewiki may be used. They have been working pretty well for a dozen years now. A modification will not cause global damage immediately, since there are some days left for checking and reverting.
  • Search tools on translatewiki may be used to find missing translations. If I am Japanese and have some spare time, I might look for missing translations in the entire namespace, whatever I might find.
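
One possible shape for the import step described in this list, as a hedged Python sketch: new translations are added, while strings that differ between the data page and the import are reported instead of being silently overwritten. The function name `merge_import`, the `prefer_import` switch, and the default of keeping the existing page text on conflict are assumptions for illustration; the discussion does not settle on a conflict policy.

```python
def merge_import(page, incoming, prefer_import=False):
    """Merge two {key: {lang: text}} bundles.

    Returns (merged, conflicts), where conflicts is a list of (key, lang)
    pairs whose text differs between the page and the import.
    The input bundles are not modified.
    """
    merged = {key: dict(texts) for key, texts in page.items()}
    conflicts = []
    for key, texts in incoming.items():
        target = merged.setdefault(key, {})
        for lang, text in texts.items():
            if lang not in target:
                target[lang] = text          # new translation: just add it
            elif target[lang] != text:
                conflicts.append((key, lang))
                if prefer_import:            # assumed policy switch
                    target[lang] = text
    return merged, conflicts


# Illustrative data only.
page = {"notarget": {"en": "missing target", "de": "Ziel fehlt"}}
incoming = {
    "notarget": {"de": "Kein Ziel", "fr": "cible manquante"},
    "syntax": {"en": "no link syntax"},
}
merged, conflicts = merge_import(page, incoming)
```

If the data pages are protected and edited only by the dissemination agent, as proposed above, `prefer_import=True` would be the natural setting, and the conflict list would mainly serve as a log.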

There are proven mechanisms available for all purposes, and a unified workflow for all kinds of tools including templates may use them. No need to re-invent the wheel.

  • The system message approach is less helpful here. System messages are designed for the MediaWiki software itself, mainly for PHP code such as skins, extensions and special pages. This is one central monolithic block with global maintainers.
  • Gadgets, Modules, Templates, TemplateData are distributed applications with many maintainers and many independent implementations. One JSON object (page) per application is a reasonable bundle for all messages a gadget or module might need, and there are no conflicts with other messages for other packages. It is easy to trace back from the bundle of all messages to the requesting application.
    • While each application should use one basic JSON object, there may be subtasks as subpages extending the messages under particular circumstances.
    • Naming of “things” should use a global registry of identifiers. Each package may reserve one ID. The JSON page name, as a subpage of some I18N repository, is the unique identifier of the application, and might introduce further subpages if needed.

Greetings --PerfektesChaos (talk) 16:34, 22 September 2020 (UTC)

Thanks!
My idea is that we should probably not use the translatewiki.net website, but the installation of the Translate extension on the websites here. Commons, Wikidata, mediawiki.org and Meta already have it and it's used a lot. The Translate extension will have to be modified to support the new format.
And yes, the idea is not to reinvent the wheel, and to reuse existing practices and technologies as much as possible. --Amir E. Aharoni (WMF) (talk) 06:37, 23 September 2020 (UTC)
Some notes on global application identifiers:
  • The package identifiers are to be unique for both gadgets and modules/templates.
  • They share the same namespace for CSS selectors.
  • They should use the same key for package translation as for other global administration issues.
  • One package may consist of a template and a Lua module. Both contribute to wikitext content generation.
    • The Lua module might be the back office of a template which is presented to regular authors for convenience. See e.g.: w:de:Template:Literatur
  • While a package is sharing the same global identifier, there might be different aspects represented in the commons:Data page name:
    • I18N/TemplateData:Softredirect
    • I18N/Module:Softredirect
      • Present only if the module has messages of its own; otherwise it might use the related template transclusion messages.
    • I18N/Template:Softredirect
      • Containing things presented in transclusion.
{ explain:  { en: "This page is a [[meta:Soft redirect|soft redirect]].",
              de: "Diese Seite ist eine „[[w:de:Hilfe:Weiterleitung#soft|weiche Weiterleitung]]“.",
              fr: "Cette page est une [[meta:Soft redirect/fr|redirection douce]].",
              hu: "[[meta:Soft redirect/hu|Soft átirányító]] lap",
              it: "La pagina di [[meta:Soft redirect/it|riferimento]] si trova in un altro sito/progetto.",
              la: "Haec pagina te [[meta:Soft redirect|ad locum supra adnexum]] dirigere vult.",
              nl: "Dit is een [[meta:Soft redirect/nl|indirecte doorverwijzing]].",
              ru: "Эта страница — [[meta:Soft redirect/ru|мягкое перенаправление]]."
  },
  notarget: { en: "missing target",
              de: "Ziel fehlt" },
  syntax:   { en: "no link syntax",
              de: "Keine Wikisyntax" }
}
Globally unique package identifiers are to be human-readable and self-explanatory; neither 63A9F2B70C41D853 nor P26375.
Management of globally unique identifiers is a crucial prerequisite to access any translated string later.
Enjoy --PerfektesChaos (talk) 10:27, 23 September 2020 (UTC)
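
As a small illustration of the naming scheme and message bundle in this comment, the following Python sketch builds the data page name from an aspect and a package identifier and resolves one message with a fallback to English. The helper names are hypothetical, and the `I18N/<aspect>:<identifier>` pattern is read off the three Softredirect examples above; nothing here is an existing API.

```python
# Part of the Softredirect bundle from the example above.
BUNDLE = {
    "notarget": {"en": "missing target", "de": "Ziel fehlt"},
    "syntax": {"en": "no link syntax", "de": "Keine Wikisyntax"},
}

# Aspects taken from the three example page names.
ASPECTS = ("TemplateData", "Module", "Template")


def bundle_page(aspect, identifier):
    """Build a data page name such as 'I18N/Module:Softredirect'."""
    if aspect not in ASPECTS:
        raise ValueError(f"unknown aspect: {aspect!r}")
    return f"I18N/{aspect}:{identifier}"


def message(bundle, key, lang, fallback="en"):
    """Return the text for `key` in `lang`, else in `fallback`, else None."""
    texts = bundle.get(key, {})
    return texts.get(lang, texts.get(fallback))
```

Because the page name is derived from the package identifier alone, any consumer that knows the identifier can locate the bundle without a separate lookup table.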

I agree that this approach seems to be the most convenient one. However, from my testing I remember that the Data call can lead to performance issues on a larger scale (discussed it here last year). As long as we are only talking about user interface messages, this will probably not be an issue. But if this solution were to be used as the basis for global modules/templates in the future, I would like to see some statistics on performance first (in comparison to the other solutions presented here). Regards, XanonymusX (talk) 17:52, 30 September 2020 (UTC)