Internationalisation wishlist 2014

The following localisation/internationalisation wishlist for MediaWiki and Wikimedia is a brainstorm written mainly by Niklas Laxström in March 2014. It contains many items from the previous year, and some of the items have been or are listed as possible GSoC/OPW projects.

php.i18n
As with jquery.i18n, there are many PHP projects that would benefit from a high-quality i18n library they could just plug in. At translatewiki.net we have multiple PHP projects. Licensing issues might be a problem if we want to reuse code from MediaWiki.

TUX for statistics
Special:LanguageStats and Special:MessageGroupStats look outdated compared to the TUX editor and need a facelift too. In addition we could make them Web 2.0 compliant and make them faster with AJAX by not loading all information immediately.

Repository management
We handle source code repositories with a handful of shell scripts. This leads to lots of code duplication, and configuring a new repository wastes time. Repositories should be managed by a PHP library or equivalent, and configuration should be similar to how we configure message groups. The configuration can be stored in files or even inside the wiki itself.
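As a rough illustration of what declarative repository configuration could look like, here is a minimal Python sketch. The field names (`id`, `type`, `url`, `branch`), the `update_command` helper, and the example repositories are all hypothetical; only the idea of replacing per-repository shell scripts with data comes from the item above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    """One repository entry; all field names are illustrative assumptions."""
    id: str
    type: str            # "git" or "svn"
    url: str             # would be used for the initial checkout, not shown here
    branch: str = "master"


def update_command(repo: Repository, checkout_dir: str) -> list[str]:
    """Build the command that refreshes an existing checkout of the repository."""
    if repo.type == "git":
        return ["git", "-C", checkout_dir, "pull", "--ff-only", "origin", repo.branch]
    if repo.type == "svn":
        return ["svn", "update", checkout_dir]
    raise ValueError(f"unsupported repository type: {repo.type}")


# Hypothetical configuration; it could equally be loaded from a file or a wiki page.
REPOSITORIES = [
    Repository(id="example-git", type="git", url="https://example.org/project.git"),
    Repository(id="example-svn", type="svn", url="https://example.org/svn/project/trunk"),
]

commands = {
    repo.id: update_command(repo, f"/srv/checkouts/{repo.id}")
    for repo in REPOSITORIES
}
```

Adding a repository then means adding one entry to the list rather than copying and editing a script.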

Fully automated exports
We have better things to do than run shell commands. To achieve this we need to increase the reliability of exports by adding more automatic checks that the output looks sane. We also need better repository management.
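One possible automatic check is sketched below in Python, under the assumption that an export is a flat JSON file mapping message keys to strings. The function name, the 10% threshold, and the specific checks are illustrative and do not describe an existing tool.

```python
import json


def check_export(old_text: str, new_text: str, max_loss: float = 0.10) -> list[str]:
    """Return the problems found in a new export; an empty list means it looks sane."""
    try:
        new = json.loads(new_text)
    except json.JSONDecodeError as error:
        return [f"export is not valid JSON: {error}"]
    if not isinstance(new, dict):
        return ["export is not a JSON object"]

    problems = []
    old = json.loads(old_text)  # the previous export is assumed to be valid

    # A sudden drop in the number of messages usually signals a broken export.
    lost = [key for key in old if key not in new]
    if old and len(lost) / len(old) > max_loss:
        problems.append(f"{len(lost)} of {len(old)} messages disappeared")

    for key, value in new.items():
        if not isinstance(value, str):
            problems.append(f"{key}: value is not a string")
        elif not value.strip():
            problems.append(f"{key}: translation is empty")
    return problems
```

An export job could refuse to commit whenever the returned list is non-empty and notify a human instead.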

Gadget localisation
Integrate jquery.i18n and Translate to provide localisation facilities for gadgets. We can start doing this even before Duke Nukem Forever Gadgets 2.0 happens. Would help with: Multilingual Commons.

jquery.i18n for MediaWiki
It makes no sense for us to maintain two i18n libraries written in JavaScript. Make jquery.i18n good enough and make MediaWiki use it.

Complete solution for frontend i18n
Collaborate with Globalize.js, cldr.js and other projects and ensure our jquery.i18n projects work well with them, with minimal overlap. A complete solution includes everything: time and number formatting, localisation file formats, message delivery, message formatting, input methods, web fonts, language selection.

Translation memory on the wheels
Let's bridge the gap between plain TM and MT a bit and do something more useful than edit distance. Try to get the ElasticSearch-as-TM project to fly and promote it. It could be *the* open source TM solution to plug into your software, given that there doesn't seem to be any competition at the moment. Use sentence alignment to increase the recall of the translation memory.
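For reference, the plain similarity lookup that this item wants to move beyond can be sketched in a few lines of Python. The sketch uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure (it is not true Levenshtein edit distance), and the memory contents, the `suggest` function, and the 0.7 threshold are invented for illustration.

```python
from difflib import SequenceMatcher


def suggest(source: str, memory: dict[str, str], threshold: float = 0.7) -> list[tuple[float, str, str]]:
    """Return (score, stored source, stored translation) for similar entries, best first."""
    matches = []
    for stored_source, translation in memory.items():
        # Character-level similarity between 0.0 and 1.0, ignoring case.
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if score >= threshold:
            matches.append((round(score, 2), stored_source, translation))
    return sorted(matches, reverse=True)


# Invented example memory (English to Finnish).
memory = {
    "Save page": "Tallenna sivu",
    "Delete page": "Poista sivu",
    "Show preview": "Esikatsele",
}
suggestions = suggest("Save the page", memory)
```

A lookup like this only finds strings that are spelled similarly as a whole; it cannot reuse a matching sentence inside a longer paragraph, which is where sentence alignment would help recall.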

MediaWiki localisation updates for all

 * See also Generic, efficient Localisation Update service, Extension:LocalisationUpdate/LUv2

Nobody updates MediaWiki regularly. LocalisationUpdate should be available and functional on all MediaWiki installations, without needing to set up cron jobs. See also next item.

LocalisationUpdate service provided by TWN
Provide an efficient API for any product to automatically update their translations live.

Release TWN TM data
We can either provide dumps in TMX or some other format or provide it as a service. The service could be registration-only to make sure we can handle the load.
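If dumps were provided in TMX, a minimal export could look roughly like the following Python sketch. It uses only the standard library. The element and attribute names are meant to follow TMX 1.4 and should be checked against the specification; the function name, the tool name in the header, and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree serialises this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(units: list[dict[str, str]], source_language: str = "en") -> str:
    """Serialise translation units (language code -> text) into a TMX document."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-exporter",   # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "block",
        "o-tmf": "unknown",
        "adminlang": "en",
        "srclang": source_language,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for unit in units:
        tu = ET.SubElement(body, "tu")
        for language, text in unit.items():
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: language})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


tmx_text = build_tmx([{"en": "Save page", "fi": "Tallenna sivu"}])
```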

Glossaries in TWN
There must be a lot of glossaries and terminologies out there. Some of them must be useful to integrate in translatewiki.net.

Glossary building tools
Provide technical support for building glossaries with the Translate extension and in translatewiki.net. These should directly integrate into the translation editor.

Multilingual SMW

 * See also Multilingual SemanticMediaWiki

Semantic MediaWiki is cool, but it would be even cooler if it was multilingual-capable out of the box. Integrate it with the Translate extension.

Multilingual search
The portal of each project, like wikipedia.org, should have a multilingual search, automatically returning results in the most relevant language; see also bug 1837. Such multilingual search could then be expanded to Special:Search of each wiki and triggered either automatically, or as fallback (see also above), or as another option/profile/whatever.

Wiki page translation advanced issues
Though mentioned in other points, people have specifically requested help for easy handling of multilingual categories and templates with Translate page translation. Would help with: Page translation migration/adoption tools, Multilingual Commons, MediaWiki multilingual documentation.

MediaWiki multilingual documentation
Most of the help pages and manuals are on Meta or rather on local Wikipedias: this leaves most small Wikipedias and other projects in "small languages" with no docs or outdated docs. Everything should be central, translatable, and easily and equally available from all wikis. Have a policy to make user-facing documentation translatable at mediawiki.org.

Page translation migration/adoption tools

 * See also Tools for mass migration of legacy translated wiki content

Translate needs a bigger user base to support our wikis as well as possible, but existing wikis don't adopt it because there is no support at all for importing old translations or handling templates, categories etc., and doing everything manually is not an option.

Multilingual Commons
Help Commons to become truly multilingual. Content, categories, templates, gadgets etc. should be translatable.

Deploy TranslateSVG
Let's not let this project rot away; we should get it into shape and deploy it.

Page content language
Provide a mechanism to set the page content language in MediaWiki core. This is a blocker to often requested features of Translate.

Big projects at translatewiki.net
Lately the number of active translators at translatewiki.net has not grown. We should try to get big projects like KDE to lure in more translators. Would help with: Glossaries in TWN, Promote i18n best practices.

Translate evangelisation
Alternatively, convince people like the FSF to adopt MediaWiki+Translate for the translation of their software with as few quirks as possible.

Better support for formal-informal variants
Either make these inline as well, or make it possible to mark messages which do not need overriding. See for example bug 52957.

Use Translate for translating subtitles
After we have conquered SVGs, why not do the same for subtitles of videos in Commons. Would already be trivially possible if TimedText used subpages. Would help with: Multilingual Commons.

Semantic Translate
Lots of information could be made available via semantic properties or other means. This would make it simple to build queries like "Top translators for MediaWiki in Finnish in 2013". Some examples:
 * How many/who have reviewed a translation
 * Who has translated a message
 * Language of the translation
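To make the kind of query mentioned in this item concrete, here is a small Python sketch over invented records. The record fields (`translator`, `project`, `language`, `year`) are hypothetical stand-ins for whatever semantic properties would actually be exposed, and the data is made up.

```python
from collections import Counter
from typing import NamedTuple


class Translation(NamedTuple):
    """One translation event; the fields are illustrative assumptions."""
    translator: str
    project: str
    language: str
    year: int


def top_translators(records, project, language, year, limit=3):
    """Count translations per translator for one project, language and year."""
    counts = Counter(
        record.translator
        for record in records
        if record.project == project and record.language == language and record.year == year
    )
    return counts.most_common(limit)


records = [
    Translation("Alice", "MediaWiki", "fi", 2013),
    Translation("Bob", "MediaWiki", "fi", 2013),
    Translation("Alice", "MediaWiki", "fi", 2013),
    Translation("Alice", "MediaWiki", "sv", 2013),
    Translation("Bob", "MediaWiki", "fi", 2012),
]
top = top_translators(records, "MediaWiki", "fi", 2013)
```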

Evaluate Translate features: spring cleaning
Translate has grown over the years. It's time to look critically at what we actually support, what parts of that we still want to support, what needs more modularization and what should be dropped to reduce maintenance costs.

Near-real time translation collaboration
When multiple people are translating the same group (say a translatable page), it would be easier if they could see each other's updates live, something akin to Etherpad. It doesn't need to support multiple people editing the same message at the same time. Even seeing what messages are open (and their content) would help. Credit: neverendingo.

By popular vote
Wiki editing and discussion don't work when many users disagree on what's best. A quick list of alternative translations (past, proposed, or used elsewhere) with relevant information would help find the consensus. See for instance fundraising messages, which in theory should gradually approach perfection but in reality only get changed back and forth continuously, with the worst translations usually surviving. The situation is made worse by: a) history not being easy to reach from the translation editor, b) discussion even less so, c) activity and discussion around the same translations in other messages being completely lost, d) people never finding where the message is (especially from other wikis, as with banners). See also bug 45831: Discussion about translation. Would help with: What happened to my translation?

What happened to my translation?
New translators in particular want easy feedback on their translations and on what happened to them. Watchlist (for accept log and modifications) and contributions (for a mere list) are not enough, especially for unlogged/separate actions like setting workflow state, pushing to CentralNotice, copying to another wiki, exporting to a VCS. Credit: Gloria_S.

Feed of exports from translatewiki.net
Provide a mechanism for people to see when we do exports. Could be RSS, Twitter, IRC, whatever makes sense. Perhaps also imports. The main benefit would be transparency about what and how much we do. Going further, we could send out notices to translators: "Your translations are now visible to users". Would help with: What happened to my translation?

There are new things to translate
Send out notices to translators when there is new stuff for them to translate.

Visual translation: Integration of page translation with Visual Editor
The wiki page translation feature of the Translate extension does not currently work with Visual Editor due to the special tags it uses. More specifically, this is about editing the source pages that are used as the source for translations, not the translation process itself. The work can be divided into three steps:
 * 1) Migrate the special tag handling to a more standard way to handle tags in the parser. This needs some changes to the PHP parser for it to be able to produce the desired output.
 * 2) Add support to Parsoid and Visual Editor so that editing page contents preserves the structures that page translation adds to keep track of the content.
 * 3) Add to Visual Editor some visual aid for marking the parts of the page that can be translated.

This can be a difficult project due to the complexities of wikitext parsing and because it spans several products: Translate, MediaWiki core parser, Parsoid, Visual Editor.

Extensive and robust localisation file format coverage
The Translate extension supports multiple file formats. The formats have been developed on an "as needed" basis, and many formats are not yet supported or the support is incomplete. Example known bugs are 31331, 36584, 38479, 40712, 31300, 57964, 49412. In this project the aim would be to make existing file formats (for example Android XML) more robust so that they meet the following properties:
 * the code does not crash on unexpected input,
 * there is a validator for the file format,
 * the code can handle the full file format specification,
 * the code is secure (does not execute any code in the files nor have known exploits).
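As a sketch of the first two properties (not crashing on unexpected input, and having a validator), the following Python function checks an Android-style strings file. It assumes the common layout of a `<resources>` root with `<string name="...">` children; the function name and the particular checks are illustrative and are not taken from the Translate extension.

```python
import xml.etree.ElementTree as ET


def validate_android_strings(text: str) -> list[str]:
    """Return a list of problems; malformed XML is reported instead of raised."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]
    if root.tag != "resources":
        return [f"unexpected root element <{root.tag}>"]

    problems = []
    seen = set()
    for element in root:
        if element.tag != "string":
            continue  # other resource types are ignored in this sketch
        name = element.get("name")
        if not name:
            problems.append("<string> without a name attribute")
        elif name in seen:
            problems.append(f"duplicate string name: {name}")
        else:
            seen.add(name)
    return problems
```

Regarding the security property: the Python documentation warns that the standard library XML modules are not hardened against maliciously constructed data, so a real validator for untrusted uploads would likely want a hardened parser (for example the third-party defusedxml package) in place of `xml.etree.ElementTree`.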

In addition new file formats can be implemented: in particular Apache Cocoon and AndroidXml string arrays have interest and patches to work on, but we'd also like TMX, for example. Adding new formats is a good chance to learn how to write parsers and generators with simple data but complicated file formats. For some formats, it might be possible to take advantage of existing PHP libraries for parsing and file generation. (More example formats other platforms support: OpenOffice.org SDF/GSI, Desktop, Joomla INI, Magento CSV, Maker Interchange Format (MIF), .plist, Qt Linguist (TS), Subtitle formats, Windows .rc, Windows resource (.resx), HTML/XHTML, Mac OS X strings, WordFast TXT, ical.)

This project paves the way for future improvements, like automatic file format detection, support for more software projects and extension of the ability to add files for translation by normal users via a web interface.

Localised Wikimedia shop
The Wikimedia Shop should be translatable. At least the interface, but why not the product descriptions as well.

Translator hub

 * Originally proposed as: Translate Roll

The number of wikis using the Translate extension has increased significantly. At translatewiki.net, in some rare cases people run out of things to translate. It would be beneficial to have some kind of central place to see translation status across the Translate universe. It would facilitate cross-project collaboration and raise awareness of different wikis having different kinds of content to translate.

Various ideas have been floated for implementation, from one special page just listing overall translation coverage in each wiki for a given language, to a "blog roll" type of links across wikis as well as single sign-on systems to ease moving between wikis.

Wikidata translation
The Translate extension could help people translate properties and other content at Wikidata in the following ways:
 * Easy access to content which still needs translation
 * Statistics about translation coverage
 * Providing a familiar interface for translators

Complex message parameters
Currently, using things like external links in messages is cumbersome. I propose we make a generic framework which allows writing translations, parts of which are embedded inside the replacements themselves, without resorting to lego or complicated markup in the messages. See bug 31032.
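One way such a framework could behave is sketched below in Python. The translator writes the link text inside the message, and the software supplies only the wrapper (here, a link target). The `<terms>` span syntax and the `render` and `link` helpers are hypothetical and are not an existing MediaWiki feature; the Finnish string is only an illustrative translation.

```python
import html
import re

# Matches spans like <name>text</name>; the syntax is an assumption of this sketch.
SPAN = re.compile(r"<(\w+)>(.*?)</\1>")


def render(message: str, wrappers: dict) -> str:
    """Escape the message as HTML and pass each named span through its wrapper."""
    parts = []
    position = 0
    for match in SPAN.finditer(message):
        parts.append(html.escape(message[position:match.start()]))
        name, inner = match.group(1), match.group(2)
        wrapper = wrappers.get(name)
        if wrapper is None:
            # Unknown span names are shown literally rather than trusted as markup.
            parts.append(html.escape(match.group(0)))
        else:
            parts.append(wrapper(html.escape(inner)))
        position = match.end()
    parts.append(html.escape(message[position:]))
    return "".join(parts)


def link(url: str):
    """Return a wrapper that turns already-escaped text into a link to `url`."""
    return lambda text: f'<a href="{html.escape(url)}">{text}</a>'


english = "Please read the <terms>terms of use</terms> before saving."
finnish = "Lue <terms>käyttöehdot</terms> ennen tallentamista."
wrappers = {"terms": link("https://example.org/terms")}
rendered = render(english, wrappers)
```

The translator sees the whole sentence including the link text and can reorder it freely, while the URL never appears in the message.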

Language selection for anonymous users
Users of Wikimedia wikis currently need to register to change the interface language. Anonymous users should be able to change it as well.