User:TJones (WMF)/Notes/Multilingual UI UX Concerns

Below are some notes from a discussion among some language nerds about what languages should be included when doing UI/UX testing in general, but especially for wikis, MediaWiki, and related projects.

This list is far from complete, but hopefully it is useful and touches on some of the big gotchas that we've encountered. Feel free to add to it, or move this page to a more appropriate location.

Multilingual UI/UX Concerns

 * Languages to think about testing
 * English—it's probably relevant and it's often easy
 * RTL language (Hebrew (wiki) and Arabic (wiki) are the obvious big ones)—RTL always gets screwed up somewhere
 * See RTL:WTF for examples and explanations of how RTL text and especially mixed-direction text can be broken
 * Languages with long strings without spaces
 * Chinese (wiki) usually wraps okay on the web, but maybe your interface doesn't
 * Thai (wiki) uses spaces kinda like commas (between phrases) and also usually wraps okay
 * German has the occasional very long word—the longest word on the main page of dewiki when this list was first assembled was Gewerkschafts­dachverbands
 * Greenlandic is an Inuit language with a Wikipedia (the section with shorter words on the page is in Danish)—the longest word on the page is Uumassuseqartulerineq... but there are few short words in the Greenlandic section.
 * Tamil (wiki) has long words, and it seems that they don't automatically break within the word (just as Latin-script words don't)
 * Languages with complex scripts
 * Khmer (wiki) and Myanmar/Burmese (wiki) are complex scripts, which can be rendered poorly (if you see dotted circles ◌, something is definitely broken). They can also be very tall, and need more line height than other scripts.
 * "Verbose" languages
 * Some languages take significantly more characters to say the same thing. You can compare translated UI strings in MediaWiki core (github, browsable) (gerrit, check-out-able) to get a sense of languages that may be more verbose. See also the translatewiki "tip" below.
 * Multi-script languages
 * LanguageConverter is an on-wiki tool that allows you to re-map the text between scripts on the page. The link has a list. Some are easy and 100% predictable (e.g., Serbian (wiki)), some are messy and hard (Chinese (wiki) and Crimean Tatar (wiki)). Does your interface support multi-script switching?
 * Lots of issues listed under Internationalization and localization on the Software Testing page on-wiki
 * Current location of the source article from Microsoft
 * Tips and tricks
 * You can use the uselang URL parameter to change the language of the page display, usually by adding ?uselang= plus a language code to the end of the URL (&uselang= plus the code if there are already other parameters)
 * uselang=en will render the UI elements in English, so you can navigate more easily. Example.
 * uselang=qqx will render the UI elements with their internal names, so you can look them up in MediaWiki core more easily. Example. More info: qqx trick
 * At translatewiki, you can look up all the existing translations for a string, using this internal name. Short string example. Long (and very variable) example.
 * The Site Matrix lists all the wikis we have and what language they are in. You can use it to see what's out there, and to figure out the code for languages (they are usually standard, but there are exceptions. More info: Special language codes)
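
The uselang tip above is easy to get wrong by hand when a URL already has a query string, so here is a minimal Python sketch of a helper that sets the parameter either way. The function name with_uselang and the sample URLs are made up for illustration; only the uselang parameter itself and the en and qqx values come from the notes above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_uselang(url: str, lang: str) -> str:
    """Return url with its uselang query parameter set to lang.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    parts = urlsplit(url)
    # Keep the other query parameters in order, dropping any old uselang value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", lang))
    # urlunsplit inserts "?" itself; urlencode joins the pairs with "&".
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # No existing parameters: the result ends in ?uselang=en
    print(with_uselang("https://th.wikipedia.org/wiki/Special:Search", "en"))
    # Existing parameters: uselang=qqx is appended with &
    print(with_uselang("https://th.wikipedia.org/w/index.php?title=Foo&action=history", "qqx"))
```

Note that urlencode percent-encodes reserved characters in parameter values (a colon in a title parameter, for instance), which still resolves to the same page but looks different from a hand-written URL.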

Other Resources

 * Reading/QA/Sample articles: Articles with interesting properties that could be worth reviewing.