Universal Language Selector

'''This document is a work in progress. Comments are appreciated but this is not a final draft.'''

This document describes the design of a language selector widget. This document is intended to be a starting point for conversations about how to best handle language selection on Wikimedia sites.

IMPORTANT: This feature does not differentiate between selection of language at specific scales or contexts. That is, a user may choose to have Spanish as their "chrome" language, and then use the selector to view an article written in German, or maybe change their keyboard mapping to Russian, etc.

Rationale
Wikimedia properties exist in hundreds of languages. Simple selectors do not scale beyond ten or twenty items. Wikipedia is offered in 284 languages, and there are 424 available translations for MediaWiki.

Identified language selection contexts include but are not limited to:
 * Site-wide language selection (e.g., user preferences) for the site "chrome"
 * Editing language selection, especially for input modification (e.g., Narayam)
 * Interwiki link selection
 * Portal page language selection (such as on the Wikipedia home page).

We expect that making language selection easier and more intuitive:
 * Will allow for greater penetration into all languages
 * Will enable the sites to be more culturally neutral
 * Will promote cross-wiki collaboration

Feature Requirements


A detailed analysis on user profiles, requirements and ideas for the language selector have been collected in the interaction design framework. A summary is provided below.

User profiles

 * Nambi is a nurse from Paraguay that speaks Guaraní and Spanish. She contributes content in Guaraní and consumes content in both languages. She is a registered user and makes basic use of the existing language tools. Her goal is to contribute to the Guaraní community:
 * Contribute local content. Nambi realizes that there is not a Guaraní version of an article She is reading in the Spanish Wikipedia. She creates a new article based on a translation of the Spanish one. She keeps jumping from one version to the other during the translation process.
 * Features: Support recurrent language change among a small set, access to local-spoken languages, and indication of lack of language support.
 * Setting the UI. Nambi prefers the user interface menus to be in Guaraní. She becomes aware that registered users can customize their UI language and changes this.
 * Features: Discoverability of UI language settings, and distinction between UI and Content language settings.


 * Rakha is a musician from India, currently in Belgium. He speaks Tamil (native) and English (very basic). He searches and consumes content from the public PC at the hotel. He is not a registered user and he is not familiar with language tools. His goal is to feel like at home (despite being abroad):
 * Search in his language. Rakha accesses the Wikipedia in Tamil but the public PC at the hotel has a French keyboard. Rakha finds guidance on how to introduce a few Tamil characters for the search. When he finds little information on the subject, he switches to the English version for a moment.
 * Features: Discoverability of input method configuration, aids for accessing input method configuration during text input, terminology used to communicate input method configuration, and discoverability of relevant content in other languages.
 * Cross-language information. Rakha has accessed the Tamil version of Wikipedia, but he is looking for information on a Belgium dish he only saw written in French. He searches for the dish name in French, but he is interested to access the information in Tamil.
 * Features: Disable/re-enable input methods, and cross-language access of search results.

All users are important, but in order to achieve a focused design user profiles have been prioritized. Nambi and Rakha have been considered as primary users. The main selected users represent users that make use of language selection without a deep previous experience. The rationale is that optimizing interaction for the primary users will suffice either advanced users (since they are technically skilled) and non-technical users (which do not have a high demand for language tools).

Design goals
The following goals have been defined:


 * Discoverability. The selector and their features are easy to find.
 * Fluency. The solution should interfer minimally with the main flow of tasks for the user. Changing settings related to language should be fast and not distracting.
 * Learnability. The way in which the selector is operated and the effects of each action should be easy to anticipate.

Other general requirements to consider for the design of the selector are the following:


 * The language selector should be built as a "widget" that can be called from multiple contexts (e.g., can be repurposed)
 * The language selector should attempt to aid the user as much as possible
 * The language selector should be language agnostic

An initial plan for validation of the design solutions has been defined in the attached document.

Proposed designs
The solutions proposed will explore different design directions:
 * Helpful exploration. Define a selector that exposes the available options but provides as much help as it is possible to re-arrange information easily.
 * Minimal contextual selector. Define a selector that is almost invisible, but offers help where it is needed. By taking advantage of contextual information, it provides the most likely options.

In addition to these, the original proposal for the selector has been also included.

Helpful Exploration
This version of the selector is composed by three elements: the trigger, the list of languages and language-related settings.

In addition, some aids are provided in specific contexts such as the selection of multiple languages and when the user is editing.

The trigger
The trigger is represented by a short list of languages the user may be interested in. Languages are presented in their own script, and languages using the same (or similar) scripts are placed together to allow consistent alphabetical ordering. If the number of possible languages is greater than eight, the trigger will show six in the list plus a link indicating that there are more languages available (indicating the number of languages). This acts as a link that activates the language list for users to select a language.

The criteria to select these six languages is the following:
 * Languages defined in browser settings.
 * Contributed and consumed languages by the user (if this information is available).
 * Language defined in the user settings (if this information is available).
 * Languages spoken in the current geographic area.
 * Previously selected languages.
 * English as a global default.

Likely languages which are not available in a specific context are also included in the list but in a way that indicate they are lacking (greyed-out or using red links if it is possible to trigger the creation of the lacking content). These languages are included since users may be interested in both accessing content and finding whether content is available in their favorite languages.

The trigger may include an indication of the current language that is also used for activating the language list.

The language list
When the selector is activated, the language list is shown. The language list provides several mechanisms to reduce the complexity of dealing with a long list of languages.

Search based filtering. The selector allows users to search for a specific language. The number of languages shown in the list will be reduced as the user types his search criteria.
 * Languages can be found by typing either their name (in any language), their ISO code or the country where they are spoken. Due technical limitations, language searches can be limited to language names in their own language and few major languages.
 * When the selector is open, the search box will get the input focus. Focus should not be automatically gained for users accessing form a tablet or mobile device, since it can trigger their virtual keyboard.
 * When searching, an aid for clearing the search will be shown (e.g., an X at the right of the search box).
 * When searching, the first of the results will be strongly highlighted to indicate that can be selected by hitting the enter key. This allows users to complete keyboard-only language selection.
 * When searching, the geographic filter is only applied if there are enough results to fill a complete page. When few or no results are provided for the current text entry, the geographic selector includes all regions to avoid results to be hidden by mistake (and indicates this graphically).

Geographic filtering. A World map divided vertically in three regions (with labels) is used to reduce the number of languages to the ones spoken in a specific region.
 * By default the filter is activated to show only the languages spoken in the current region the user is located.
 * Although the map represents broad regions, the list of languages can be ordered differently depending on the user location. For example, a user accessing from Africa will be provided African languages on top of the list.
 * Languages spoken in multiple regions such as English or Spanish will be shown for all the regions where they are spoken.

The fact that the World map is divided in broad regions with the current user location provides a simple way to reduce the language list to one third without requiring the user much geography literacy.

List of languages. Languages are shown in a multi-column grid.
 * Languages are organized in 5 columns with 10-items per column and two rows of items initially visible (to invite for scrolling if needed).
 * Since vertical scrolling may be needed, languages are grouped in clusters in a way that communicate the ordering of scanning to follow alphabetical order.
 * A category is defined for each geographic sub-region (e.g.: Europe, Africa...).
 * The first region to be placed on top depends on the user current location.
 * Any language that is likely to be selected by the user (according to the same criteria followed with the trigger) will be highlighted in the list.


 * Languages with the same type of script will be displayed together. This allows the user to focus on the region of the list where more familiar text appears (see the image above).
 * Scripts with the greater number of languages will be placed first.
 * Script regions should avoid fragmentation and have a clear starting point.
 * Two scripts cannot be in the same cluster if they span more than one cluster.
 * If a set of languages spans multiple rows it should start at the first column.
 * When scrolling is needed, a link placed at the bottom right indicates that more languages are available. By clicking on the link, the list will be smoothly scrolled to the next block of results (the next two rows which were not visible just below fold).

Language settings
The selector includes links to two kinds of language-related settings:
 * Input settings. The user can select a different language for text input and enable specific input methods for this language. The options provided are the following:
 * Language for input. The current language used for input is shown with a list of likely languages to facilitate the selection. Although the initial criteria is the same used for the trigger, in this context previous selected input languages are prioritized.
 * Input methods. For each language a set of input methods are provided grouped by types. For each type a brief description (and a help link) are provided. When several options are available, the appropriate control will be shown for selection. Conversely, when no options exist, widgets for selection must not be shown. For example, a language for which no methods are provided, will not show this section at all.
 * Display settings. The user can change the language of the user interface and made adjustments related to fonts settings.

Both setting views involve a

Arun Ganesh's Original design
At the 2011 Mumbai Hackathon, a small team began discussions addressing a major problem facing Wikimedia sites: language selection. The team put their heads together and came up with several ideas. The basic workflow was captured by Arun Ganesh; he put together a pdf file.

Process
Once the selector is shown, the user may go about selecting the language of choice.

Note that all languages are to be displayed in their native spelling and with native fonts. Additionally, ISO codes should be displayed next to them.

The selector should begin with two panes, one on top of each other.

Top Pane
The first pane shall include a flattened map of the world divided into basic "regions" (not nation borders). The system should auto-detect the region that the user is located in (either via geoip lookup or GPS coordinate lookup in the case of mobile phones) and have that region auto-selected (and displayed in the lower pane).

Additionally, there will be an ordered list of the world's most common languages displayed (for quick selection).

The map is hot - selecting any region will reload the bottom pane with that region's languages.

Bottom Pane
This pane includes a language search function as well as two disparate lists of languages, in sectioned columnar format.

The first section shall show between 5 and 10 of the most common languages in the selected region, ordered by number of speakers.

The second section will again list out languages used in the region. However, this list will include as many as can be included. This list is ordered (alphabetically? by iso code?). Simple pagination controls (next/previous) will allow the user to browse the full list if it does not display within the confines of the dialog.

Specification Callouts
''This text is copied and pasted directly from the PDF file. Formatting has been wikified slightly.''

The widget has 4 parts: 1. Widget shortcut:
 * a. Permanent link displaying selected language alongwith icons to indicate it is language related.
 * b. Seperate icon to directly open keyboard input or on-screen keyboard widget

2. Primary selector:
 * a. Static world map with marker showing user’s location guessed from geolocation user timezone or system language options. Clicking map changes the data in the secondary selector (3)
 * b. Static list of primary gloabal languages or biggest wikipedia projects. This enables quickly selecting a major language which has a high probability of usage.

3. Secondary selector:
 * a. Text box allowing a user to search language or countries. Autocomplete list
 * b. Country flag icons based on the region selected in the primary selector map or guessed user location. User can select his country flag if he identifies it
 * c. Regional map of the user’s area of interest. No markers needed, but clicking can change the ordering of languages
 * d. One column listing primary languages of the region or country chosen
 * e. Two columns with a comprehensive listing of all the minor languages. 3 text sizes to indicate the number of speakers of the language in the region.

4. Keyboard Input:
 * a. Options for keyboard mapping. Selecting a language would automatically load the available options and switch to the recommended input method for that language. Disable option switches to system settings
 * b. Keyboard mapping reference and on-screen keyboard. Dockable widget which can be dragged and detached from the language widget
 * c. Dropdown font selector to switch webfonts
 * d. Link to typing help or reference material

The universal language selector will allow a user friendly method for a user to switch to his native language from a foreign language interface with the help of visual aids.

The language selector widget can be activated by a link placed as close to the top-left or top-right of the page as possible. This is usually the first place a person looks for customization options in a foreign interface. Once activated, the widget expands to display the language selection options.

Related bug reports

 * 33677 Enable setting language preference without requiring login.
 * 28900 Add a keyboard layout display option to Narayam.
 * 33460 WebFonts "Select font" portlet menu should only be shown if there are options to choose from
 * 30595 Inconsistent search results with language links.
 * 28970 There's no clean way to indicate the directionality of the content of the page.