Localisation/pt


 * For the Wikimedia Foundation localisation team, see .
 * For translating pages on this wiki, see .

This page gives a technical description of MediaWiki's internationalisation and localisation (i18n and L10n) system, along with hints that coders should be aware of. Our mantra is that i18n must not be an afterthought: it is an essential component from the earliest phases of your software, just as it is a core part of MediaWiki.

translatewiki.net
translatewiki.net supports in-wiki translation of all the messages of core, extensions and skins. If you would like to have nothing to do with all the technicalities in this page about editing files, Git, creating patches, and so forth, go directly to translatewiki.net.

All translation of MediaWiki user interface messages should go through translatewiki.net and should not be committed directly to the code. Only the English messages and their initial documentation must be written in the source code.

Core MediaWiki and extensions must use system messages for any text displayed in the user interface. For an example of how to do this, please see. If the extension is well written, it will probably be included in translatewiki.net in a few days, after its staff notices it on. If it is not noticed, contact them. If it is too unstable to be translated, say so in the code or commit message and contact them if necessary.

See also Overview of the localisation system and What can be localised.

Finding messages
explains how to find a particular string you want to translate. In particular, note the, which was introduced in.

i18n mailing list
You can subscribe to the i18n list. At the moment it is low-traffic.

Code structure
First, you have a Language object in. This object contains all the localisable message strings, as well as other important language-specific settings and custom behaviour (uppercasing, lowercasing, printing dates, formatting numbers, etc.).

The object is constructed from two sources: sub-classed versions of itself (classes) and Message files (messages).

There's also the MessageCache class, which handles input of text via the MediaWiki namespace. Most internationalisation is nowadays done via Message objects and by using the shortcut function, which is defined in. Legacy code might still be using the old functions, which are now considered deprecated in favour of the above-mentioned Message objects.

General use (for developers)
See also.

Language objects
There are two ways to get a language object. You can use the globals and  for user interface and content language respectively. For an arbitrary language you can construct an object by using, by replacing   with the code of the language. You can also use  if   could already be a language object. The list of codes is in.

Language objects are needed for doing language-specific functions, most often to do number, time and date formatting, but also to construct lists and other things. There are multiple layers of caching and merging with, but the details are irrelevant in normal use.

Using messages
MediaWiki uses a central repository of messages which are referenced by keys in the code. This is different from, for example,, which just extracts the translatable strings from the source files. The key-based system makes some things easier, like refining the original texts and tracking changes to messages. The drawback is of course that the list of used messages and the list of source texts for those keys can get out of sync. In practice this isn't a big problem, and the only significant problem is that sometimes extra messages that are not used anymore still stay up for translation.

To make message keys more manageable and easy to find, also with grep, always write them completely and don't rely too much on creating them dynamically. You may concatenate parts of message keys if you feel that it gives your code better structure, but put a comment nearby with a list of the possible resulting keys. For example:
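As a minimal sketch of this advice (written in Python for illustration rather than in MediaWiki's own PHP; the key names and the `message_key_for` helper are hypothetical), a dynamically built key can carry a comment that lists every key it may produce:

```python
# Hypothetical message store; in MediaWiki these texts would live in the message files.
MESSAGES = {
    "myextension-status-open": "The request is open.",
    "myextension-status-closed": "The request is closed.",
}


def message_key_for(status: str) -> str:
    """Build the message key for a request status."""
    # Possible resulting keys (listed in full so that grep can find them):
    # - myextension-status-open
    # - myextension-status-closed
    return "myextension-status-" + status


print(MESSAGES[message_key_for("open")])
```

Searching the code base for `myextension-status-closed` then finds the comment, even though the key never appears as a literal string in the code.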

The detailed use of message functions in PHP and JavaScript is on.

Adding new messages
See also:

Choosing the message key
See also:

The message key must be globally unique. This includes core MediaWiki and all the extensions and skins.

Stick to lower case letters, numbers and dashes in message names; most other characters are either impractical or do not work at all. By MediaWiki convention, the first character is case-insensitive and the remaining characters are case-sensitive.

Please follow global or local conventions for naming. For extensions, use a standard prefix, preferably the extension name in lower case, followed by a hyphen ("-"). Exceptions are:


 * Messages used by the API. These must begin with,  ,  . After this prefix put the extension prefix.
 * Log-related messages. These must begin with,  ,.
 * User rights. The key for the name of the right as displayed on Special:ListGroupRights must begin with . The name of the action that completes the sentence "" must begin with
 * Revision tags must begin with.
 * Special page titles must begin with.
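The general naming advice can be checked mechanically. The following Python sketch is an illustration only, not a MediaWiki tool: the pattern and the `is_conventional_key` helper are assumptions derived from the text above, and the exceptions listed (API, log, rights, tag and special page messages) are deliberately not handled.

```python
import re

# Assumed pattern: groups of lower-case letters and digits, separated by single dashes.
KEY_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_conventional_key(key: str, prefix: str) -> bool:
    """Return True if key uses only the recommended characters and starts with prefix + '-'."""
    return KEY_PATTERN.fullmatch(key) is not None and key.startswith(prefix + "-")


print(is_conventional_key("myextension-save-button", "myextension"))
print(is_conventional_key("MyExtension_Save", "myextension"))
```

A check like this could run in a test suite so that badly named keys are caught before they reach translators.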

Other things to note when creating messages

 * 1) Make sure that you are using suitable handling for the message (parsing,  -replacement, escaping for HTML, etc.)
 * 2) If your message is part of core, it should usually be added to , although some components, such as Installer and ApiHelp have their own message files.
 * 3) If your message is in an extension, add it to the   file or the   file in the appropriate subdirectory. If an extension has a lot of messages, you may create subdirectories under  . All the message directories, including the default , must be listed in the   section in   or in the  variable.
 * 4) Take a pause and consider the wording of the message. Is it as clear as possible? Can it be misunderstood? Ask for comments from other developers or localisers if possible. Follow the Internationalisation hints.
 * 5) Add documentation to   in the same directory. Read more about message documentation.

Messages that should not be translated

 * 1) Ignored messages are those which should exist only in the English messages file. They are messages that should not need translation, because they reference only other messages or language-neutral features, e.g. a message of " ".
 * 2) Optional messages may be translated only if changed in the target language.

To flag such messages:


 * (optionally) use the template in the  message documentation, that is respectively
 * or
 * (required) tell the extension used on  what to do with the messages by submitting a patch listing them as appropriate (see also ):
 * for core, in add the message keys
 * under  or
 * under ;
 * for extensions, in add a line under the extension's name like
 * or
 * or

Removing existing messages
Remove it from  and. Don't bother with other languages. Updates from will handle those automatically.

Changing existing messages

 * 1) Consider updating the message documentation (see Adding new messages).
 * 2) Change the message key if old translations are not suitable for the new meaning. This also includes changes in message handling (parsing, escaping, parameters, etc.). Improving the phrasing of a message without technical changes is usually not a reason for changing a key. At translatewiki.net, the translations will be marked as outdated so that they can be targeted by translators. Changing a message key does not require talking to the i18n team or filing a support request. However, if you have special circumstances or questions, ask in #mediawiki-i18n (irc://irc.freenode.org/mediawiki-i18n) or on the support page at translatewiki.net.
 * 3) If the extension is supported by translatewiki.net, please only change the English source message and/or key, and the accompanying entry in  . If needed, the translatewiki.net team will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you are only changing things like HTML tags, which you could change in other languages without speaking those languages. Most of these actions will take place on translatewiki.net and will reach Git with about one day of delay.

Localising namespaces and special page aliases
Namespaces and special page names (e.g. "RecentChanges" in "Special:RecentChanges") are also translatable.

Namespaces


Translating namespace names is currently disabled on translatewiki.net, so you need to do this yourself in Gerrit, or file a task asking someone else to do it.

To allow custom namespaces introduced by your extension to be translated, create a  file that looks like this:

Then load the namespace translation file in  via

Now, when a user installs MyExtension on their Finnish (fi) wiki, the custom namespace will be translated into Finnish magically, and the user doesn't need to do a thing!

Also remember to register your extension's namespace(s) on the page.

Special page aliases
See for up-to-date information. The following does not appear to be valid.

Create a new file for the special page aliases in this format:

Then load it in the extension's setup file like this:

When your special page code uses either  or   (in the class that provides Special:MyExtension), the localised alias will be used, if it's available.

Message parameters
Some messages take parameters. They are represented by $1, $2, $3, … in the (static) message texts, and replaced at run time. Typical parameter values are numbers (the "3" in "Delete 3 versions?"), user names (the "Bob" in "Page last edited by Bob"), page names, links and so on, or sometimes other messages. They can be of arbitrary complexity.
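To make the run-time replacement concrete, here is a minimal Python sketch of numbered-placeholder substitution. It is not MediaWiki's implementation; the `substitute` helper is hypothetical, and it assumes placeholders of the form `$1`, `$2`, and so on.

```python
import re


def substitute(text: str, *params: object) -> str:
    """Replace $1, $2, ... in text with the corresponding parameter (1-based)."""
    def repl(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched if no value was supplied for it.
        if 0 <= index < len(params):
            return str(params[index])
        return match.group(0)

    # A single pass, so a "$1" inside a parameter value is never substituted again.
    return re.sub(r"\$(\d+)", repl, text)


print(substitute("Delete $1 versions?", 3))
print(substitute("Page last edited by $1", "Bob"))
```

Because placeholders are numbered rather than positional, a translation is free to use them in a different order from the English source.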

The list of parameters defined for each specific message is kept in the special file "qqq.json", located in the "languages/" folder of MediaWiki; read more in Message documentation.

It's preferable to use whole words with the PLURAL, GENDER, and GRAMMAR magic words. For example,  is better than. It makes searching easier.

Switches in messages…

 * See also .

Parameter values sometimes influence the exact wording or the grammatical variations of a message. We don't resort to ugly constructs like "$1 (sub)page(s) of his/her userpage", because these are poor for users and we can do better. Instead, we make switches that are parsed according to values that will be known at run time. The static message text then supplies each of the possible choices in a list, preceded by the name of the switch, and a reference to the value that makes a difference.

This resembles the way are called in MediaWiki. Several types of switches are available. These only work if you do full parsing, or -transformation, for the messages.

…on numbers via PLURAL

 * See also .

MediaWiki supports plurals, which makes for a nicer-looking product. For example:

If there is an explicit plural form to be given for a specific number, it is possible with the following syntax
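The following Python sketch illustrates how such a switch can be resolved. It is a simplification under stated assumptions, not MediaWiki's parser: it only applies English rules (first form for exactly 1, last form otherwise), only looks at `$1`, assumes explicit forms are written as `number=text`, and assumes at least one ordinary form is present. The `expand_plural_en` helper is hypothetical.

```python
import re


def expand_plural_en(text: str, count: int) -> str:
    """Expand {{PLURAL:$1|...}} using English rules, then replace $1 with count."""
    def repl(match):
        explicit, regular = {}, []
        for form in match.group(1).split("|"):
            number, sep, rest = form.partition("=")
            if sep and number.isdigit():
                explicit[int(number)] = rest  # explicit form such as 12=a dozen eggs
            else:
                regular.append(form)  # ordinary singular or plural form
        if count in explicit:
            return explicit[count]
        # English: the first form for exactly 1, the last form otherwise.
        return regular[0] if count == 1 else regular[-1]

    expanded = re.sub(r"\{\{PLURAL:\$1\|([^{}]*)\}\}", repl, text)
    return expanded.replace("$1", str(count))


message = "Box has {{PLURAL:$1|one egg|$1 eggs|12=a dozen eggs}}."
for n in (1, 3, 12):
    print(expand_plural_en(message, n))
```

Other languages need more than two ordinary forms and different selection rules, which is why the choice is left to the translation rather than hard-coded by the caller.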

Be aware of PLURAL use on all numbers

 * See also: Plural

When a number has to be inserted into a message text, be aware that some languages will have to use PLURAL on it even if it is always larger than 1. The reason is that PLURAL in languages other than English can make very different and complex distinctions, comparable to English 1st, 2nd, 3rd, 4th, … 11th, 12th, 13th, … 21st, 22nd, 23rd, … etc.

Do not try to supply three different messages for cases like "no items counted", "one item counted", "more items counted". Rather, let one message take them all, and leave it to translators and PLURAL to properly treat any possible differences of presentation for them in their respective languages.

Always include the number as a parameter if possible. Always add PLURAL syntax to the source messages if possible, even if it makes no sense in English. The syntax guides translators.

Fractional numbers are supported, but the plural rules may not be complete.

Pass the number of list items as parameters to messages talking about lists
Don't assume that there are only singular and plural forms. Many languages have more than two forms, which depend on the actual number, and their grammar has to vary with the number of list items when describing a list that is visible to readers. Thus, whenever your code computes a list, include the item count as a parameter to headlines, lead-ins, footers and other messages about the list, even if the count is not used in English. There is a neutral way to talk about invisible lists, so you can have links to lists on extra pages without having to count items in advance.

…on user names via GENDER

 * See also .

If you refer to a user in a message, pass the user name as parameter to the message and add a mention in the message documentation that gender is supported. If it is likely that GENDER will be used in translations for languages with gender inflections, add it explicitly in the English language source message.

If you directly address the currently logged-in user, leave the user name as parameter empty:

If you include the user name into the message (e.g. ""), consider passing it through  first, to ensure that characters like   or   are not interpreted.
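As an illustration of a gender switch, the Python sketch below picks one of up to three forms based on a user's stored preference. It is not MediaWiki's implementation: the `USER_GENDER` table and `expand_gender` helper are hypothetical, the form order (masculine, feminine, neutral) and the fallback to the first form are assumptions of the sketch, and the empty-parameter case for the current user is not handled.

```python
import re

# Hypothetical lookup of each user's gender preference.
USER_GENDER = {"Alice": "female", "Bob": "male"}


def expand_gender(text: str, username: str) -> str:
    """Expand {{GENDER:$1|masculine|feminine|neutral}}, then replace $1 with the name."""
    def repl(match):
        forms = match.group(1).split("|")
        gender = USER_GENDER.get(username, "unknown")
        index = {"male": 0, "female": 1}.get(gender, 2)
        # Assumed fallback: use the first form when the wanted one is not supplied.
        return forms[index] if index < len(forms) else forms[0]

    expanded = re.sub(r"\{\{GENDER:\$1\|([^{}]*)\}\}", repl, text)
    return expanded.replace("$1", username)


message = "$1 changed {{GENDER:$1|his|her|their}} settings."
for name in ("Alice", "Bob", "Carol"):
    print(expand_gender(message, name))
```

The caller only passes the user name; whether and how gender affects the wording is decided entirely inside each translation.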

Users have grammatical genders

 * See also Gender

When a message talks about a user, or relates to a user, or addresses a user directly, the user name should be passed to the message as a parameter. Thus languages having to, or wanting to, use proper gender dependent grammar, can do so. This should be done even when the user name is not intended to appear in the message, such as in "inform the user on his/her talk page", which is better made "inform the user on talk page" in English as well.

This does not mean that you are encouraged to "sexualise" messages' language: please use gender-neutral language whenever this can be done with clarity and precision.

…on use context inside sentences via GRAMMAR

 * See also .

Grammatical transformations for agglutinative languages are also available, for example for Finnish, where it was an absolute necessity to make language files site-independent, i.e. to remove the Wikipedia references. In Finnish, "about Wikipedia" becomes "Tietoja Wikipediasta" and "you can upload it to Wikipedia" becomes "Voit tallentaa tiedoston Wikipediaan". Suffixes are added depending on how the word is used, plus minor modifications to the base. There is a long list of exceptions, but since only a few words needed to be translated, such as the site name, we didn't need to include it.

MediaWiki has grammatical transformation functions for over 20 languages. Some of these are just dictionaries for Wikimedia site names, but others have simple algorithms which will fail for all but the most common cases.
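The dictionary-style approach can be sketched as follows. This Python example is illustrative only: the `FINNISH_FORMS` table and `grammar_fi` helper are hypothetical, and the case names are the usual grammatical labels for the two Finnish forms quoted above.

```python
# Hypothetical dictionary of inflected site names, keyed by (word, grammatical case).
FINNISH_FORMS = {
    ("Wikipedia", "elative"): "Wikipediasta",  # "about/from Wikipedia"
    ("Wikipedia", "illative"): "Wikipediaan",  # "to/into Wikipedia"
}


def grammar_fi(case: str, word: str) -> str:
    """Return the inflected form of word, or the word unchanged if none is known."""
    return FINNISH_FORMS.get((word, case), word)


print("Tietoja " + grammar_fi("elative", "Wikipedia"))
print("Voit tallentaa tiedoston " + grammar_fi("illative", "Wikipedia"))
```

A lookup table like this covers only the words it lists, which matches the caveat above that such transformations work for the most common cases and fail elsewhere.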

Even before MediaWiki had arbitrary grammatical transformation, it had a nominative/genitive distinction for month names. This distinction is necessary for some languages if you wish to substitute month names into sentences.

Filtering special characters in parameters and messages
The other (much simpler) issue with parameter substitution is HTML escaping. Although the problem is much simpler, MediaWiki does a pretty poor job of it.
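A minimal Python sketch of the idea, assuming `$1`-style placeholders and HTML output: each parameter value is escaped before it is inserted, so that user-supplied text cannot inject markup. The `render_html` helper is hypothetical and the message text itself is trusted as-is, which a real handler may not be able to assume.

```python
import html
import re


def render_html(text: str, *params: object) -> str:
    """Substitute $1, $2, ... after HTML-escaping each parameter value."""
    def repl(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(params):
            return html.escape(str(params[index]))
        return match.group(0)  # unknown placeholder: leave it visible

    return re.sub(r"\$(\d+)", repl, text)


print(render_html("Page last edited by $1", "<script>alert(1)</script>"))
```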

Message documentation
There is a pseudo-language code qqq for message documentation. It is one of the ISO 639 codes reserved for private use. There, we do not keep translations of each message, but collect English sentences about each message: telling us where it is used, giving hints about how to translate it, enumerating and describing its parameters, linking to related messages, and so on. On translatewiki.net, these hints are shown to translators when they edit messages.

Programmers must document each and every message. Message documentation is an essential resource – not just for translators, but for all the maintainers of the module. Whenever a message is added to the software, a corresponding qqq entry must be added as well; revisions which don't do so are marked " " until the documentation is added.
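The rule that every message needs a documentation entry is easy to check automatically. The Python sketch below is an illustration, not an existing MediaWiki tool: the `undocumented_keys` helper and the sample keys are hypothetical, and it assumes both files are flat JSON objects in which keys starting with "@" (such as "@metadata") are not messages.

```python
import json


def undocumented_keys(en_json: str, qqq_json: str) -> list:
    """Return message keys present in the English file but missing from qqq."""
    en = json.loads(en_json)
    qqq = json.loads(qqq_json)
    # Assumption: keys starting with "@" hold metadata, not messages.
    return sorted(k for k in en if not k.startswith("@") and k not in qqq)


EN = '{"@metadata": {}, "myext-save": "Save", "myext-cancel": "Cancel"}'
QQQ = '{"@metadata": {}, "myext-save": "Button label. Saves the form."}'
print(undocumented_keys(EN, QQQ))
```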

Documentation in qqq files should be edited directly only when adding new messages or when changing an existing English message in a way that requires a documentation change, for example adding or removing parameters. In other cases, documentation should usually be edited on translatewiki.net. Each documentation string is accessible at https://translatewiki.net/wiki/MediaWiki:message-key/qqq (with message-key replaced by the key of the message), as if it were a translation. These edits will be exported to the source repositories along with the translations.

Useful information that should be in the documentation includes:
 * 1) Message handling (parsing, escaping, plain text).
 * 2) Type of parameters with example values.
 * 3) Where the message is used (pages, locations in the user interface).
 * 4) How the message is used where it is used (a page title, button text, etc.).
 * 5) What other messages are used together with this message, or which other messages this message refers to.
 * 6) Anything else that could be understood when the message is seen in context, but not when the message is displayed alone (which is the case when it is being translated).
 * 7) If applicable, notes about grammar. For example, "open" in English can be both a verb and an adjective. In many other languages the words are different and it's impossible to guess how to translate them without documentation.
 * 8) Adjectives that describe things, such as "disabled", "open" or "blocked", must always say what they are describing. In many languages adjectives must have the gender of the noun that they describe. It may also happen that different kinds of things need different adjectives.
 * 9) If the message has special properties, for example, if it is a page name, or if it should not be a direct translation, but adapted to the culture or the project.
 * 10) Whether the message appears near other messages, for example in a list or a menu. The wording or the grammatical features of the words should probably be similar to the messages nearby. Also, items in a list may have to be properly related to the heading of the list.
 * 11) Parts of the message that must not be translated, such as generic namespace names, URLs or tags.
 * 12) Explanations of potentially unclear words, for example abbreviations, like "CTA", or specific jargon, like "template", "suppress" or "stub". (Note that it's best to avoid such words in the first place!)
 * 13) Screenshots are very helpful. Don't crop – an image of the full screen in which the message appears gives complete context and can be reused in several messages.

A few other hints:
 * Remember that very, very often translators translate the messages without actually using the software.
 * Most often, translators have no context information, either about your module or about other messages in it.
 * A rephrased message alone is useless in most circumstances.
 * Don't use designers' jargon like "nav" or "comps".
 * Consider writing a glossary of the technical terms that are used in your module. If you do it, link to it from the messages.

You can link to other messages by using. Please do this if parts of the messages come from other messages (if this cannot be avoided), or if some messages are shown together or in the same context.

translatewiki.net provides some default templates for documentation: Have a look at the template pages for more information.
 * for  messages
 * for  messages
 * for messages around user groups (, ,  ,   and  )
 * for  messages

Internationalisation hints
Besides documentation, translators ask developers to consider some hints that make their work easier and more efficient and that allow a genuinely good localisation for all languages. Even if only adding or editing messages in English, one should be aware of the needs of all languages. Each message is translated into more than 300 languages and this should be done in the best possible way. Correct implementation of these hints will very often help you write better messages in English, too.

These are the main places where you can find the assistance of experienced and knowledgeable people regarding i18n:
 * The support page of translatewiki.net.
 * The #mediawiki-i18n IRC channel on http://freenode.net.

Please do ask there!

Use Message parameters and switches properly
That is a prerequisite for the correct wording of your messages.

Avoid message re-use
The translators discourage message re-use. This may seem counter-intuitive, because copying and duplicating code is usually a bad practice, but in system messages it is often needed. Although two concepts can be expressed with the same word in English, this doesn't necessarily mean they can be expressed with the same word in every language. "OK" is a good example: in English this is used for a generic button label, but in some languages they prefer to use a button label related to the operation which will be performed by the button. Another example is practically any adjective: a word like "multiple" changes according to gender in many languages, so you cannot reuse it to describe several different things, and you must create several separate messages.

If you are adding multiple identical messages, please add message documentation to describe the differences in their contexts. Don't worry about the extra work for translators. Translation memory helps a lot in these cases while keeping the flexibility to have different translations if needed.

Avoid fragmented or 'patchwork' messages
Languages have varying word orders, and complex grammatical and syntactic rules. It's very hard to translate "lego" messages, that is messages formed by multiple pieces of text, possibly with some indirection (also called "string concatenation").

It is better to make every message a complete phrase. If needed, several sentences can usually be combined into a text block much more easily. When you want to combine several strings in one message, pass them in as parameters, so that translators can order them correctly for their language when translating.
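The difference can be sketched in Python. In this illustration (not MediaWiki code) the `MOVED` table, the `render` helper and the language code "xx" are made up; the point is that one complete message with parameters lets each translation choose its own word order, which concatenated fragments cannot do.

```python
import re

# Hypothetical translations of one complete message. Each keeps both parameters
# but is free to order them as its grammar requires.
MOVED = {
    "en": "$1 moved the page to $2.",
    "xx": "To $2 the page was moved by $1.",  # made-up language with another word order
}


def render(lang: str, *params: str) -> str:
    """Fill $1, $2, ... of the complete message for lang in a single pass."""
    return re.sub(r"\$(\d+)", lambda m: params[int(m.group(1)) - 1], MOVED[lang])


print(render("en", "Bob", "Sandbox"))
print(render("xx", "Bob", "Sandbox"))
```

Had the code instead built the sentence as user name + " moved the page to " + target, the second word order would be impossible to express.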

Messages quoting each other
An exception to the rule may be messages referring to one another: 'Enter the original author's name in the field labelled " " and click "  " when done'. This keeps the message consistent when a software developer or wiki operator later alters the messages "name" or "proceed". Without the int-hack, developers and operators would have to be aware of all related messages needing adjustment whenever they alter one.

Separate times from dates in sentences
Some languages have to insert something between a date and a time which grammatically depends on other words in a sentence. Thus, they will not be able to use date/time combined. Others may find the combination convenient, thus it is usually the best choice to supply three parameter values (date/time, date, time) in such cases, and in each translation leave either the first one or last two unused as needed.
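A minimal Python sketch of the three-parameter approach follows. The `LAST_EDIT` table, the `render_last_edit` helper and the language code "xx" are hypothetical, and the fixed numeric date format is a simplification; real date and time formatting is itself language-specific.

```python
import re
from datetime import datetime

# Hypothetical translations: each uses either $1 (combined) or $2 and $3 (separate).
LAST_EDIT = {
    "en": "Last edited on $1.",
    "xx": "Last edited on the day $2 at the hour $3.",
}


def render_last_edit(lang: str, when: datetime) -> str:
    """Supply date/time, date and time as $1, $2 and $3 and let the message choose."""
    date = when.strftime("%Y-%m-%d")
    time = when.strftime("%H:%M")
    params = (f"{date} {time}", date, time)  # $1 = date/time, $2 = date, $3 = time
    return re.sub(r"\$([123])", lambda m: params[int(m.group(1)) - 1], LAST_EDIT[lang])


moment = datetime(2020, 1, 2, 3, 4)
print(render_last_edit("en", moment))
print(render_last_edit("xx", moment))
```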

Avoid in messages
has several disadvantages. It can be anything (acronym, word, short phrase, etc.) and, depending on language, may need the use of  on each occurrence. No matter what, each message having  will need review in most wiki languages for each new wiki on which your code is installed. In the majority of cases, when there is not a general  configuration for a language, wiki operators will have to add or amend PHP code so as to get   for   working. This requires both more skills, and more understanding, than otherwise. It is more convenient to have generic references like "this wiki". This does not keep installations from locally altering these messages to use, but at least they don't have to, and they can postpone message adaption until the wiki is already running and used.

Avoid references to visual layout and positions
What is rendered where depends on skins. Most often screen layouts of languages written from left-to-right are mirrored compared to those used for languages written from right-to-left, but not always, and for some languages and wikis, not entirely. Handheld devices, narrow windows, and so on may show blocks underneath each other, that would appear side-by-side on larger displays. Since site- and user-written JavaScript scripts and gadgets can, and do, hide parts, or move things around in unpredictable ways, there is no reliable way of knowing the actual layout.

It is wrong to tie layout information to content languages, since the user interface language may not be the page's content language, and layout may be a mixture of the two depending on circumstances. Non-visual user agents like acoustic screen readers and other auxiliary devices do not even have a concept of visual layout. Thus, you should not refer to visual layout positions in the majority of cases, though semantic layout terms may still be used ("previous steps in the form", etc.).

MediaWiki does not support showing different messages or message fragments based on the current directionality of the interface (see T30997).

The upcoming browser and MediaWiki support for East and North Asian top-down writing will make screen layouts even more unpredictable, with at least eight possible layouts (left/right starting position, top/bottom starting position, and which happens first).

Avoid references to screen colours
The colour in which something is rendered depends on many factors, including skins, site- and user-written JavaScript scripts and gadgets, and local user agent over-rides for reasons of accessibility or technological limitations. Non-visual user agents like acoustic screen readers and other auxiliary devices do not even have a concept of colour. Thus, you should not refer to screen colours. (You should also not rely on colour alone as a mechanism for informing the user of state, for the same reason.)

Have message elements before and after each input field

 * This is a suggested guideline; it has not become standard in MediaWiki development.

While English allows efficient use of prompting in the form item–colon–space–input-field, many other languages don't. Even in English, you often want to use "Distance: ___ metres" rather than "Distance (in metres): ___". Leaving  elements aside, you should think of each and every input field following the "Distance: ___ metres" pattern. So:
 * give it two messages, even if the 2nd one is empty in English and some other languages, or
 * allow the placement of inputs via  parameters.

Avoid untranslated HTML markup in messages
HTML markup not requiring translation, such as enclosing elements, rules above or below, and similar, should usually not be part of messages. Such markup unnecessarily burdens translators, increases message file size, and risks being accidentally altered or skipped in the translation process. In general, avoid raw HTML in messages if you can.

Messages are often longer than you think!
Skimming foreign-language message files, you will find that messages are almost never shorter than the Chinese ones, rarely shorter than the English ones, and usually much longer than the English ones.

Especially in forms, in front of input fields, English messages tend to be terse. That brevity is often not kept in translations. In particular, languages with little established technical vocabulary, as well as vernacular, mediæval, or ancient languages, may require multiple words or even complete sentences to explain foreign or technical prompts. For example, the brief English message "TSV file:" may have to be translated into a language as, literally:

"Please type a name here which denotes a collection of computer data that is comprised of a sequentially organised series of typewritten lines which themselves are organised as a series of informational fields each, where said fields of information are fenced, and the fences between them are single signs of the kind that slips a typewriter carriage forward to the next predefined position each. Here we go: _____ (thank you)"

This is, admittedly, an extreme example, but you get the idea. Imagine this sentence in a column in a form where each word occupies a line of its own, and the input field is vertically centred in the next column. :-(

Avoid using very close, similar, or identical words to denote different things, or concepts
For example, pages may have older revisions (of a specific date, time, and edit), comprising past versions of said page. The words "revision" and "version" can be used interchangeably. A problem arises when versioned pages are revised and the revision, i.e. the process of revising them, is mentioned too. This may not pose a serious problem when the two senses of "revision" have different translations. Do not rely on that, however. It is better to avoid using "revision" in the sense of "version" altogether, so that it cannot be misinterpreted.

Basic words may have unforeseen connotations, or not exist at all
There are some words that are hard to translate because of their very specific use in MediaWiki. Some may not be translatable at all. For example, there is no word "user" relating to "someone who uses something" in several languages. Similarly, in Kölsch the English words "namespace" and "apartment" translate to the same word. Sticking to Kölsch, they say "corroborator and participant" in one word, since any reference to "use" would too strongly imply "abuse" as well. The term "wiki farm" is translated as "stable full of wikis", since a single-crop farm would be a contradiction in terms in the language, and would not be understood.

Expect untranslated words

 * This is a suggested guideline; it has not yet become standard in MediaWiki development.

It is not uncommon that proper names, tag names, etc. and computerese in English are not translated, and instead taken as loan-words, or foreign words. In the latter case, some particularly-fastidious translators may mark such words as belonging to another language with HTML markup, such as  ….

You may want to consider ensuring that your message output handler passes such markup along unmolested, despite the obvious security risks.

Permit explanatory inline markup

 * This is a suggested guideline; it has not yet become standard in MediaWiki development.

Sometimes there are abbreviations, technical terms, or generally ambiguous words in target languages that may not be immediately understood by newcomers, but are obvious to experienced computer users. So as to avoid screen clutter of lengthy explanations without leaving newcomers stranded, translators may choose to add explanations as  annotations, shown by browsers when you move the mouse over them.

For example, the MediaWiki core message  about image rotation, which in English is simply " ", in Moroccan Arabic is translated as:

giving:
 * mḍwwer 90° ĜĜS

explaining the abbreviation for "counter clockwise" when needed.

You may want to consider ensuring that your message output handler passes such markup along unmolested, even if the original message does not use them.

Use, , and tags where needed
When talking about technical parameters, values, or keyboard inputs, mark them appropriately as such using the HTML tags,  , or. Thus they are typographically set off from the normal text. That clarifies their sense to readers, avoiding confusion, errors and misrepresentations. Ensure that your message handler allows such markup.

Symbols, colons, brackets, etc. are parts of messages
Many symbols are localisable, too. Some scripts have other kinds of brackets than the Latin script has. A colon may not be appropriate after a label or input prompt in some languages. Having those symbols included in messages helps to make better and less Anglo-centric translations, and also reduces code clutter.

For example, there are different quotation mark conventions used in «Norwegian», »Swedish», »Danish«, „German“, and 「Japanese」.

If you need to wrap some text in localized parentheses, brackets, or quotation marks, you can use the   or    or    messages like so:
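The idea of wrapping text through a message can be sketched in Python as follows. This is an illustration only: the `WRAPPERS` table, the key names, the translations shown and the `wrap` helper are assumptions of the sketch, not a statement of what the real messages contain. Each wrapper message takes the text to wrap as `$1`.

```python
# Assumed wrapper messages; the key names and the per-language texts are illustrative.
WRAPPERS = {
    "en": {"parentheses": "($1)", "quotation-marks": "\"$1\""},
    "de": {"parentheses": "($1)", "quotation-marks": "„$1“"},
    "ja": {"parentheses": "（$1）", "quotation-marks": "「$1」"},
}


def wrap(lang: str, key: str, text: str) -> str:
    """Wrap text in the localised symbols defined by the wrapper message key."""
    # Split on the placeholder so that a "$1" inside text is never substituted again.
    before, _, after = WRAPPERS[lang][key].partition("$1")
    return before + text + after


print(wrap("de", "quotation-marks", "wiki"))
print(wrap("ja", "quotation-marks", "wiki"))
```

The calling code never hard-codes a bracket or quotation mark, so each language supplies the symbols its script and typography expect.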

Do not expect symbols and punctuation to survive translation
Languages written from right to left (unlike English) usually swap the arrow symbols presented with "next" and "previous" links, and their placement relative to the message text may or may not be inverted as well. An ellipsis may be translated to "etc." or to words. Question marks, exclamation marks and colons may be placed somewhere other than at the end of a sentence, omitted entirely, or doubled. As a consequence, always include all of those in the text of your messages, and never try to insert them programmatically.

Use full stops
Do terminate normal sentences with full stops. This is often the only indicator for a translator to know that they are not headlines or list items, which may need to be translated differently.

Wikitext of links
Link anchors can be put into messages in several technical ways:
 * 1) via wikitext: …   …,
 * 2) via wikitext: …  …, or
 * 3) the anchor text is a message in the MediaWiki namespace. Avoid it!

The latter is often hard or impossible for translators to handle; avoid fragmented or 'patchwork' messages here, too. Make sure that " " does not contain spaces.

Use meaningful link anchors
Take care with your wording. Link anchors play an important role in search engine assessment of pages – both the words linked, and the target anchor. Make sure that the anchor describes the target page well. Always avoid commonplace and generic words. For example, "Click here" is an absolute no-go, since target pages are almost never about "click here". Do not put that in sentences around links either, because "here" is not the place to click. Instead, use precise action words telling what a user will get to when following the link, such as "You can upload a file if you wish."

See also Help users predict where they are going and mystery meat navigation.

Avoid jargon and slang
Avoid developer and power user jargon in messages. Try to use simple language whenever possible.

Avoid saying "success", "successfully", "fail", "error occurred while", etc., when you want to notify the user that something happened or didn't happen. This comes from developers' seeing everything as true or false, but users usually just want to know what actually happened or didn't, and what they should do about it (if at all). So:
 * "The file was successfully renamed" -> "The file was renamed"
 * "File renaming failed" -> "There is a file with this name already. Please choose a different name."

One sentence per line

 * This is a suggested guideline; it has not yet become standard in MediaWiki development.

Try to have one sentence or similar block in one line. This helps to compare the messages in different languages, and may be used as a hint for segmentation and alignment in translation memories.

Be aware of whitespace and line breaks
MediaWiki's localised messages usually get edited within the wiki, either by wiki operations on live wikis, or by the translators on translatewiki.net. You should be aware of how whitespace, especially at the beginning or end of your message, will affect editors:


 * Newlines at the beginning or end of a message are fragile, and will be frequently removed by accident. Start and end your message with active text; if you need a newline or paragraph break around it, your surrounding code should deal with adding it to the returned text.
 * Spaces at the beginning or end of a message are also likely to be removed during editing, and should be avoided. If a space is required for output, usually your code should append it or else you should use a non-breaking space such as &nbsp; (in which case check your escaping settings!)
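The two points above can be illustrated with a short Python sketch. It assumes, for illustration only, that an edit trims surrounding whitespace in the same way as `str.strip()`; the helper names are hypothetical.

```python
def normalise_on_save(text: str) -> str:
    """Simulate an edit that accidentally trims surrounding whitespace."""
    return text.strip()


def join_messages(*messages: str, separator: str = " ") -> str:
    """Let the calling code, not the messages, supply whitespace between parts."""
    parts = [m.strip() for m in messages]
    return separator.join(p for p in parts if p)
```

A message stored as `"Page saved. "` loses its trailing space after one careless edit and then runs into whatever follows it, whereas `join_messages` produces the same output whether or not the stored messages still carry their surrounding whitespace.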

Use standard capitalisation
Capitalisation gives hints to translators as to what they are translating, such as single words, list or menu items, phrases, or full sentences. Correct (standard) capitalisation may also play a role in search engines' assessment of your pages. MediaWiki uses sentence case (The quick brown fox jumps over the lazy dog) in interface messages.

Always remember that many writing systems don't have capital letters at all, and some of those that do have them, use them differently from English. Therefore, don't use ALL-CAPS for emphasis. Use CSS, or HTML  or   per below:

Emphasis
In normal text, emphasis like boldface or italics and similar should be part of message texts. Local conventions on emphasis often vary, especially some Asian scripts have their own. Translators must be able to adjust emphasis to their target languages and areas. Try to use " " and " " in your user interface to allow mark-up on a per language or per script basis.

In modern screen layouts of English and European styles, emphasis becomes less used. Do convey it in your message documentation still, as it may give valuable hints as to how to translate. Emphasis can and should be used in other cultural contexts as appropriate, provided that translators know about it.

Update of localisation
As mentioned above, translation happens on translatewiki.net and other systems are discouraged. Here's a high level overview of the localisation update workflow:
 * Developers add or change system messages.
 * Users translate the new or changed system messages on translatewiki.net.
 * Automated tools export these messages, build new versions of the message files, incorporating the added or updated messages, for both core and extensions, and commit them to git.
 * The wikis then can pull in the updated system messages from the git repository.

Wikimedia projects and any other wikis can benefit immediately and automatically from localisation work thanks to the extension. This compares the latest English messages to the English messages in production. If they are the same, the production translations are updated and made available to users.

Once translations are in the version control system, the Wikimedia Foundation has a daily job that updates a checkout or clone of the extension repository. This was first established in September 2009.

Because changes on translatewiki.net are pushed to the code daily as well, this means that each change to a message can potentially be applied to all existing MediaWiki installations in a couple of days without any manual intervention or traumatic code update.

As you can see this is a multi-step process. Over time, we have found out that many things can go wrong. If you think the process is broken, please make sure to report it on our Support page, or create a new bug in Phabricator. Always be sure to describe a precise observation.

Handling support requests

 * Main page: translatewiki:Translating:Localisation for developers.

Translators may have questions about some of the messages you create. Translatewiki.net provides a support request system that lets translators ask you, the project owner, questions about messages so that they can be better translated. This short tutorial guides you through the workflow of handling translatewiki.net support requests.

Message sources
Code looks up messages from these sources:


 * The MediaWiki namespace. This allows wikis to adopt, or override, all of their messages, when standard messages do not fit or are not desired (see #Old local translation system).
 ** MediaWiki:Message-key is the default message,
 ** MediaWiki:Message-key/language-code is the message to be used when a user has selected a language other than the wiki's default language.
 * From message files:
 ** Core MediaWiki itself and most currently maintained s use a file per language, named, where zyx is the language code for the language.
 ** Some older extensions use a combined message file holding all messages in all languages, usually named.
 ** Many Wikimedia Foundation wikis access some messages from the extension, allowing them to standardise messages across WMF wikis without imposing them on every MediaWiki installation.
 ** A few extensions use other techniques.
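The naming rule for override pages can be sketched as follows. This is a simplified Python illustration, not MediaWiki code: the function names and the dictionaries standing in for wiki pages and message files are hypothetical, and the capitalised first letter of the page name is an assumption.

```python
from typing import Dict, Optional


def override_title(key: str, lang: str, content_lang: str) -> str:
    """Title of the MediaWiki-namespace page that may override a message."""
    page = key[:1].upper() + key[1:]  # assumption: page names start with a capital
    if lang == content_lang:
        return f"MediaWiki:{page}"
    return f"MediaWiki:{page}/{lang}"


def look_up(
    key: str,
    lang: str,
    content_lang: str,
    wiki_pages: Dict[str, str],
    message_files: Dict[str, Dict[str, str]],
) -> Optional[str]:
    """Prefer a local override page, then the per-language message file."""
    title = override_title(key, lang, content_lang)
    if title in wiki_pages:
        return wiki_pages[title]
    return message_files.get(lang, {}).get(key)
```

For a wiki whose default language is English, a user reading in Portuguese would be served `MediaWiki:Mainpage/pt` if that page exists, and the message file entry otherwise.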

Caching
System messages are one of the more significant components of MediaWiki, primarily because they are used in every web request. The PHP message files are large, since they store thousands of message keys and values. Loading this file (and possibly multiple files, if the user's language is different from the content language) has a large memory and performance cost. An aggressive, layered caching system is used to reduce this performance impact.

MediaWiki has lots of caching mechanisms built in, which make the code somewhat more difficult to understand. Since 1.16 there is a new caching system, which caches messages either in CDB files or in the database. Customised messages are cached in the filesystem and in memcached (or an alternative), depending on the configuration.

The table below gives an overview of the settings involved:

In MediaWiki 1.27.0 and 1.27.1, the autodetection was changed to favor the file backend. In case  (the default), the file backend is used with the path from. If this value is not set (which is the default), a temporary directory determined by the operating system is used. If a temporary directory cannot be detected, the database backend is used as a fallback. This was reverted in 1.27.2 and 1.28.0 because of file conflicts on shared hosts and security issues (see T127127 and T161453).
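The autodetection order described for 1.27.0 and 1.27.1 can be sketched in Python. This is an illustration of the decision sequence only: the function, its parameter names, and the `"detect"` value are hypothetical stand-ins for the actual configuration settings.

```python
import tempfile
from typing import Callable, Optional, Tuple


def choose_backend(
    store: str = "detect",
    cache_dir: Optional[str] = None,
    temp_dir_lookup: Callable[[], Optional[str]] = tempfile.gettempdir,
) -> Tuple[str, Optional[str]]:
    """Return (backend, directory) following the order described above."""
    if store != "detect":
        return (store, cache_dir)  # an explicit choice is respected as-is
    if cache_dir:
        return ("file", cache_dir)  # configured cache directory
    try:
        tmp = temp_dir_lookup()  # directory determined by the operating system
    except OSError:
        tmp = None
    if tmp:
        return ("file", tmp)
    return ("db", None)  # last resort: the database backend
```

The shared temporary directory in the third step is the part that was later reverted, since several installations on one host could end up writing to the same location.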

Function backtrace
To better visually depict the layers of caching, here is a function backtrace of what methods are called when retrieving a message. See the below sections for an explanation of each layer.

MessageCache
The  class is the top level of caching for messages. It is called from the Message class and returns the final raw contents of a message. This layer handles the following logic:
 * Checking for message overrides in the database
 * Caching overridden messages in memcached, or whatever  is set to
 * Resolving the remainder of the language fallback sequence

The last bullet is important. Language fallbacks allow MediaWiki to fall back on another language if the original does not have the message being asked for. As mentioned in the next section, most of the language fallback resolution occurs at a lower level. However, only the  layer checks the database for overridden messages. Thus, integrating overridden messages from the database into the fallback chain is done here. If not using the database, this entire layer can be disabled.
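A deliberately simplified Python sketch of how database overrides can be woven into a language fallback chain. It is not a transcription of MediaWiki's implementation, whose exact lookup order may differ; the function, its parameters, and the sample data structures are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def get_message(
    key: str,
    chain: List[str],
    overrides: Dict[Tuple[str, str], str],
    file_messages: Dict[str, Dict[str, str]],
    use_db: bool = True,
) -> Optional[str]:
    """Walk the fallback chain, checking overrides before shipped messages."""
    for lang in chain:
        # Overrides stand in for customised messages stored in the database.
        if use_db and (lang, key) in overrides:
            return overrides[(lang, key)]
        text = file_messages.get(lang, {}).get(key)
        if text is not None:
            return text
    return None
```

With `use_db=False` the override check disappears entirely, which mirrors the remark that this layer can be disabled when the database is not used.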

LocalisationCache
See

LCStore
The  class is merely a back-end implementation used by the LocalisationCache class for actually caching and retrieving messages. Like the  class, which is used for general caching in MediaWiki, there are a number of different cache types (configured using  ):
 * "db" (default) - Caches messages in the database
 * "file" (default if  is set) - Uses CDB to cache messages in a local file
 * "accel" - Uses APC or another opcode cache to store the data

The "file" option is used by the Wikimedia Foundation, and is recommended because it is faster than going to the database and more reliable than the APC cache, especially since APC is incompatible with PHP versions 5.5 or later.
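The notion of interchangeable back ends behind one small interface can be sketched in Python. The class names, the JSON file layout (used here in place of CDB), and the `make_store` factory are assumptions for illustration and do not reproduce MediaWiki's classes.

```python
import json
import os
import tempfile
from typing import Dict, Optional


class MemoryStore:
    """Keeps cached messages in a dict (stand-in for a database or opcode cache)."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}

    def get(self, lang: str, key: str) -> Optional[str]:
        return self._data.get(lang, {}).get(key)

    def set(self, lang: str, key: str, value: str) -> None:
        self._data.setdefault(lang, {})[key] = value


class FileStore:
    """Keeps one JSON file per language (stand-in for a CDB file)."""

    def __init__(self, directory: str) -> None:
        self._directory = directory

    def _path(self, lang: str) -> str:
        return os.path.join(self._directory, f"l10n-{lang}.json")

    def _load(self, lang: str) -> Dict[str, str]:
        try:
            with open(self._path(lang), encoding="utf-8") as handle:
                return json.load(handle)
        except FileNotFoundError:
            return {}

    def get(self, lang: str, key: str) -> Optional[str]:
        return self._load(lang).get(key)

    def set(self, lang: str, key: str, value: str) -> None:
        data = self._load(lang)
        data[key] = value
        with open(self._path(lang), "w", encoding="utf-8") as handle:
            json.dump(data, handle, ensure_ascii=False)


def make_store(kind: str, directory: Optional[str] = None):
    """Pick a back end by name; callers rely only on get() and set()."""
    if kind == "file":
        return FileStore(directory or tempfile.mkdtemp())
    if kind == "memory":
        return MemoryStore()
    raise ValueError(f"unknown store type: {kind}")
```

Code that uses the cache never needs to know which back end was configured, so switching between them is a configuration change and not a code change.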

Licence
Any edits made to the language files must be licensed under the terms of the GNU General Public License to be included in the MediaWiki software. Other extensions may be under different licences.

Old local translation system
With MediaWiki 1.3.0, a new system was set up for localising MediaWiki. Instead of editing the language file and asking developers to apply the change, users could edit the interface strings directly from their wikis. This is the system in use as of August 2005. People can find the message they want to translate in Special:AllMessages and then edit the relevant string in the  namespace. Once edited, these changes are live. There was no more need to request an update, and wait for developers to check and update the file.

The system is great for Wikipedia projects; however a side effect is that the MediaWiki language files shipped with the software are no longer quite up-to-date, and it is harder for developers to keep the files on meta in sync with the real language files.

As the default language files do not provide enough translated material, we face two problems:
 * 1) New Wikimedia projects created in a language which has not been updated for a long time, need a total re-translation of the interface.
 * 2) Other users of MediaWiki (including Wikimedia projects in the same language) are left with untranslated interfaces. This is especially unfortunate for the smaller languages which don't have many translators.

This is not such a big issue anymore, because translatewiki.net is advertised prominently and used by almost all translations. Local translations still do happen sometimes but they're strongly discouraged. Local messages mostly have to be deleted, moving the relevant translations to translatewiki.net and leaving on the wiki only the site-specific customisation. There's a huge backlog, especially in older projects; [//toolserver.org/~robin/?tool=cleanuplocalmsgs this tool] helps with cleanup.

Keeping messages centralised and in sync
English messages are very rarely out of sync with the code. Experience has shown that it's convenient to have all the English messages in the same place. Revising the English text can be done without reference to the code, just like translation can. Programmers sometimes make very poor choices for the default text.

What can be localised
So many things are localisable on MediaWiki that not all of them are directly available on translatewiki.net: see translatewiki:Translating:MediaWiki. If something requires developer intervention in the code, you can request it on Phabricator, or ask at Support if you don't know what to do exactly.


 * Namespaces (both core and extensions', plus gender-dependent user namespaces)
 * Weekdays (and abbreviations)
 * Months (and abbreviations)
 * Bookstores for Special:BookSources
 * Skin names
 * Math names
 * (for compatibility with old MediaWiki databases)
 * Default user option overrides
 * Language names
 * Country names (via )
 * Currency names (via )
 * Timezones
 * Character encoding conversion via
 * UpperLowerCase first (needs casemaps for some)
 * UpperLowerCase
 * Uppercase words
 * Uppercase word breaks
 * Case folding
 * Strip punctuation for MySQL search (search optimisation)
 * Get first character
 * Alternate encoding
 * Recoding for edit (and then recode input)
 * Fallback languages (that is, other more closely related language(s) to use when a translation is not available, instead of the default fallback, which is English)
 * Directionality (left to right or right to left, RTL)
 * Direction mark character depending on RTL
 * Arrow depending on RTL
 * Languages where italics cannot be used
 * Number formatting (comma-ify, i.e. adding or not digits separators; transform digits; transform separators)
 * Truncate (multibyte)
 * Grammar conversions for inflected languages
 * Plural transformations
 * Formatting expiry times
 * Segmenting for diffs (Chinese)
 * Convert to variants of language (between different orthographies, or scripts)
 * Language specific user preference options
 * Link trail and link prefix, e.g.:  These are letters that can be glued after/before the closing/opening brackets of a wiki link, but appear rendered on the screen as if part of the link (that is, clickable and in the same colour). By default the link trail is "a-z"; you may want to add the accented or non-Latin letters used by your language to the list.
 * Language code (preferably used according to the latest RFC in standard BCP 47, currently RFC 5646, with its associated IANA database. Avoid deprecated, grandfathered and private-use codes: look at what they mean in standard ISO 639, and avoid codes assigned to collections/families of languages in ISO 639-5, and ISO 639 codes which were not imported in the IANA database for BCP 47)
 * Type of emphasising
 * The extension has a special page file per language,   for language code.
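The link trail entry in the list above lends itself to a short illustration. The Python sketch below splits the text that follows a link's closing brackets into the part absorbed into the link and the rest. The regular expressions and the `split_trail` helper are assumptions for illustration; in particular, the extended Portuguese character set is only an example and not the pattern shipped with MediaWiki.

```python
import re
from typing import Pattern, Tuple

# Default trail as described above: lowercase ASCII letters only.
DEFAULT_TRAIL = re.compile(r"^([a-z]+)(.*)$", re.DOTALL)

# Hypothetical extended trail that also accepts some accented letters.
EXTENDED_TRAIL = re.compile(r"^([a-záâãàéêíóôõúç]+)(.*)$", re.DOTALL)


def split_trail(after_link: str, trail: Pattern[str] = DEFAULT_TRAIL) -> Tuple[str, str]:
    """Return (letters glued to the link, remaining text)."""
    match = trail.match(after_link)
    if match is None:
        return ("", after_link)
    return (match.group(1), match.group(2))
```

With the default pattern, a suffix that begins with an accented letter is left outside the link, which is why languages using such letters usually extend the list.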

Neat functionality:


 * I18N
 * Roman numeral formatting

Namespace name aliases
Namespace name aliases are additional names which can be used to address existing namespaces. They are rarely needed, but not having them when they are needed usually creates havoc in existing wikis.

You need namespace name aliases:


 * 1) When a language has variants, and these variants spell some namespaces differently, and you want editors to be able to use the variant spellings. Variants are selectable in the user preferences. Users always see their selected variant, except in wikitext, but when editing or searching, an arbitrary variant can be used.
 * 2) When an existing wiki's language, fallback language(s), or localisation is changed, some namespace names change with it. So as not to break links already present in the wiki that use the old namespace names, add each altered previous namespace name to the namespace name aliases when, or before, the change is made.

The generic English namespace names are always present as namespace name aliases in all localisations, so you need not, and should not, add those.
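How aliases keep old links working can be sketched in Python. The functions, the normalisation rule, and the sample names in the usage note are hypothetical; the sketch only shows that the generic English names, the current localised names, and any aliases all resolve to the same namespace number.

```python
from typing import Dict, Optional


def normalise(name: str) -> str:
    """Compare names case-insensitively, treating spaces as underscores."""
    return name.strip().replace(" ", "_").lower()


def build_index(
    canonical: Dict[int, str],
    localised: Dict[int, str],
    aliases: Dict[str, int],
) -> Dict[str, int]:
    """Map every accepted namespace name to its namespace number."""
    index: Dict[str, int] = {}
    for ns_id, name in canonical.items():  # generic English names, always present
        index[normalise(name)] = ns_id
    for ns_id, name in localised.items():  # current localised names
        index[normalise(name)] = ns_id
    for name, ns_id in aliases.items():  # previous or variant spellings
        index.setdefault(normalise(name), ns_id)
    return index


def resolve(title: str, index: Dict[str, int]) -> Optional[int]:
    """Return the namespace number for a prefixed title, or None."""
    prefix, separator, _rest = title.partition(":")
    if not separator:
        return None
    return index.get(normalise(prefix))
```

If a wiki renamed its user namespace from a hypothetical "Usuário" to "Utilizador", registering `{"Usuário": 2}` as an alias keeps every existing `Usuário:` link pointing at namespace 2.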

Aliases can't be translated on translatewiki.net, but can be requested there or on Phabricator: see translatewiki:Translating:MediaWiki.

Regional settings
Some linguistic settings vary across geographies, but MediaWiki doesn't have a concept of region; it only has languages and language variants.

These settings need to be set once as a language's default, then individual wikis can change them as they wish in their configuration.

Time and date formats
Times and dates are shown on special pages and the like. The default time and date format is used for signatures, so it should be the most used and most widely understood format for users of that language. Anonymous users also see the default format. Registered users can choose other formats in their preferences.

If you are familiar with PHP's time format, you can try to construct formats yourself. MediaWiki uses a similar format string, with some extra features. If you don't understand the previous sentence, that's OK. You can provide a list of examples for.
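For readers unfamiliar with such format strings, here is a minimal Python sketch that interprets a handful of PHP-style codes (`H`, `i`, `j`, `d`, `n`, `m`, `F`, `Y`). It is not MediaWiki's formatter: it ignores escaping and all of MediaWiki's extra features, and the format strings and month lists are examples chosen for illustration.

```python
from datetime import datetime
from typing import List

MONTHS_EN = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_date(fmt: str, dt: datetime, months: List[str] = MONTHS_EN) -> str:
    """Expand a small subset of PHP-style date codes; other characters pass through."""
    codes = {
        "H": f"{dt.hour:02d}",     # 24-hour clock, leading zero
        "i": f"{dt.minute:02d}",   # minutes, leading zero
        "j": str(dt.day),          # day of month, no leading zero
        "d": f"{dt.day:02d}",      # day of month, leading zero
        "n": str(dt.month),        # month number, no leading zero
        "m": f"{dt.month:02d}",    # month number, leading zero
        "F": months[dt.month - 1], # full month name in the target language
        "Y": f"{dt.year:04d}",     # four-digit year
    }
    return "".join(codes.get(ch, ch) for ch in fmt)
```

A language supplies its own month names and its own ordering and punctuation; the format string carries the latter, so `"H:i, j F Y"` and `"j. F Y"` describe two different conventions using the same codes.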

Old edit window toolbar buttons

 * Not to be confused with the much more common 's "advanced toolbar", which has similar features.

When a wiki page is being edited, and a user has allowed it in their Special:Preferences, a set of icons is displayed above the text area where one can edit. The toolbar buttons can be set but there are no messages for it. What we need is a set of properly sized  files. Plenty of samples can be found in commons:Category:ButtonToolbar, and there is an empty button image to start off from.

Note, this can only be done when your language is already enabled in MediaWiki, which usually means a good portion of its messages have been translated; otherwise you must just wait, and have it done later.

Missing
This section is missing information about the changes in the i18n system related to extensions. The format was standardised and messages are automatically loaded.

See #Message sources.