Localisation

From MediaWiki.org
For the Wikimedia Foundation localisation team, see Wikimedia Language engineering.
For translation of pages on this wiki, see Project:Language policy.

This page covers the technical side of MediaWiki's internationalisation and localisation (i18n and L10n) system, and gives hints that code authors should keep in mind. Our mantra is that i18n must not be an afterthought: it is an essential component from the earliest phases of your software, as well as one of the core principles of MediaWiki.

Translation resources

translatewiki.net

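translatewiki.net supports in-wiki translation of all the messages of core MediaWiki, extensions and skins. If you are interested in translating but not in the technical topics covered on this page, such as editing files, Git and creating patches, go straight to translatewiki.net.

All translations of MediaWiki user interface messages must be submitted through translatewiki.net; they must not be committed directly to the code. Only the English messages and their initial documentation must be committed to the source code.

Core MediaWiki and extensions must use system messages for any text displayed in the user interface. For an example of how to use system messages, see Manual:Special pages. If the extension is well written, it should become translatable on translatewiki.net within a few days, after the staff have noticed it on Gerrit. If it has not appeared after a while, contact them. If it's too unstable to be translated, note so in the code or commit and contact them if necessary.

See also the overview of the localisation system and what can be localised.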

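Finding messages

Help:System message explains how to find a particular string you want to translate. In particular, note the qqx trick, which was introduced in MediaWiki 1.18.

I18n mailing list

You can subscribe to the i18n mailing list. At the moment it is low-traffic.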

Code structure

First, there is a Language object in Language.php. This object contains all the localisable message strings, as well as other important language-specific settings and custom behaviour (uppercasing, lowercasing, printing dates, formatting numbers, direction, custom grammar rules, etc.).

The object is constructed from two sources: subclassed versions of itself (classes) and Message files (messages).

There is also the MessageCache class, which handles input of text via the MediaWiki namespace. Most internationalisation is nowadays done via Message objects and the wfMessage() shortcut function, which is defined in includes/GlobalFunctions.php. Legacy code might still use the old wfMsg*() functions, which are now considered deprecated in favour of the Message objects mentioned above.

General use (for developers)

See also Manual:Messages API.

Language objects

There are two ways to get a language object. You can use the globals $wgLang and $wgContLang for the user interface language and the content language respectively. For an arbitrary language you can construct an object by using Language::factory( 'en' ), replacing en with the code of the language. You can also use wfGetLangObj( $code ); if $code could already be a language object. The list of codes is in languages/Names.php.

Language objects are needed for doing language-specific functions, most often to do number, time and date formatting, but also to construct lists and other things. There are multiple layers of caching and merging with fallback languages, but the details are irrelevant in normal use.

Using messages

MediaWiki uses a central repository of messages which are referenced by keys in the code. This is different from, for example, Gettext, which just extracts the translatable strings from the source files. The key-based system makes it easier to refine the original texts and to track changes to messages. The drawback is of course that the list of used messages and the list of source texts for those keys can get out of sync. In practice this isn't a big problem, and the only significant problem is that sometimes extra messages that are not used anymore still stay up for translation.

To make message keys more manageable and easy to find, also with grep, always write them completely and don't rely too much on creating them dynamically. You may concatenate parts of message keys if you feel that it gives your code better structure, but put a comment nearby with a list of the possible resulting keys. For example:

// Messages that can be used here:
// * myextension-connection-success
// * myextension-connection-warning
// * myextension-connection-error
$text = wfMessage( 'myextension-connection-' . $status )->parse();

See Manual:Messages API for details on using message functions in PHP and JavaScript.

Adding new messages

See also: Localisation file format

Choosing the message key

See also: Manual:Coding conventions

The message key must be globally unique. This includes core MediaWiki and all the extensions and skins.

Stick to lower case letters, numbers and dashes in message names; most other characters are somewhere between less practical and not working at all. Per MediaWiki convention, the first character is case-insensitive and the other characters are case-sensitive.

Please follow global or local conventions for naming. For extensions, use a standard prefix, preferably the extension name in lower case, followed by a hyphen ("-"). Exceptions:

  • Messages used by the API. These must begin with apihelp-, apiwarn- or apierror-. After this prefix put the extension prefix. (Note that these messages should be in a separate file, usually under i18n/api.)
  • Log-related messages. These must begin with logentry-, log-name- or log-description-.
  • User rights. The key for the name of the right as displayed on Special:ListGroupRights must begin with right-. The name of the action that completes the sentence "You do not have permission to $2, for the following reason:" must begin with action-.
  • Revision tags must begin with tag-.
  • Special page titles must begin with special-.

Other things to note when creating messages

  1. Make sure that you are using suitable handling for the message (parsing, {{-replacement, escaping for HTML, etc.)
  2. If your message is part of core, it should usually be added to languages/i18n/en.json, although some components, such as Installer, EXIF tags, and ApiHelp have their own message files.
  3. If your message is in an extension, add it to the i18n/en.json file or the en.json file in the appropriate subdirectory. In particular, API messages that are only seen by developers and not by most end users are usually in a separate file, such as i18n/api/en.json. If an extension has a lot of messages, you may create subdirectories under i18n. All the message directories, including the default i18n/, must be listed in the MessagesDirs section in extension.json or in the $wgMessagesDirs variable.
  4. Take a pause and consider the wording of the message. Is it as clear as possible? Can it be misunderstood? Ask for comments from other developers or localisers if possible. Follow the #internationalisation hints.
  5. Add documentation to qqq.json in the same directory. Read more about message documentation.

Messages that should not be translated

  1. Ignored messages are those which should exist only in the English messages file. They are messages that should not need translation, because they reference only other messages or language-neutral features, e.g. a message of "{{SITENAME}}".
  2. Optional messages may be translated only if changed in the target language.

To flag such messages:

Removing existing messages

Remove it from en.json and qqq.json. Don't bother with other languages. Updates from translatewiki.net will handle those automatically.

Changing existing messages

  1. Consider updating the message documentation (see #Adding new messages).
  2. Change the message key if old translations are not suitable for the new meaning. This also includes changes in message handling (parsing, escaping, parameters, etc.). Improving the phrasing of a message without technical changes is usually not a reason for changing a key. At translatewiki.net, the translations will be marked as outdated so that they can be targeted by translators. Changing a message key does not require talking to the i18n team or filing a support request. However, if you have special circumstances or questions, ask in #mediawiki-i18n or on the support page at translatewiki.net.
  3. If the extension is supported by translatewiki.net, please only change the English source message and/or key, and the accompanying entry in qqq.json. If needed, the translatewiki.net team will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you're only changing things like HTML tags which you could change in other languages without speaking those languages. Most of these actions will take place in translatewiki.net and will reach Git with about one day of delay.

Localising namespaces and special page aliases

Namespaces and special page names (i.e. "RecentChanges" in "Special:RecentChanges") are also translatable.

Namespaces

Currently,[1] translating namespace names is disabled on translatewiki.net, so you need to do this yourself in Gerrit, or file a Phabricator task asking someone else to do it.

To allow the custom namespaces introduced by your extension to be translated, create a MyExtension.namespaces.php file that looks like this:

<?php
/**
 * Translations of the namespaces introduced by MyExtension.
 *
 * @file
 */

$namespaceNames = [];

// For wikis where the MyExtension extension is not installed.
if( !defined( 'NS_MYEXTENSION' ) ) {
	define( 'NS_MYEXTENSION', 2510 );
}

if( !defined( 'NS_MYEXTENSION_TALK' ) ) {
	define( 'NS_MYEXTENSION_TALK', 2511 );
}

/** English */
$namespaceNames['en'] = [
	NS_MYEXTENSION => 'MyNamespace',
	NS_MYEXTENSION_TALK => 'MyNamespace_talk',
];

/** Finnish (Suomi) */
$namespaceNames['fi'] = [
	NS_MYEXTENSION => 'Nimiavaruuteni',
	NS_MYEXTENSION_TALK => 'Keskustelu_nimiavaruudestani',
];

Then load the namespace translation file in MyExtension.php via $wgExtensionMessagesFiles['MyExtensionNamespaces'] = dirname( __FILE__ ) . '/MyExtension.namespaces.php';

Now, when a user installs MyExtension on their Finnish (fi) wiki, the custom namespace will be translated into Finnish magically, and the user doesn't need to do a thing!

Also remember to register your extension's namespace(s) on the extension default namespaces page.

Special page aliases

See the manual page for Special pages for up-to-date information. The following does not appear to be valid.

Create a new file for the special page aliases in this format:

<?php
/**
 * Aliases for the MyExtension extension.
 *
 * @file
 * @ingroup Extensions
 */

$aliases = [];

/** English */
$aliases['en'] = [
	'MyExtension' => [ 'MyExtension' ]
];

/** Finnish (Suomi) */
$aliases['fi'] = [
	'MyExtension' => [ 'Lisäosani' ]
];

Then load it in the extension's setup file like this: $wgExtensionMessagesFiles['MyExtensionAlias'] = dirname( __FILE__ ) . '/MyExtension.alias.php';

When your special page code uses either SpecialPage::getTitleFor( 'MyExtension' ) or $this->getTitle() (in the class that provides Special:MyExtension), the localised alias will be used, if it's available.

Message parameters

Some messages take parameters. They are represented by $1, $2, $3, … in the (static) message texts, and replaced at run time. Typical parameter values are numbers (the "3" in "Delete 3 versions?"), or user names (the "Bob" in "Page last edited by Bob"), page names, links and so on, or sometimes other messages. They can be of arbitrary complexity.
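
As a rough illustration of the run-time substitution described above, here is a minimal standalone sketch in plain PHP. The function substituteParams() is hypothetical and is not how MediaWiki implements it; in real code the Message class performs the substitution for you.

```php
<?php
/**
 * Hypothetical helper: replace $1, $2, ... in a static message text
 * with parameter values. Not MediaWiki's actual implementation.
 */
function substituteParams( string $text, array $params ): string {
	$replacements = [];
	foreach ( array_values( $params ) as $i => $value ) {
		$replacements['$' . ( $i + 1 )] = (string)$value;
	}
	// strtr() tries the longest keys first, so when ten or more
	// parameters are given, "$10" is not read as "$1" followed by "0".
	return strtr( $text, $replacements );
}

echo substituteParams( 'Page last edited by $1', [ 'Bob' ] ), "\n";
echo substituteParams( 'Delete $1 versions?', [ 3 ] ), "\n";
```

Because the placeholders are numbered rather than positional, a translation is free to use them in a different order from the English source.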

The list of parameters defined for each specific message is kept in the special file "qqq.json", located in the "languages/" folder of MediaWiki; read more in the documentation.

It's preferable to use whole words with the PLURAL, GENDER, and GRAMMAR magic words. For example, {{PLURAL:$1|subpage|subpages}} is better than sub{{PLURAL:$1|page|pages}}. It makes searching easier.

Switches in messages…

See also Manual:Messages API#Notes about gender, grammar, plural.

Parameter values at times influence the exact wording, or grammatical variations in messages. We don't resort to ugly constructs like "$1 (sub)page(s) of his/her userpage", because these are poor for users and we can do better. Instead, we make switches that are parsed according to values that will be known at run time. The static message text then supplies each of the possible choices in a list, preceded by the name of the switch, and a reference to the value that makes a difference. This resembles the way parser functions are called in MediaWiki. Several types of switches are available. These only work if you do full parsing, or {{-transformation, for the messages.

…on numbers via PLURAL

See also Manual:Messages API#Notes about gender, grammar, plural.

MediaWiki supports plurals, which makes for a nicer-looking product. For example:

'undelete_short' => 'Undelete {{PLURAL:$1|one edit|$1 edits}}',

If there is an explicit plural form to be given for a specific number, it is possible with the following syntax

'Box has {{PLURAL:$1|one egg|$1 eggs|12=a dozen eggs}}.'
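
To make the precedence of explicit forms concrete, the following is a minimal sketch in plain PHP of how such a switch could be resolved for English. The function resolveEnglishPlural() is hypothetical and only models the English singular/plural rule; MediaWiki's own parser and the rules of other languages are more involved.

```php
<?php
/**
 * Hypothetical sketch of resolving a {{PLURAL:...}} switch for English.
 * Not MediaWiki's code; other languages need different rules.
 */
function resolveEnglishPlural( int $count, array $forms ): string {
	$regular = [];
	foreach ( $forms as $form ) {
		// Explicit forms look like "12=a dozen eggs" and win over the rules.
		if ( preg_match( '/^(\d+)=(.*)$/s', $form, $m ) ) {
			if ( (int)$m[1] === $count ) {
				return str_replace( '$1', (string)$count, $m[2] );
			}
			continue;
		}
		$regular[] = $form;
	}
	if ( $regular === [] ) {
		return '';
	}
	// English rule: first form for exactly 1, otherwise the last form given.
	$chosen = ( $count === 1 ) ? $regular[0] : $regular[count( $regular ) - 1];
	return str_replace( '$1', (string)$count, $chosen );
}

$forms = [ 'one egg', '$1 eggs', '12=a dozen eggs' ];
foreach ( [ 1, 5, 12 ] as $n ) {
	echo 'Box has ', resolveEnglishPlural( $n, $forms ), ".\n";
}
```
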
Be aware of PLURAL use on all numbers
See also: Plural

When a number has to be inserted into a message text, be aware that some languages will have to use PLURAL on it even if always larger than 1. The reason is that PLURAL in languages other than English can make very different and complex distinctions, comparable to English 1st, 2nd, 3rd, 4th, … 11th, 12th, 13th, … 21st, 22nd, 23rd, … etc.

Do not try to supply three different messages for cases like "no items counted", "one item counted", "more items counted". Rather, let one message take them all, and leave it to translators and PLURAL to properly treat any possible differences of presentation for them in their respective languages.

Always include the number as a parameter if possible. Always add {{PLURAL:}} syntax to the source messages if possible, even if it makes no sense in English. The syntax guides translators.

Fractional numbers are supported, but the plural rules may not be complete.

Pass the number of list items as parameters to messages talking about lists

Don't assume that there are only singular and plural forms. Many languages have more than two forms, which depend on the actual number used, and their grammar varies with the number of list items when describing a list that is visible to readers. Thus, whenever your code computes a list, include count( $list ) as a parameter to headlines, lead-ins, footers and other messages about the list, even if the count is not used in English. There is a neutral way to talk about invisible lists, so you can have links to lists on extra pages without having to count items in advance.

…on user names via GENDER

See also Manual:Messages API#Notes about gender, grammar, plural.
'foobar-edit-review' => 'Please review {{GENDER:$1|his|her|their}} edits.'

If you refer to a user in a message, pass the user name as parameter to the message and add a mention in the message documentation that gender is supported. If it is likely that GENDER will be used in translations for languages with gender inflections, add it explicitly in the English language source message.

If you directly address the currently logged-in user, leave the user name as parameter empty:

'foobar-logged-in-user' => 'You said {{GENDER:|you were male|you were female|nothing about your gender}}.'
MediaWiki version: 1.31
Gerrit change 398772

If you include the user name in the message (e.g. "$1 thanked you."), consider passing it through wfEscapeWikitext() first, to ensure that characters like * or ; are not interpreted.

Users have grammatical genders
See also: Gender

When a message talks about a user, or relates to a user, or addresses a user directly, the user name should be passed to the message as a parameter. Thus languages having to, or wanting to, use proper gender dependent grammar, can do so. This should be done even when the user name is not intended to appear in the message, such as in "inform the user on his/her talk page", which is better made "inform the user on {{GENDER:$1|his|her|their}} talk page" in English as well.

This does not mean that you are encouraged to "sexualise" messages' language: please use gender-neutral language whenever this can be done with clarity and precision.

…on use context inside sentences via GRAMMAR

See also Manual:Messages API#Notes about gender, grammar, plural.

Grammatical transformations for agglutinative languages are also available. One example is Finnish, where they were an absolute necessity for making language files site-independent, i.e. for removing the Wikipedia references. In Finnish, "about Wikipedia" becomes "Tietoja Wikipediasta" and "you can upload it to Wikipedia" becomes "Voit tallentaa tiedoston Wikipediaan". Suffixes are added depending on how the word is used, plus minor modifications to the base. There is a long list of exceptions, but since only a few words needed to be translated, such as the site name, we didn't need to include it.

MediaWiki has grammatical transformation functions for over 20 languages. Some of these are just dictionaries for Wikimedia site names, but others have simple algorithms which will fail for all but the most common cases.

Even before MediaWiki had arbitrary grammatical transformation, it had a nominative/genitive distinction for month names. This distinction is necessary for some languages if you wish to substitute month names into sentences.

Filtering special characters in parameters and messages

The other (much simpler) issue with parameter substitution is HTML escaping. Despite being much simpler, MediaWiki does a pretty poor job of it.
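
The escaping concern can be illustrated with a small standalone sketch in plain PHP. The helper substituteEscaped() is hypothetical and not part of MediaWiki; it assumes the message text itself is trusted HTML and that only the parameter values need escaping.

```php
<?php
/**
 * Hypothetical helper: substitute $1, $2, ... with HTML-escaped values.
 * The message text is treated as trusted; only parameters are escaped.
 */
function substituteEscaped( string $text, array $params ): string {
	$replacements = [];
	foreach ( array_values( $params ) as $i => $value ) {
		$replacements['$' . ( $i + 1 )] =
			htmlspecialchars( (string)$value, ENT_QUOTES, 'UTF-8' );
	}
	return strtr( $text, $replacements );
}

// A user-supplied value containing markup is rendered as text, not as HTML.
echo substituteEscaped( 'Page last edited by $1', [ '<b>Bob</b> & co' ] ), "\n";
```

In MediaWiki itself, the output format chosen on the Message object decides how much escaping is applied; see Manual:Messages API for the available formats.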

Message documentation

There is a pseudo-language code qqq for message documentation. It is one of the ISO 639 codes reserved for private use. There, we do not keep translations of each message, but collect English sentences about each message: telling us where it is used, giving hints about how to translate it, and enumerating and describing its parameters, link to related messages, and so on. In translatewiki.net, these hints are shown to translators when they edit messages.

Programmers must document each and every message. Message documentation is an essential resource – not just for translators, but for all the maintainers of the module. Whenever a message is added to the software, a corresponding qqq entry must be added as well; revisions which don't do so are marked "V-1" until the documentation is added.

Documentation in qqq files should be edited directly only when adding new messages or when changing an existing English message in a way that requires a documentation change, for example adding or removing parameters. In other cases, documentation should usually be edited in translatewiki. Each documentation string is accessible at https://translatewiki.net/wiki/MediaWiki:message-key/qqq, as if it were a translation. These edits will be exported to the source repositories along with the translations.

Useful information that should be in the documentation includes:

  1. Message handling (parsing, escaping, plain text).
  2. Type of parameters with example values.
  3. Where the message is used (pages, locations in the user interface).
  4. How the message is used where it is used (a page title, button text, etc.).
  5. What other messages are used together with this message, or which other messages this message refers to.
  6. Anything else that could be understood when the message is seen on the context, but not when the message is displayed alone (which is the case when it is being translated).
  7. If applicable, notes about grammar. For example, "open" in English can be both a verb and an adjective. In many other languages the words are different and it's impossible to guess how to translate them without documentation.
  8. Adjectives that describe things, such as "disabled", "open" or "blocked", must always say what are they describing. In many languages adjectives must have the gender of the noun that they describe. It may also happen that different kinds of things need different adjectives.
  9. If the message has special properties, for example, if it is a page name, or if it should not be a direct translation, but adapted to the culture or the project.
  10. Whether the message appears near other messages, for example in a list or a menu. The wording or the grammatical features of the words should probably be similar to the messages nearby. Also, items in a list may have to be properly related to the heading of the list.
  11. Parts of the message that must not be translated, such as generic namespace names, URLs or tags.
  12. Explanations of potentially unclear words, for example abbreviations, like "CTA", or specific jargon, like "template", "suppress" or "stub". (Note that it's best to avoid such words in the first place!)
  13. Screenshots are very helpful. Don't crop – an image of the full screen in which the message appears gives complete context and can be reused in several messages.

A few other hints:

  • Remember that very, very often translators translate the messages without actually using the software.
  • Most usually, translators do not have any context information, neither of your module, nor of other messages in it.
  • A rephrased message alone is useless in most circumstances.
  • Don't use designers' jargon like "nav" or "comps".
  • Consider writing a glossary of the technical terms that are used in your module. If you do it, link to it from the messages.

You can link to other messages by using {{msg-mw|message key}}. Please do this if parts of the messages come from other messages (if this cannot be avoided), or if some messages are shown together or in same context.

translatewiki.net provides some default templates for documentation:

  • {{doc-action|[...]}} for action- messages
  • {{doc-right|[...]}} for right- messages
  • {{doc-group|[...]|[...]}} for messages around user groups (group, member, page, js and css)
  • {{doc-accesskey|[...]}} for accesskey- messages

Have a look at the template pages for more information.

Internationalisation hints

Besides documentation, translators ask to consider some hints so as to make their work easier and more efficient and to allow an actual and good localisation for all languages. Even if only adding or editing messages in English, one should be aware of the needs of all languages. Each message is translated into more than 300 languages and this should be done in the best possible way. Correct implementation of these hints will very often help you write better messages in English, too.

These are the main places where you can find the assistance of experienced and knowledgeable people regarding i18n:

Please do ask there!

Use Message parameters and switches properly

That's a prerequisite of a correct wording for your messages.

Avoid message reuse

Translators discourage message reuse. This may seem counter-intuitive, because copying and duplicating code is usually a bad practice, but in system messages it is often needed. The fact that two concepts can be expressed with the same word in English does not mean that they can be expressed with the same word in every language. "OK" is a good example: in English it is used as a generic button label, but some other languages prefer a button label related to the operation that the button performs. Another example is practically any adjective: a word like "multiple" changes according to gender in many languages, so you cannot reuse it to describe several different things, and you must create several separate messages.

If you are adding multiple identical messages, please add message documentation to describe the differences in their contexts. Don't worry about the extra work for translators. Translation memory helps a lot in these while keeping the flexibility to have different translations if needed.

Avoid patchwork messages

Languages have varying word orders, and complex grammatical and syntactic rules. It's very hard to translate "lego" messages, that is, messages formed from multiple pieces of text, possibly with some indirection (also called "string concatenation").

It is better to make every message a complete phrase. Several sentences can usually be combined into a text block much more easily, if needed. When you want to combine several strings in one message, pass them in as parameters, as translators can order them correctly for their language when translating.

Messages quoting each other

An exception from the rule may be messages referring to one another: 'Enter the original author's name in the field labelled "{{int:name}}" and click "{{int:proceed}}" when done'. This makes the message consistent when a software developer or wiki operator alters the messages "name" or "proceed" later. Without the int-hack, developers and operators would have to be aware of all related messages needing adjustment, when they alter one.

Don't use terms and templates that are specific to particular projects

MediaWiki is used by very diverse people, within the Wikimedia movement and outside of it. Even though it was originally built for an encyclopedia, it is now used for various kinds of content. Therefore, use general terms. For example, avoid terms like "article", and use "page" instead, unless you are absolutely sure that the feature you are developing will only be used on a site where pages are called "articles". Don't use "village pump", which is the name of an English Wikipedia community page, and use a generic term, such as "community discussion page", instead.

Don't assume that a certain template exists on all wikis. Templates are local to wikis. This applies to both the source messages and to their translations. If messages use templates, they will only work if a template is created on each wiki where the feature is deployed. It's best to avoid using templates in messages completely. If you really have to use them, you must document this clearly in the message documentation and in the extension installation instructions.

Separate times from dates in sentences

Some languages have to insert something between a date and a time which grammatically depends on other words in a sentence. Thus, they will not be able to use date/time combined. Others may find the combination convenient, thus it is usually the best choice to supply three parameter values (date/time, date, time) in such cases, and in each translation leave either the first one or last two unused as needed.
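
A minimal sketch of the three-parameter approach in plain PHP follows. The message texts, the language code xx and the wording are invented for illustration, and DateTimeImmutable::format() stands in for the language-aware date formatting that MediaWiki code would normally use.

```php
<?php
// Hypothetical message texts: $1 = date and time, $2 = date, $3 = time.
$messages = [
	'en' => 'Last changed on $1.',
	// Invented wording for a language that must put words between date and time.
	'xx' => 'Last changed on $2 at around $3.',
];

$when = new DateTimeImmutable( '2020-05-17 14:30:00', new DateTimeZone( 'UTC' ) );

// Always supply all three values; each translation uses what it needs.
$params = [
	'$1' => $when->format( 'Y-m-d H:i' ),
	'$2' => $when->format( 'Y-m-d' ),
	'$3' => $when->format( 'H:i' ),
];

foreach ( $messages as $code => $text ) {
	echo $code, ': ', strtr( $text, $params ), "\n";
}
```

The parameter documentation for such a message should state which of the three values each placeholder carries, so translators know they may leave some unused.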

Avoid {{SITENAME}} in messages

{{SITENAME}} has several disadvantages. It can be anything (acronym, word, short phrase, etc.) and, depending on language, may need the use of {{GRAMMAR}} on each occurrence. No matter what, each message having {{SITENAME}} will need review in most wiki languages for each new wiki on which your code is installed. In the majority of cases, when there is not a general GRAMMAR configuration for a language, wiki operators will have to add or amend PHP code so as to get {{GRAMMAR}} for {{SITENAME}} working. This requires both more skills, and more understanding, than otherwise. It is more convenient to have generic references like "this wiki". This does not keep installations from locally altering these messages to use {{SITENAME}}, but at least they don't have to, and they can postpone message adaption until the wiki is already running and used.

Avoid references to visual layout and positions

What is rendered where depends on skins. Most often screen layouts of languages written from left-to-right are mirrored compared to those used for languages written from right-to-left, but not always, and for some languages and wikis, not entirely. Handheld devices, narrow windows, and so on may show blocks underneath each other, that would appear side-by-side on larger displays. Since site- and user-written JavaScript scripts and gadgets can, and do, hide parts, or move things around in unpredictable ways, there is no reliable way of knowing the actual layout.

It is wrong to tie layout information to content languages, since the user interface language may not be the page's content language, and layout may be a mixture of the two depending on circumstances. Non-visual user agents like acoustic screen readers and other auxiliary devices do not even have a concept of visual layout. Thus, you should not refer to visual layout positions in the majority of cases, though semantic layout terms may still be used ("previous steps in the form", etc.).

MediaWiki does not support showing different messages or message fragments based on the current directionality of the interface (see T30997).

The upcoming browser and MediaWiki support for East and North Asian top-down writing[2] will make screen layouts even more unpredictable, with at least eight possible layouts (left/right starting position, top/bottom starting position, and which happens first).

Avoid references to screen colours

The colour in which something is rendered depends on many factors, including skins, site- and user-written JavaScript scripts and gadgets, and local user agent over-rides for reasons of accessibility or technological limitations. Non-visual user agents like acoustic screen readers and other auxiliary devices do not even have a concept of colour. Thus, you should not refer to screen colours. (You should also not rely on colour alone as a mechanism for informing the user of state, for the same reason.)

Have message elements before and after each input field

This is a suggested guideline; it has not become standard in MediaWiki development.

While English allows efficient use of prompting in the form item–colon–space–input-field, many other languages don't. Even in English, you often want to use "Distance: ___ metres" rather than "Distance (in metres): ___". Leaving <textarea> elements aside, you should think of each and every input field following the "Distance: ___ metres" pattern. So:

  • give it two messages, even if the 2nd one is empty in English and some other languages, or
  • allow the placement of inputs via $i parameters.

Avoid untranslated HTML markup in messages

HTML markup not requiring translation, such as enclosing <div>s, rulers above or below, and similar, should usually not be part of messages. It unnecessarily burdens translators, increases message file size, and risks being accidentally altered or skipped in the translation process. In general, avoid raw HTML in messages if you can.

Messages are often longer than you think!

Skimming foreign language message files, you find messages almost never shorter than Chinese ones, rarely shorter than English ones, and most usually much longer than English ones.

Especially in forms, in front of input fields, English messages tend to be terse, and short. That is often not kept in translations. Especially genuinely non-technical third world languages, vernacular, mediæval, or ancient languages require multiple words or even complete sentences to explain foreign, or technical, prompts. For example, the brief English message "TSV file:" may have to be translated in a language as literally:

Please type a name here which denotes a collection of computer data that is comprised of a sequentially organised series of typewritten lines which themselves are organised as a series of informational fields each, where said fields of information are fenced, and the fences between them are single signs of the kind that slips a typewriter carriage forward to the next predefined position each. Here we go: _____ (thank you)

This is, admittedly, an extreme example, but you get the idea. Imagine this sentence in a column in a form where each word occupies a line of its own, and the input field is vertically centered in the next column. :-(

Avoid using very close, similar, or identical words to denote different things, or concepts

For example, pages may have older revisions (of a specific date, time, and edit), comprising past versions of said page. The words revision, and version can be used interchangeably. A problem arises, when versioned pages are revised, and the revision, i.e. the process of revising them, is being mentioned, too. This may not pose a serious problem when the two synonyms of "revision" have different translations. Do not rely on that, however. It is better to avoid the use of "revision" aka "version" altogether, then, so as to avoid it being mis-interpreted.

Basic words may have unforeseen connotations, or not exist at all

There are some words that are hard to translate because of their very specific use in MediaWiki. Some may not be translated at all. For example, there is no word "user" relating to "someone who uses something" in several languages. Similarly, in Kölsch the English words "namespace" and "apartment" translate to the same word. Sticking to Kölsch, they say "corroborator and participant" in one word since any reference to "use" would too strongly imply "abuse" as well. The term "wiki farm" is translated as "stable full of wikis", since a single-crop farm would be a contradiction in terms in the language, and not understood, etc.

Expect untranslated words

This is a suggested guideline; it has not yet become standard in MediaWiki development.

It is not uncommon that proper names, tag names, etc. and computerese in English are not translated, and instead taken as loan-words, or foreign words. In the latter case, some particularly-fastidious translators may mark such words as belonging to another language with HTML markup, such as <span lang="en" xml:lang="en"></span>.

You may want to consider ensuring that your message output handler passes such markup along unmolested, despite the obvious security risks.

Permit explanatory inline markup

This is a suggested guideline; it has not yet become standard in MediaWiki development.

Sometimes there are abbreviations, technical terms, or generally ambiguous words in target languages that may not be immediately understood by newcomers, but are obvious to experienced computer users. So as to avoid screen clutter of lengthy explanations without leaving newcomers stranded, translators may choose to add explanations as <abbr> annotations, shown by browsers when you move the mouse over them.

For example, the MediaWiki core message exif-orientation-8 about image rotation, which in English is simply "Rotated 90° CW", in Moroccan Arabic is translated as:

mḍwwer 90° <abbr title="Ĝks (ṫ-ṫijah) Ĝaqarib s-Saĝa">ĜĜS</abbr>

giving:

mḍwwer 90° ĜĜS

explaining the abbreviation for "counter clockwise" when needed.

You may want to consider ensuring that your message output handler passes such markup along unchanged, even if the original message does not use it.

Use <code>, <var>, and <kbd> tags where needed

When talking about technical parameters, values, or keyboard inputs, mark them as such using the HTML tags <code>, <var>, or <kbd>. This sets them off typographically from the normal text. That clarifies their sense to readers, avoiding confusion, errors, and misrepresentations. Ensure that your message handler allows such markup.
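To illustrate what "allowing such markup" can mean for an output handler, here is a minimal sketch in plain PHP. The function name escapeKeepingInlineTags is hypothetical and not a MediaWiki API. The sketch escapes the whole message and then re-enables only the attribute-less tags <code>, <var>, and <kbd>. Tags that carry attributes, such as <abbr title="…"> or <span lang="…">, would need additional, careful attribute checking that is not shown here.

```php
<?php
// Hypothetical helper, not a MediaWiki API: escape a message for HTML output
// while keeping a fixed set of attribute-less inline tags.
function escapeKeepingInlineTags( string $message ): string {
	// Escape everything first, so that no markup survives by default.
	$escaped = htmlspecialchars( $message, ENT_QUOTES, 'UTF-8' );
	// Re-enable only <code>, <var> and <kbd>, in opening and closing form.
	return preg_replace( '/&lt;(\/?)(code|var|kbd)&gt;/', '<$1$2>', $escaped );
}

echo escapeKeepingInlineTags( 'Set <var>limit</var> & press <kbd>Enter</kbd> <script>x</script>' ), "\n";
```

The listed tags survive, while the ampersand and the <script> element are escaped. The sketch does not check that tags are balanced.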

Symbols, colons, brackets, etc. are parts of messages

Many symbols are localisable, too. Some scripts have other kinds of brackets than the Latin script has. A colon may not be appropriate after a label or input prompt in some languages. Having those symbols included in messages helps to make better and less Anglo-centric translations, and also reduces code clutter.

For example, there are different quotation mark conventions used in «Norwegian», »Swedish», »Danish«, „German“, and 「Japanese」.[3]

If you need to wrap some text in localized parentheses, brackets, or quotation marks, you can use the parentheses ($1) or brackets [$1] or quotation-marks "$1" messages like so:

wfMessage( 'parentheses' )->rawParams( /* text to go inside parentheses */ )->escaped()
wfMessage( 'brackets' )->rawParams( /* text to go inside brackets */ )->escaped()
wfMessage( 'quotation-marks' )->rawParams( /* text to go inside quotation marks */ )->escaped()

Do not expect symbols and punctuation to survive translation

Languages written from right to left (as opposed to English) usually swap the arrow symbols presented with "next" and "previous" links, and their placement relative to the message text may or may not be inverted as well. An ellipsis may be translated to "etc." or to words. Question marks, exclamation marks, and colons may be placed somewhere other than at the end of a sentence, omitted, or doubled. As a consequence, always include all of these in the text of your messages, and never try to insert them programmatically.
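The difference between punctuation inside the message and punctuation added by code can be sketched as follows. The $messages array, the key name-label, and both functions are hypothetical stand-ins for a real message store. The French label uses an ordinary space before the colon for simplicity.

```php
<?php
// Hypothetical message store for illustration only.
$messages = [
	'en' => [ 'name-label' => 'Name:' ],
	'fr' => [ 'name-label' => 'Nom :' ], // French typography puts a space before the colon
];

// Preferred: look up a complete label, punctuation included.
function label( array $messages, string $lang, string $key ): string {
	return $messages[$lang][$key];
}

// Anti-pattern: the code decides on the punctuation,
// so every language gets an English-style colon.
function labelWithHardcodedColon( string $word ): string {
	return $word . ':';
}

echo label( $messages, 'fr', 'name-label' ), "\n";
echo labelWithHardcodedColon( 'Nom' ), "\n";
```

With the first function the translator controls the colon and its spacing. With the second the translator cannot adjust either.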

Use full stops

Do terminate normal sentences with full stops. This is often the only indicator for a translator to know that they are not headlines or list items, which may need to be translated differently.

Link anchors

Wikitext of links

Link anchors can be put into messages in several technical ways:

  1. via wikitext: … [[a wiki page|anchor]] …,
  2. via wikitext: … [some-url anchor] …, or
  3. the anchor text is a message in the MediaWiki namespace. Avoid it!

The last option is often hard or impossible for translators to handle; avoid fragmented or "patchwork" messages here, too. Make sure that "some-url" does not contain spaces.

Use meaningful link anchors

Take care with your wording. Link anchors play an important role in search engine assessment of pages, both the words linked and the target anchor. Make sure that the anchor describes the target page well. Always avoid commonplace and generic words. For example, "Click here" is an absolute no-go,[4] since target pages are almost never about "click here". Do not put that in sentences around links either, because "here" is not the place to click. Instead, use precise action words that tell users what they will get to when following the link, such as "You can upload a file if you wish."

See also Help users predict where they are going and mystery meat navigation.

Avoid jargon and slang

Avoid developer and power user jargon in messages. Try to use simple language whenever possible. Avoid saying "success", "successfully", "fail", "error occurred while", etc., when you want to notify the user that something happened or didn't happen. Such wording comes from developers seeing everything as true or false, but users usually just want to know what actually happened or didn't, and what they should do about it (if anything). So:

  • "The file was successfully renamed" -> "The file was renamed"
  • "File renaming failed" -> "There is a file with this name already. Please choose a different name."

One sentence per line

This is a suggested guideline; it has not yet become standard in MediaWiki development.

Try to have one sentence or similar block in one line. This helps to compare the messages in different languages, and may be used as a hint for segmentation and alignment in translation memories.

Be aware of whitespace and line breaks

MediaWiki's localised messages usually get edited within the wiki, either by wiki operators on live wikis, or by the translators on translatewiki.net. You should be aware of how whitespace, especially at the beginning or end of your message, will affect editors:

  • Newlines at the beginning or end of a message are fragile, and will be frequently removed by accident. Start and end your message with active text; if you need a newline or paragraph break around it, your surrounding code should deal with adding it to the returned text.
  • Spaces at the beginning or end of a message are also likely to be removed during editing, and should be avoided. If a space is required for output, usually your code should append it or else you should use a non-breaking space such as &nbsp; (in which case check your escaping settings!)
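One way to keep surrounding whitespace out of the message text is to let the calling code add it, as in this minimal sketch. The helper name asParagraph is hypothetical.

```php
<?php
// Hypothetical helper: the message is stored without surrounding whitespace,
// and the calling code adds the paragraph breaks it needs.
function asParagraph( string $message ): string {
	// trim() guards against stray whitespace left over from editing.
	return "\n\n" . trim( $message ) . "\n\n";
}
```

The stored message can then start and end with active text, and an accidental removal of whitespace during translation does not change the output.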

Use standard capitalisation

Capitalisation gives hints to translators as to what they are translating, such as single words, list or menu items, phrases, or full sentences. Correct (standard) capitalisation may also play a role in search engines' assessment of your pages. MediaWiki uses sentence case (The quick brown fox jumps over the lazy dog) in interface messages.

Always remember that many writing systems don't have capital letters at all, and some of those that do have them use them differently from English. Therefore, don't use ALL-CAPS for emphasis. Use CSS, or HTML <em> or <strong> per below:

Emphasis

In normal text, emphasis such as boldface or italics should be part of message texts. Local conventions on emphasis often vary; some Asian scripts in particular have their own. Translators must be able to adjust emphasis to their target languages and areas. Try to use "<em>" and "<strong>" in your user interface to allow mark-up on a per language or per script basis.

In modern screen layouts of English and European styles, emphasis is used less and less. Still convey it in your message documentation, as it may give valuable hints as to how to translate. Emphasis can and should be used in other cultural contexts as appropriate, provided that translators know about it.

Overview of the localisation system

Update of localisation

As mentioned above, translation happens on translatewiki.net and other systems are discouraged. Here's a high-level overview of the localisation update workflow:

  • Developers add or change system messages.
  • Users translate the new or changed system messages on translatewiki.net.
  • Automated tools export these messages, build new versions of the message files, incorporating the added or updated messages, for both core and extensions, and commit them to git.
  • The wikis then can pull in the updated system messages from the git repository.

Wikimedia projects and any other wikis can benefit immediately and automatically from localisation work thanks to the LocalisationUpdate extension.[5] This compares the latest English messages to the English messages in production. If they are the same, the production translations are updated and made available to users.

Once translations are in the version control system, the Wikimedia Foundation has a daily job that updates a checkout or clone of the extension repository. This was first established in September 2009.[6]

Because changes on translatewiki.net are pushed to the code daily as well, each change to a message can potentially be applied to all existing MediaWiki installations in a couple of days without any manual intervention or traumatic code update.

As you can see, this is a multi-step process. Over time, we have found that many things can go wrong. If you think the process is broken, please report it on our Support page, or create a new bug in Phabricator. Always be sure to describe a precise observation.

Handling support requests

Main page: translatewiki:Translating:Localisation for developers.

Translators may have questions about some of the messages you create. Translatewiki.net provides a support request system that lets translators ask you, the project owner, questions about messages so that they can be translated better. This short tutorial guides you through the workflow of handling translatewiki.net support requests.

Message sources

Code looks up system messages from these sources:

  • The MediaWiki namespace. This allows wikis to adopt, or override, all of their messages, when standard messages do not fit or are not desired (see #Old local translation system).
    • MediaWiki:Message-key is the default message,
    • MediaWiki:Message-key/language-code is the message to be used when a user has selected a language other than the wiki's default language.
  • From message files:
    • Core MediaWiki itself and most currently maintained extensions use a file per language, named zyx.json, where zyx is the language code for the language.
    • Some older extensions use a combined message file holding all messages in all languages, usually named MyExtensionName.i18n.php.
    • Many Wikimedia Foundation wikis access some messages from the WikimediaMessages extension, allowing them to standardise messages across WMF wikis without imposing them on every MediaWiki installation.
    • A few extensions use other techniques.
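The page naming scheme for overrides in the MediaWiki namespace can be sketched like this. The function overridePageTitle is hypothetical, and capitalising the first letter of the key is an assumption based on the "MediaWiki:Message-key" form shown above.

```php
<?php
// Hypothetical helper: name of the MediaWiki-namespace page that overrides a message.
function overridePageTitle( string $key, string $userLang, string $wikiLang ): string {
	// Assumption: the page name is the message key with its first letter capitalised.
	$title = 'MediaWiki:' . ucfirst( $key );
	// A language subpage is only used for languages other than the wiki's default.
	return $userLang === $wikiLang ? $title : $title . '/' . $userLang;
}

echo overridePageTitle( 'mainpage', 'en', 'en' ), "\n";
echo overridePageTitle( 'mainpage', 'de', 'en' ), "\n";
```

On a wiki whose default language is English, the first call names the default override page and the second names the German subpage.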

Caching

System messages are one of the more significant components of MediaWiki, primarily because they are used in every web request. The PHP message files are large, since they store thousands of message keys and values. Loading such a file (and possibly multiple files, if the user's language is different from the content language) has a large memory and performance cost. An aggressive, layered caching system is used to reduce this performance impact.

MediaWiki has many built-in caching mechanisms, which make the code somewhat harder to understand. Since 1.16 there is a new caching system, which caches messages either in .cdb files or in the database. Customised messages are cached in the filesystem and in memcached (or an alternative), depending on the configuration.

The table below gives an overview of the settings involved:

Location of cache storage, by $wgLocalisationCacheConf 'store' value and $wgCacheDirectory setting:

  'store' value                          $wgCacheDirectory = false (default)   $wgCacheDirectory = path
  'db'                                   l10n cache table                      l10n cache table
  'detect' (default)                     l10n cache table                      local filesystem (CDB)
  'files'                                error (path undefined)                local filesystem (CDB)
  'array' (experimental, MW ≥ 1.26)      error (path undefined)                local filesystem (PHP array)

MediaWiki version: 1.27.0 – 1.27.2
Gerrit #Id3e2d2

In MediaWiki 1.27.0 and 1.27.1, the autodetection was changed to favor the file backend. With 'store' => 'detect' (the default), the file backend is used with the path from $wgCacheDirectory. If this value is not set (which is the default), a temporary directory determined by the operating system is used. If a temporary directory cannot be detected, the database backend is used as a fallback. This was reverted in 1.27.2 and 1.28.0 because of file conflicts on shared hosts and security issues (see T127127 and T161453).
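As a configuration sketch, the file backend can be selected explicitly rather than left to autodetection. The fragment below is meant for LocalSettings.php; the cache path is only an example, and the directory is assumed to exist and to be writable by the web server.

```php
// LocalSettings.php fragment (a sketch; the path is an example, not a recommendation)
$wgCacheDirectory = "$IP/cache";              // where the cache files are written
$wgLocalisationCacheConf['store'] = 'files';  // use the file backend instead of 'detect'
```

Setting both values avoids depending on the autodetection behaviour, which differs between the versions described above.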

Function backtrace

To depict the layers of caching more clearly, here is a function backtrace of the methods called when retrieving a message. See the sections below for an explanation of each layer.

  • Message::fetchMessage()
  • MessageCache::get()
  • Language::getMessage()
  • LocalisationCache::getSubitem()
  • LCStore::get()

MessageCache

The MessageCache class is the top level of caching for messages. It is called from the Message class and returns the final raw contents of a message. This layer handles the following logic:

The last bullet is important. Language fallbacks allow MediaWiki to fall back on another language if the original does not have a message being asked for. As mentioned in the next section, most of the language fallback resolution occurs at a lower level. However, only the MessageCache layer checks the database for overridden messages. Thus integrating overridden messages from the database into the fallback chain is done here. If not using the database, this entire layer can be disabled.
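The idea of combining database overrides with a language fallback chain can be illustrated with a small, self-contained sketch. This is not the actual MessageCache algorithm: the function lookUpMessage, the array shapes, and the order in which overrides and file messages are consulted are all assumptions made for illustration.

```php
<?php
/**
 * Illustrative lookup only; not the actual MessageCache algorithm.
 * $overrides:    [ lang => [ key => text ] ], as stored in the MediaWiki namespace.
 * $fileMessages: [ lang => [ key => text ] ], as loaded from message files.
 * $fallbacks:    [ lang => [ fallback languages ] ]; English is the final fallback.
 */
function lookUpMessage(
	string $key, string $lang, array $overrides, array $fileMessages, array $fallbacks
): ?string {
	$chain = array_unique( array_merge( [ $lang ], $fallbacks[$lang] ?? [], [ 'en' ] ) );
	foreach ( $chain as $code ) {
		// In this sketch, a local override wins over the shipped message for the same language.
		if ( isset( $overrides[$code][$key] ) ) {
			return $overrides[$code][$key];
		}
		if ( isset( $fileMessages[$code][$key] ) ) {
			return $fileMessages[$code][$key];
		}
	}
	return null; // the message does not exist in any language of the chain
}
```

For example, with a chain of de-at, de, en, a request for a key that is only overridden locally in German returns that German override before the English file message is considered.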

LocalisationCache

See LocalisationCache.

LCStore

The LCStore class is merely a back-end implementation used by the LocalisationCache class for actually caching and retrieving messages. As with the BagOStuff class, which is used for general caching in MediaWiki, there are a number of different cache types (configured using $wgLocalisationCacheConf):

  • "db" (default) - Caches messages in the database
  • "files" (default if $wgCacheDirectory is set) - Uses CDB to cache messages in a local file
  • "accel" - Uses APC or another opcode cache to store the data

The "files" option is used by the Wikimedia Foundation, and is recommended because it is faster than going to the database and more reliable than the APC cache, especially since APC is incompatible with PHP versions 5.5 or later.

License

Any edits made to the language files must be licensed under the terms of the GNU General Public License to be included in the MediaWiki software. Extensions may be under different licences.

Old local translation system

With MediaWiki 1.3.0, a new system was set up for localising MediaWiki. Instead of editing the language file and asking developers to apply the change, users could edit the interface strings directly from their wikis. This is the system in use as of August 2005. People can find the message they want to translate in Special:AllMessages and then edit the relevant string in the MediaWiki: namespace. Once edited, these changes are live. There was no more need to request an update, and wait for developers to check and update the file.

The system is great for Wikipedia projects; however a side effect is that the MediaWiki language files shipped with the software are no longer quite up-to-date, and it is harder for developers to keep the files on meta in sync with the real language files.

As the default language files do not provide enough translated material, we face two problems:

  1. New Wikimedia projects created in a language which has not been updated for a long time, need a total re-translation of the interface.
  2. Other users of MediaWiki (including Wikimedia projects in the same language) are left with untranslated interfaces. This is especially unfortunate for the smaller languages which don't have many translators.

This is not such a big issue anymore, because translatewiki.net is advertised prominently and used for almost all translations. Local translations still happen sometimes, but they are strongly discouraged. Local messages mostly have to be deleted, moving the relevant translations to translatewiki.net and leaving only the site-specific customisation on the wiki. There is a huge backlog, especially in older projects; this tool helps with cleanup.

Keeping messages centralised and in sync

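English messages are very rarely out of sync with the code. Experience has shown that it is convenient to have all the English messages in the same place. Like translation, revising the English text can be done without reference to the code. Programmers sometimes make very poor choices for the default text.

Appendix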

What can be localised

So many things are localisable in MediaWiki that not all of them are directly available on translatewiki.net: see translatewiki:Translating:MediaWiki. If something requires a developer's intervention in the code, you can request it on Phabricator, or ask at translatewiki:Support if you don't know exactly what to do.

Figure: diagram of language fallbacks
  • Fallback languages (that is, other languages more closely related than the default fallback, English, to be used when a translation is not available)
  • Directionality (left to right, LTR, or right to left, RTL)
  • Direction mark character depending on RTL
  • Arrow depending on RTL
  • Languages where italics cannot be used
  • Number formatting (comma-ify, i.e. adding or not adding digit separators; transform digits; transform separators)[7]
  • Truncate (multibyte)
  • Grammar conversions for inflected languages
  • Plural transformations
  • Formatting expiry times[clarification needed]
  • Segmentation for diffs (Chinese)
  • Conversion to language variants (between different orthographies or scripts)
  • Language-specific user preference options
  • Link trails and link prefix, e.g. [[foo]]bar. These are letters that can be glued after/before the closing/opening brackets of a wiki link, but appear rendered on the screen as if part of the link (that is, clickable and in the same colour). By default the link trail is "a-z"; you may want to add the accented or non-Latin letters used by your language to the list.
  • Language code (preferably used according to the latest RFC in standard BCP 47, currently RFC 5646, with its associated IANA database. Avoid deprecated, grandfathered and private-use codes: look at what they mean in standard ISO 639, and avoid codes assigned to collections/families of languages in ISO 639-5, and ISO 639 codes which were not imported into the IANA database for BCP 47)
  • Type of emphasis
  • The Cite extension has a special page file per language, cite_text-zyx for language code zyx.
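How a link trail pattern splits the text that follows a link can be sketched with regular expressions. The two patterns below are assumptions for illustration: one accepts only a-z, in line with the default described in the list above, and one additionally accepts some German letters. The helper splitLinkTrail is hypothetical.

```php
<?php
// Assumed patterns for illustration: plain a-z, and a variant extended with German letters.
$defaultTrail = '/^([a-z]+)(.*)$/sD';
$germanTrail = '/^([äöüßa-z]+)(.*)$/sDu';

// Split the text that follows "]]" into the part glued to the link and the rest.
function splitLinkTrail( string $pattern, string $textAfterLink ): array {
	if ( preg_match( $pattern, $textAfterLink, $m ) ) {
		return [ $m[1], $m[2] ];
	}
	// No trail letters directly after the link.
	return [ '', $textAfterLink ];
}
```

With the a-z pattern, the text "s are nice" after a link yields the trail "s"; a trail starting with "ä" is only recognised by the extended pattern.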

Neat functionality:

  • I18N sprintfDate
  • Roman numeral formatting

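Namespace name aliases

Namespace name aliases are additional names that can be used to address existing namespaces. They are rarely needed, but not having them when they are needed usually causes havoc in existing wikis.

You may need namespace name aliases in the following cases:

  1. When a language has variants, these variants spell some namespaces differently, and you want editors to be able to use the variant spellings. Variants are selectable in the user preferences. Users always see their selected variant, except in wikitext, but when editing or searching, an arbitrary variant can be used.
  2. When an existing wiki's language, fallback language(s), or localisation is changed, and some namespace names change with it. So as not to break links already present in the wiki that use the old namespace names, you need to add each altered previous namespace name to its namespace name aliases when, or before, the change is made.

The generic English namespace names are always present as namespace name aliases in all localisations, so you need not, and should not, add those.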
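The effect of an alias can be shown with a small lookup sketch. All names here are hypothetical: the constant NS_FILE_ID, the two tables, and the function resolveNamespace are illustrative and do not reproduce MediaWiki's own data structures. The German names are used only as an example of a renamed namespace.

```php
<?php
// Hypothetical namespace tables for illustration; the ID and names are assumptions.
const NS_FILE_ID = 6;
$namespaceNames = [ 'Datei' => NS_FILE_ID ];   // current localised name
$namespaceAliases = [ 'Bild' => NS_FILE_ID ];  // previous localised name, kept as an alias

// Resolve a namespace prefix typed by a user to a namespace ID, or null if unknown.
function resolveNamespace( string $prefix, array $names, array $aliases ): ?int {
	return $names[$prefix] ?? $aliases[$prefix] ?? null;
}
```

Links written with the old prefix keep resolving to the same namespace after the rename. The sketch ignores case folding and the English names, which the text above says are always available.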

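Aliases cannot be translated on translatewiki.net, but can be requested there or on bugzilla. For details, see translatewiki:Translating:MediaWiki#Namespace name aliases.

Regional settings

Some linguistic settings vary across geographies. MediaWiki has no concept of region; it only has languages and language variants.

These settings need to be set once as a language's default; individual wikis can then change them to whatever suits them best.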

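Time and date formats

Times and dates are shown on special pages and elsewhere. The default time and date format is used for signatures, so it should be the format most familiar to and most widely understood by speakers of that language. Anonymous users also see the default format. Registered users can choose other formats in their preferences.

If you are familiar with PHP's date() format, you can try to construct formats on your own. MediaWiki uses a similar format, with some extra features. If you don't understand the previous sentence, that's fine. You can give the developers a list of examples instead.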
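For readers unfamiliar with PHP's date() format characters, here is a short sketch using plain PHP. It does not use MediaWiki's own formatter, whose format strings are similar but not identical. gmdate() and a fixed timestamp are used so that the result does not depend on the server's time zone.

```php
<?php
// Plain PHP format characters: H = hours, i = minutes, j = day of month,
// F = full month name, Y = four-digit year, m = month number, d = two-digit day.
$timestamp = 0; // the Unix epoch, chosen so the output is predictable

echo gmdate( 'H:i, j F Y', $timestamp ), "\n"; // 00:00, 1 January 1970
echo gmdate( 'Y-m-d H:i', $timestamp ), "\n";  // 1970-01-01 00:00
```

A list of example outputs like these, written the way speakers of the language expect them, is enough for a developer to derive the format string.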

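Legacy edit window toolbar buttons

This is not the same as the more common "enhanced toolbar" used by WikiEditor, although it functions similarly.

When a wiki page is being edited, and the user has enabled it in Special:Preferences, a set of icons is displayed above the text area where one can edit. Alternative buttons can be shown on the toolbar,[1] but the exact procedure is not documented. A set of .png files of appropriate size is all that is needed. See commons:Category:ButtonToolbar (archived version) for plenty of samples; a blank button image without lettering is available as a starting point.

Note: this requires that your language is already enabled in MediaWiki, which usually means that a good portion of its messages has been translated. If that is not yet the case, you will have to wait and try again later.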

Missing

This section is missing information about the changes to the i18n system related to extensions. The format was standardised, and messages are loaded automatically.

See #Message sources.

Footnotes

  1. https://gerrit.wikimedia.org/r/211677
  2. http://dev.w3.org/csswg/css3-writing-modes/
  3. w:Quotation_mark#Summary_table
  4. http://www.w3.org/QA/Tips/noClickHere
  5. Which works through the localisation cache and for instance on Wikimedia projects updates it daily; see also the technical details about the specific implementation.
  6. LocalisationUpdate update; LocalisationUpdate is live.
  7. These are configured by language in the respective language/classes/LanguageXx.php or language/messages/MessagesXx.php files.

See also