Help:Extension:Translate/Page translation administration

The page translation feature allows controlled translation of wiki pages into other languages. This means that the content of each translation corresponds to the source page. This is in contrast to, for example, the different language versions of articles in different Wikipedias, which are usually independent of each other. It is assumed that pages are only translated from one primary language into other languages, but translators can take advantage of translations in other languages too, if they exist.

Without any help, translating more than a few pages into other languages becomes a time-waster at best and an unmaintainable mess at worst. With the page translation feature you can avoid the mess and bring structure to the translation process. The core idea is that the source text is segmented into smaller units, each of which is translated individually. Because the source text is segmented into units, all changes can be isolated, and translators only need to update the translations of units whose source text has changed. It also enables translators to work on units of manageable size, share the work between multiple translators, or continue the translation in later sessions, because they don't need to do it all at once.

This page elaborates on the page translation tutorial by providing deeper insight into how the system works and by suggesting best practices for a wide variety of cases. It is intended for page translation administrators and, more generally, for everyone who edits the source text of translatable pages, even if they don't have access to the administrative features for approving changes for translation. Development-oriented topics, including known issues and future plans, are documented at the page translation reference page.

Process overview

 * roles
 * pages get written
 * marked up
 * translated
 * changed
 * translations updated
 * discouraged, moved, deleted

The translatable parts of the source page must be marked with <translate> tags. There can be as many such parts in one page as needed. The system will complain and prevent saving if it is unable to make sense of the result, for example if a closing tag is missing.

Once those parts are tagged and the page is saved, the system will detect that the page is prepared for translation. It will suggest marking the page for translation to users who have the user right to do so. The system will split everything inside <translate> tags into sections. Each section is separated from the others by one or more empty lines. For this reason, each section roughly corresponds to a paragraph, and I may use the term paragraph interchangeably with translatable section. Once the user has accepted the suggested sections, the page is ready for translation. Users who have the user right to translate pages are now invited to do so when viewing the source page or any of its translations.
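The splitting rule described above can be illustrated with a short Python sketch. It is not the extension's own code: the function name `split_into_units` is hypothetical, and the sketch only models the two behaviours named in the text, namely rejecting unbalanced tags and splitting on one or more empty lines.

```python
import re

# Matches the content of each <translate>...</translate> pair.
TRANSLATE_BLOCK = re.compile(r"<translate>(.*?)</translate>", re.DOTALL)


def split_into_units(wikitext: str) -> list[str]:
    """Return the translatable sections found inside <translate> tags."""
    # Simplified stand-in for the "missing close tag" complaint.
    if wikitext.count("<translate>") != wikitext.count("</translate>"):
        raise ValueError("unbalanced <translate> tags")

    units = []
    for block in TRANSLATE_BLOCK.findall(wikitext):
        # One or more empty lines separate the sections.
        for chunk in re.split(r"\n\s*\n", block):
            chunk = chunk.strip()
            if chunk:
                units.append(chunk)
    return units


page = """Intro that is not translated.
<translate>
Birds are animals.

Birds can fly.
</translate>
"""

print(split_into_units(page))
```

Text outside the tags, such as the intro line in the sample page, is never offered to translators.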

In the meantime, the source page can be changed freely while translators keep translating the version that was marked for translation. Only after a user with the right to mark pages for translation has reviewed the changes and accepted the new version will existing translations be marked as outdated if needed, and a new version of the translatable sections be shown to translators. The page translation log has entries showing who marked which version of a page for translation and who took pages out of translation.

Segmentation and markup

 * General principles
 * Headings
 * Images
 * Tables
 * Categories

Changing the source

 * General principles
 * Merging, splitting, deleting and adding
 * Reviewing and applying the changes

Special cases

 * templates
 * categories
 * file pages

Todo:
 * describe some edge cases and ways of approaching them (e.g. userbase approach vs. twn approach, where userbase has larger paragraphs including titles, and twn has sections that are as small as possible, preferably without any markup).
 * How to prepare a page for translation and manage translated pages
 * What can or can't be done <- definitely describe template approaches (with /langcode). It's not great, but at least it works somewhat.
 * How to convert from previous systems (Meta-Wiki example?), considering different approaches as above. <- a great deal of work, which needs to be planned correctly
 * Things not explained in the tutorial and left to this page:
 * changing names to units,
 * standard and custom splitting in units (by default triggered by two newlines etc.),
 * what's Fuzzy,
 * how discouraging works,
 * translation of templates and categories,[//meta.wikimedia.org/w/index.php?title=Terms_of_use&diff=3031223&oldid=3028647]
 * copyediting with the least translation invalidation,[//meta.wikimedia.org/w/index.php?title=Terms_of_use&diff=3201211&oldid=3201011]
 * how much parser functions and advanced wikitext the extension can bear,
 * what happens if the page has subpages and such (even with bug 33636 fixed),
 * how protection works.
 * including pages with /en

System administrators
It is assumed that the Translate extension is already installed and its basic configuration is done.

To enable the page translation feature, complete the following steps:
 * 1) Add   to your
 * 2) Rerun the installer or run   to create the new tables needed for this feature.

After that, you should assign the pagetranslation right to a suitable user group. Users in that group can mark pages for translation. You should also have the translator group with the translate right. For example:

If you have not already done so, it is recommended to install the following extensions:
 * mw:Extension:CleanChanges – hides extra translation edits from recent changes and provides more filters
 * mw:Extension:CLDR – provides localized language names in many languages

Page translation only setup
Copy the following configuration to enable only the page translation feature, without any file-based message groups. It also does not enable any machine translation connections or translation memory.

Content management
This documentation is for those who are responsible for marking the translatable parts of wiki pages and marking the pages for translation. Parts of it are useful for almost anyone who is likely to see the markup in the page source. There may be dedicated translation managers, or the responsibility may fall on the content writer's shoulders. Either way, this manual is for you and you should take the time to read it. First I will explain the syntax and how to deal with it, then the actual marking of pages for translation, and finally how the syntax works and what its limits are.

Editing translatable pages
When a page is marked for translation, the software will add some marks to it. See the example below. These marks correspond to the named sections and they are used to keep track of each section.

<translate>
== Birds == <!--T:1-->
Birds are animals which....

<!--T:2-->
Birds can fly and...
</translate>

The marks are always placed on the line before the section or, if the section starts with a header, after the first header on the same line. The reason for the different placement with headers is to keep section editing working as expected.

It is not recommended to add these marks manually or tamper with them. The software doesn't currently enforce their validity and sanity very rigorously. The best practices are:


 * If a section is moved, the section marker tag should be moved with it.
 * If a section is deleted, delete the section marker tag too.
 * If a new section is added, do not add a mark to it. The software will generate one for it when the page is marked for translation.
 * If the section content changes so much that old translations would no longer make any sense, delete the section marker tag.
 * When merging multiple sections, keep only one section marker tag.
 * When splitting a section into multiple sections, keep the section marker tag only on the first new section, or not at all.
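Some of the practices above can be checked mechanically before a new version is marked. The following Python sketch compares the marks in the last marked revision with those in the current text. It is an illustrative helper, not part of the extension: the function name `review_markers` and the report keys are hypothetical, and the mark pattern is taken from the `<!--T:1-->` example shown earlier.

```python
import re
from collections import Counter

# Section marks look like <!--T:1--> in the page source.
MARKER = re.compile(r"<!--T:(.+?)-->")


def review_markers(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Compare section marks between the last marked and the current text."""
    old_ids = set(MARKER.findall(old_text))
    new_counts = Counter(MARKER.findall(new_text))
    return {
        # Marks that disappeared: deleted, merged or rewritten sections.
        "removed": sorted(old_ids - set(new_counts)),
        # Marks that the software never assigned: probably added by hand.
        "hand_added": sorted(set(new_counts) - old_ids),
        # The same mark used twice, for example after a careless split.
        "duplicated": sorted(i for i, n in new_counts.items() if n > 1),
    }


old = "<!--T:1-->\nBirds are animals.\n\n<!--T:2-->\nBirds can fly."
new = (
    "<!--T:1-->\nBirds are animals.\n\n"
    "<!--T:1-->\nMost of them have feathers.\n\n"
    "<!--T:7-->\nSome birds swim."
)

print(review_markers(old, new))
```

A non-empty "hand_added" or "duplicated" list suggests that the page source should be corrected before the new version is marked for translation.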

When marking a new version for translation, wikitext differences are shown for each section. New and deleted sections are also listed. The person who marks can then do the final review, before creating new work for translators. Then all the translation pages are updated to match the new template. If there are many translations for the page, the updates are handled with the MediaWiki job queue and are not immediate.

Before marking the new version of the page for translation, ensure that the best practices above are followed, especially that translators get a new section if the content has changed. Also make sure that there are no unnecessary changes, to avoid wasting translators' time. If the source page is getting many changes, it may be worthwhile to wait for it to stabilize and only then push the work to translators.

The software does not check whether a previously used section id has been retired and then taken into use again. Translators will see such a section as a changed message, with the difference shown against the old content. Unused section translations and translation pages are not cleaned up automatically, but that should not cause trouble.

Marking process
Once the page is ready and the translatable content is marked, it is time to put it up for translation. After the page is saved, users with the pagetranslation right will see a link at the top of the page. It is also possible to go to Special:PageTranslation and select the page from the list.

At Special:PageTranslation you will see a list of pages divided into groups according to their status. First you will see the list of pages that are marked for translation. Pages with pending unapproved changes are highlighted. For those pages you can see the normal diff between the last marked version and the current version. Only the latest version of a page can be marked for translation. In the other groups you will see pages that can be marked for translation, and pages that were previously marked for translation but whose latest version cannot be marked for translation.

Marking page for translation
If you choose to mark this version for translation, you will see an approval page. The page contains the section division, which shows how the page is split into small, manageable parts. You can give each section a short name. By default, a running number starting from 1 is used. The names are permanent, so think up a good, future-proof name if you choose one. Each section will have corresponding translation pages whose names are of the form Translations:Source page namespace:Source page name/section name/language code. These have to be unique and valid titles, so two sections cannot have the same name. Remember that spaces, underscores and even dots are considered the same character in the equality comparison. The main reason to give names is to keep Special:RecentChanges more readable if section page translations are not hidden with the CleanChanges extension. Translators will see the sections in the order in which they appear on the page, regardless of the section names.
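The uniqueness rule for section names can be sketched in Python as follows. This is only an illustration under the assumption stated in the text, that spaces, underscores and dots compare as the same character. The helper names `comparison_key` and `find_name_clashes` are hypothetical, and other title rules of MediaWiki are deliberately not modelled.

```python
import re
from collections import defaultdict


def comparison_key(name: str) -> str:
    """Map spaces, underscores and dots to one character (assumed rule)."""
    return re.sub(r"[ _.]", "_", name)


def find_name_clashes(names: list[str]) -> list[list[str]]:
    """Group section names that would count as the same name."""
    groups = defaultdict(list)
    for name in names:
        groups[comparison_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


proposed = ["intro text", "intro_text", "summary", "intro.text"]
print(find_name_clashes(proposed))
```

Running such a check over the proposed names before submitting the approval form helps to avoid choosing a permanent name that cannot be used.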

For each section whose content has changed, there is a check-box that allows you to skip the normal process that invalidates translations. It is useful for spelling fixes, where the normal process would just waste translators' time in marking translations up-to-date without any actual changes.
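The effect of the check-box can be sketched in Python. The function below is a hypothetical model, not the extension's logic: it assumes that each section is identified by its name and that a translation becomes outdated exactly when the source text of an existing section changes and invalidation is not skipped.

```python
def outdated_units(
    old_units: dict[str, str],
    new_units: dict[str, str],
    skip_invalidation: frozenset[str] = frozenset(),
) -> list[str]:
    """Return the names of sections whose translations become outdated."""
    return sorted(
        name
        for name, text in new_units.items()
        if name in old_units              # new sections have no translations yet
        and old_units[name] != text       # the source text has changed
        and name not in skip_invalidation  # the check-box was not ticked
    )


last_marked = {"1": "Birds are animals.", "2": "Birds can fly."}
current = {
    "1": "Birds are animals.",
    "2": "Most birds can fly.",
    "3": "Some birds swim.",
}

print(outdated_units(last_marked, current))
print(outdated_units(last_marked, current, frozenset({"2"})))
```

In the second call the change to section 2 is treated like a spelling fix, so no translation is invalidated.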

At the bottom the translation page template and any changes to it are shown. All translatable sections are joined together by a translation page template. The template is the full page, where each translatable section is replaced with a placeholder.
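The relationship between the source page, the translation page template and a translation page can be sketched in Python. The placeholder format `<<unit:NAME>>` and both function names are hypothetical. The sketch also assumes that a section without a translation falls back to its source text, which the text above does not state.

```python
def build_template(source: str, units: dict[str, str]) -> str:
    """Replace each translatable section with a placeholder."""
    template = source
    for name, text in units.items():
        template = template.replace(text, f"<<unit:{name}>>", 1)
    return template


def render_translation(
    template: str, units: dict[str, str], translations: dict[str, str]
) -> str:
    """Fill the placeholders with translations for one language."""
    page = template
    for name, source_text in units.items():
        # Assumption: untranslated sections show the source text.
        page = page.replace(f"<<unit:{name}>>", translations.get(name, source_text))
    return page


source = "{{Infobox}}\nBirds are animals.\n\nBirds can fly.\n[[Category:Birds]]"
units = {"1": "Birds are animals.", "2": "Birds can fly."}
finnish = {"1": "Linnut ovat eläimiä."}

template = build_template(source, units)
print(template)
print(render_translation(template, units, finnish))
```

Everything outside the sections, here the infobox and the category, lives only in the template, which is why a template change requires all translation pages to be regenerated.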

Verify changes
Always review the changes carefully. A badly made section division and other errors can waste all existing translations for those sections if they need to be changed later on. Consider also that when the content of a section is changed, users viewing the translated page will see the old translation, which is highlighted as outdated. If this is highly undesirable, you can change the section name by editing the page content and removing the section marker. However, by doing this you will lose all the old translations. Currently there is no way to show the new content to readers while still showing the old translation to translators when they update a translation.

When everything looks ok, use the form to mark the page for translation. If the page had not been marked previously, it will appear in the usual places, including Special:LanguageStats, and the page itself will have a direct link for translating it. Users with the translate right can translate the page by translating each section individually. If there have been changes to the translation page template, all translation pages are updated to reflect the new template.

Markup
It doesn't matter whether you create new pages or mark existing pages for translation. Marking a page for translation is a two-step process. Once you have a page with content that you want to translate, you must mark the translatable parts by enclosing them within <translate>...</translate> tags. Beware, these tags work differently from other tags, because they do not go through the parser. This should not usually cause problems, but it may if you are trying something fancy. In more detail, they are parsed before any other tags like <pre> or <source>, but after <nowiki>. There can be multiple such pairs of tags in one page.

The text inside translate tags is split into translatable sections (roughly paragraphs). There must be at least one empty line between each section. It has simple whitespace handling: whitespace is preserved, except if a starting or ending tag is the only thing on a line. In that case the trailing newline character for starting tags is eaten, and similarly for the preceding newline character for the closing tag. This only means that they don't cause extra new lines in the rendered version of the page. If possible, try to put the tags on their own lines, with no empty lines between the content and the tags. Sometimes this is not possible, for example if you want to translate some content surrounded by the markup, but not the markup itself. This is fine too, for example:

It is possible to use variables similar to templates. The syntax for this is <tvar|name>contents. For translators these will show up only as $name, and they will automatically be replaced by the value in translation pages. This can be used for frequently changing non-linguistic content, without the need to wait for translators to update their translations every time it changes. You still need to mark the new version of the source page for translation, though.
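The way variables travel from the source to a translation page can be sketched in Python. This is a simplified model with hypothetical function names. It assumes that a variable is written as `<tvar|name>contents</>`, that is, closed with `</>`, and that variable names consist of letters, digits and underscores.

```python
import re

# Assumed syntax: <tvar|name>contents</>
TVAR = re.compile(r"<tvar\|(\w+)>(.*?)</>", re.DOTALL)


def extract_variables(unit: str) -> tuple[str, dict[str, str]]:
    """Return the text shown to translators and the variable values."""
    values: dict[str, str] = {}

    def to_placeholder(match: re.Match) -> str:
        values[match.group(1)] = match.group(2)
        return "$" + match.group(1)

    return TVAR.sub(to_placeholder, unit), values


def fill_variables(translation: str, values: dict[str, str]) -> str:
    """Insert the current values into a translated section."""
    # Longer names first, so that $count does not clobber $counter.
    for name in sorted(values, key=len, reverse=True):
        translation = translation.replace("$" + name, values[name])
    return translation


source_unit = "The wiki has <tvar|count>1,234</> pages."
shown, values = extract_variables(source_unit)
print(shown)
print(fill_variables("Wikissä on $count sivua.", values))
```

Because the translation only contains $count, changing the number in the source does not require the translator to touch the translation again.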

Avoiding too much mark-up in the text makes it easier for translators to translate. The page translation feature places some restrictions on the text. There should not be any mark-up that spans two or more sections. In other words, each paragraph should be standalone and complete in isolation. This is currently not enforced in the software, but violating it will cause invalid rendering of the page, with the severity depending on whether the resulting HTML is fixed by tidy or not.

It is also possible to use the tag to add a list of all translations of the page, with their completion and up-to-date percentages. Currently this is recommended, because there is no other indication that translations exist. You can use the Special:MyLanguage/Pagename syntax to redirect users to the translated page that matches their interface language. If there is no such translated version, they will see the original page. This feature has a small overhead, because browsers need to make one more request.
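The redirect behaviour of Special:MyLanguage can be modelled with a few lines of Python. The function name `resolve_my_language` is hypothetical, and the sketch assumes that a translation of Pagename lives at Pagename/language code and that the source language is English.

```python
def resolve_my_language(
    page: str,
    interface_lang: str,
    existing_pages: set[str],
    source_lang: str = "en",
) -> str:
    """Pick the page a reader should land on for their interface language."""
    candidate = f"{page}/{interface_lang}"
    if interface_lang != source_lang and candidate in existing_pages:
        return candidate
    # No matching translation: show the original page.
    return page


pages = {"Birds", "Birds/fi", "Birds/de"}
print(resolve_my_language("Birds", "fi", pages))
print(resolve_my_language("Birds", "sv", pages))
```

The lookup is an extra step between the link and the final page, which corresponds to the additional request mentioned above.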

Translators
The page translation feature is usually restricted to one user group. Consult your local wiki to find out how to become a translator. Once you are an approved translator, you can start translating. There are multiple ways to translate a page. If you are viewing a page that is marked for translation, you will see a link at the top of the content. That link will take you to the translation page. The target language is either your MediaWiki interface language or, if you came from a translated version of a page, the language of the page you were viewing. Another good place to find translatable pages is Special:LanguageStats, where you can see which pages are available for translation. On that page you also get an overview of which pages still need translation work.

The actual translation process does not differ from the usual one, about which you can read more in the Translate extension user documentation. The following paragraphs highlight some mark-up issues that are mostly specific to page translation. They also suggest best practices for handling them.

Apart from the usual issues in translation, you should pay attention to links. There are three kinds of links: links within the page, links to other pages and links to external sites.


 * Internal links
 * Internal links are of type  and refer to a header somewhere else on the same page.   should be exactly the same as the translation of the header, which may be in a different translatable section. Note that whitespace between the header tags and the actual header is removed. There is a feature request to make it easier to keep the link and the header translation matching.


 * Links to other pages
 * Links to other pages are of type . For now we suggest linking to the source version (usually English) of the page, even if the target page can be translated too. There is a feature request to go automatically to the translated version if it is available, and to show some other language if not.


 * Links to external sites
 * Links to external sites are of the form . Consult your own project's guidelines as to whether you should use a translated version of the resource if available, or not.

Note that you should always translate the link text, and usually you should add one if there isn't one already. Please consult your own project's guidelines for more detailed rules, if any.

Aside from links there may be other markup, like templates. Again, consult your own project's guidelines for how to handle those.