Content translation/Documentation/FAQ

What is the Content Translation tool?
It's a tool that helps editors create a new article based on a corresponding article about the same topic in a different language.

What is CX?
"CX" is an abbreviation for "ContentTranslation". It could be more intuitive to call it "CT", but this is already used for CategoryTree.

How does the Content Translation tool differ from Translate?
The Translate extension was initially built with a focus on translating software user interface messages for MediaWiki and other programs. It can also translate MediaWiki pages, but experience shows that it is not practical for translating articles of the kind found in Wikipedia, Wikivoyage or similar sites: it requires adding markup to the source article to prepare it for translation, and it can mess things up if the source article changes drastically, as often happens in Wikipedia. This works fairly well for documentation on mediawiki.org, Meta and many other sites, but it doesn't scale for Wikipedia.

How can I access the tool?
You can find a very early testing version in Wikimedia Labs.

Is it available for all users of a wiki?
It will initially be available to logged-in users who enable it as a beta feature. The intention is to offer it to all users eventually.

Can it be used only to translate articles?
The focus for initial development is articles in Wikipedia and possibly Wikivoyage. It may later be extended to articles in the style of other sites.

Will there be special features to insert links and references from the original article?
Links will be automatically inserted using interlanguage links.

The software will make an effort to adapt references as far as this is possible between languages. This may be challenging to do fully automatically, given that different languages use different citation formats.

Will Content Translation use information from Wikidata?
Yes.

The earliest release will use interlanguage links from Wikidata to auto-fill the links in the translated article. There are plans to use labels and aliases soon afterwards.

It is likely that as templates in different Wikipedias use more data from Wikidata, ContentTranslation will simply pick it up.

What are the translation aids that will be made available?
The current plan is:
 * Dictionaries: translation and definitions of words.
 * Link adaptation: Links will be adapted automatically when they are available as interlanguage links in the target language. It will be possible to perform basic manipulation on them: removing them and picking them from other sources.
 * Translation memory: This is similar to what is used in the Translate extension.
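
The link adaptation idea described in the list above can be illustrated with a minimal sketch. It is not the ContentTranslation implementation: the `SITELINKS` table, the `adapt_links` function and the choice to reduce unmatched links to plain text are all assumptions made for the example. In practice the title mapping would come from interlanguage links.

```python
import re

# Hypothetical title mapping (source-language title -> target-language title).
# This table is an assumption for the example; real data would come from
# interlanguage links.
SITELINKS = {
    "Moon": "Luna",
    "Earth": "Tierra",
}

# Matches [[Title]] and [[Title|label]] wikilinks.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def adapt_links(wikitext, sitelinks):
    """Point each wikilink at its target-language title when one is known.

    Links with no known target are reduced to their label text. That
    fallback is an assumption of this sketch, not documented behaviour.
    """
    def replace(match):
        title = match.group(1).strip()
        label = match.group(2) if match.group(2) is not None else title
        target = sitelinks.get(title)
        if target is None:
            return label
        return f"[[{target}|{label}]]"

    return WIKILINK.sub(replace, wikitext)
```

Calling `adapt_links("The [[Moon]] orbits [[Earth|our planet]].", SITELINKS)` keeps the visible label of each link and changes only its target, which leaves the label for the translator to edit.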

How are you integrating machine translation?
For languages for which machine translation is available, the machine translation will be auto-filled upon clicking a paragraph in the translation area.

Initially we shall most likely use the Apertium engine, which is Free Software and can be installed and maintained on our own servers. At a later point we may use Moses and other engines.

Can the machine translated content be edited manually?
Yes.

We treat machine translation only as a tool that may help a human translator be faster. Publishing machine-translated articles is not the intention of ContentTranslation.

Will there be a feature to prevent bulk publishing of unedited machine translated text?
Yes!

We take language quality seriously. Machine translation is only a tool that helps the translator be more efficient, and the developers understand well that all translations must be edited by a human. The translation interface will show a warning if the translator tries to publish an article that contains only machine translation. The developers will work with the editing communities to adjust this.
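
One way such a warning could be decided is sketched below. This is only an illustration under stated assumptions: the function names, the per-paragraph comparison and the 0.95 threshold are hypothetical, and the actual check in ContentTranslation may work differently. The sketch uses Python's standard `difflib` to measure how much each published paragraph still resembles the raw machine translation.

```python
from difflib import SequenceMatcher


def unmodified_ratio(machine_text, final_text):
    """Similarity between raw machine output and the text to be published (0.0 to 1.0)."""
    return SequenceMatcher(None, machine_text, final_text).ratio()


def should_warn(paragraphs, threshold=0.95):
    """Return True when every paragraph is essentially unedited machine translation.

    `paragraphs` is a list of (machine_translation, final_text) pairs.
    The threshold value is an assumption chosen for the example.
    """
    if not paragraphs:
        return False
    return all(
        unmodified_ratio(machine, final) >= threshold
        for machine, final in paragraphs
    )
```

With this rule, a translation in which at least one paragraph was substantially rewritten does not trigger the warning. A stricter policy could require a minimum share of edited paragraphs instead.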

What dictionaries will be available?
The dictionaries will initially be taken from the free dictionaries of the FreeDict project. Later, other dictionaries may be added, such as Wiktionary, OmegaWiki, terminology collections, and possibly other open sites.

Can I copy images over from the source article?
Yes, this will be possible.

How will templates be handled? How are you handling infoboxes?
Initially, templates will simply be blacklisted: they will be shown in the source column of the translation interface, but not copied to the target column. Many templates are project-specific, so it won't be possible to handle their translation at all. Some simple templates that have no parameters and do have a corresponding template in the target language will have

Will you provide suggestions from Translation Memory?
Yes.

We shall probably use the same technology that is used in the Translate extension.

The data for translation memory will have to be filled from some initial translations, so it may take a while from the time that translation memory is enabled for ContentTranslation until it becomes useful.
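
The following minimal sketch shows why a translation memory is only useful once it has been filled. It is not the technology used by the Translate extension: the `TranslationMemory` class, its fuzzy matching based on Python's standard `difflib`, and the 0.7 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy in-memory translation memory with fuzzy lookup (hypothetical design)."""

    def __init__(self):
        # Each entry is a (source_segment, translation) pair.
        self._entries = []

    def add(self, source, translation):
        """Store a finished translation so it can be suggested later."""
        self._entries.append((source, translation))

    def suggest(self, segment, min_score=0.7):
        """Return stored translations whose source resembles `segment`, best match first."""
        scored = []
        for source, translation in self._entries:
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score >= min_score:
                scored.append((score, translation))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [translation for _, translation in scored]
```

An empty memory returns no suggestions for any segment. Only after earlier translations have been added with `add` can `suggest` offer a match for a similar sentence.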

Can I set up the Content Translation extension on my local wiki?
Yes.

Just install the extension and follow the configuration guide. The default configuration has a bias for Wikipedia, so be sure to set it up correctly for your wiki.

What is cxserver?
ContentTranslation by definition works with multiple wikis and needs to synchronize information between them, so it uses an additional component called "ContentTranslation server", or "cxserver" for short, to facilitate that. It also optimizes much of the connection to translation tools, such as dictionaries, machine translation and link adaptation.