Help:Content translation/Translating/Initial machine translation

From MediaWiki.org

When you add a new paragraph to the translation, you can start from scratch or use an automatic translation as a starting point. When available, machine translation is used by default as the initial translation. The different options, the details about their availability, and the considerations when using machine translation are described below.

Options for the initial translation

The "initial translation" options in the tools column allow you to decide the initial content to use as a starting point for each paragraph. The options available are the following:

  • Use a machine translation service. This allows you to start with an automatically translated version of the original paragraph. The number and names of these options will vary. Options such as "Use Apertium" or "Use Yandex" will be available depending on the languages these services support (more on this in the next section).
  • Copy original content. The original paragraph will be copied over into the translation. Although the content will remain in the original language, some elements are adapted to the target wiki. For example, links will point to the corresponding article in the target language, and templates will be converted into the equivalent ones. Translators still have to rewrite the content completely, but the adapted elements may be easier to reuse.
  • Start with an empty paragraph. Starting with an empty paragraph can be useful in cases where the alternative content requires more work than just typing it.

You can quickly switch between the different approaches independently for each paragraph, since each one may work best for different kinds of content. Switching between approaches preserves the changes you made to the paragraph. This way, you can try a different approach even after you have started editing, without fear of losing your changes if you decide to go back to the original approach. Two additional options are relevant in this context:

  • The reset translation option is available once you have modified the initial content provided. It lets you restore the initial content by discarding the changes you made.
  • The mark as default option is available when you select an approach for a paragraph that is not the default. It lets you set the default approach for the next paragraphs that are added to the translation. This can be very convenient if you find that a particular translation service generally works better than the default one.


Machine translation availability

Content translation integrates several translation services, and each service supports a different set of languages. The services supported are listed below with a link to the list of languages they support:

The lists of languages above point to the configuration code, so that the information stays in sync with the way the tool currently works. Each list shows the language code for the source language at the initial indentation level, with the codes of all the supported target languages below it.
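As a rough illustration of how such a list can be read, the sketch below parses a small indented listing into a mapping from source language to target languages. The sample text, its exact syntax, and the helper names `parse_pairs` and `is_pair_supported` are assumptions made for this example; they are not taken from the tool's actual configuration.

```python
# Hypothetical sample in the layout described above: a source language code
# at the initial indentation level, its supported target codes below it.
# The pairs listed here are made up for illustration.
SAMPLE = """\
es:
  - ca
  - gl
en:
  - es
"""


def parse_pairs(text):
    """Parse the indented listing into {source_code: [target_codes]}."""
    pairs = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if not raw[0].isspace():
            # Unindented line: a new source language.
            current = raw.strip().rstrip(":")
            pairs[current] = []
        else:
            # Indented line: a target language for the current source.
            if current is None:
                raise ValueError("target language listed before any source")
            pairs[current].append(raw.strip().lstrip("-").strip())
    return pairs


def is_pair_supported(pairs, source, target):
    """Return True if `target` is listed under `source`."""
    return target in pairs.get(source, [])


pairs = parse_pairs(SAMPLE)
print(is_pair_supported(pairs, "es", "ca"))
print(is_pair_supported(pairs, "ca", "es"))
```

Note that the listing is directional: in this sample, a pair listed from "es" to "ca" says nothing about support from "ca" to "es".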

Languages are enabled gradually, based on the observed results and community feedback. Machine translation may therefore not be enabled yet for some languages, even if they are supported by the underlying services.

Expanding the language support

Content translation has been designed as an extensible platform, so it is possible to develop new clients that integrate additional translation services. Some considerations about the way translation services are integrated:

  • Machine translations and the corrections users make to them are publicly available as part of the data on published translations, which can provide a useful resource to create or improve your translation service.
  • External services integrated only receive publicly available wiki content, and return a translated version of such content that is compatible with the licenses used in the wiki. No personal information is shared with the translation services.

Feedback on the support provided for each language is very useful. Please let us know if support for a language is missing, or if higher-quality options are available for it. You can provide such feedback on the project talk page or in this ticket.

Considerations on machine translation

Machine translation is far from perfect when intended as a final outcome. However, many users find it very useful as a starting point. Please make sure to review the content from these different perspectives:

  • Make sure the original meaning is preserved.
  • Check that there is no information missing, especially for elements such as links, references and templates that include information that is not always visible on the surface.
  • Read the translated content to make sure it reads naturally as an independent page.

Limitations with complex elements

In some cases the content may not appear in the translation as expected:

  • Some of the services supported only work with plain text. This means that formatting and rich content elements such as links and citations from the original article are lost in the translation, and Content translation needs to guess where those belong in the translated text. Re-adding those elements is not always perfect and some elements may be in the wrong position or applied to the wrong part of the text.
  • Complex elements such as references or templates may use a different structure in each language, which makes it hard to transfer the content from one language into the other. Review the contents inside those elements to check that no important information is missing.
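To see why re-adding links after a plain-text translation is a guess, consider the naive sketch below. It is not the algorithm Content translation uses; it is one simple approach, assumed here for illustration only. The `translate` function is a toy stand-in for a plain-text translation service, backed by a made-up phrase table, and `reapply_links` is a hypothetical helper.

```python
def translate(text):
    """Toy stand-in for a plain-text machine translation service.

    The phrase table is invented for this example. Note that a phrase
    translated on its own may differ from its translation in context.
    """
    table = {
        "The moon orbits the Earth.": "La luna orbita la Tierra.",
        "the Earth": "la Tierra",
        "orbits": "gira alrededor de",
    }
    return table[text]


def reapply_links(source, links):
    """Translate `source` as plain text, then guess where each link belongs.

    `links` is a list of (linked_text, target) pairs from the source
    paragraph. Returns the translated text and a list of (span, target),
    where span is (start, end) in the translated text, or None when the
    link could not be located.
    """
    translated = translate(source)
    placed = []
    for linked_text, target in links:
        # Translate the linked text alone and look for it in the full
        # translation. This fails whenever the two translations differ.
        candidate = translate(linked_text)
        start = translated.find(candidate)
        if start == -1:
            placed.append((None, target))
        else:
            placed.append(((start, start + len(candidate)), target))
    return translated, placed


text, placed = reapply_links(
    "The moon orbits the Earth.",
    [("the Earth", "Tierra"), ("orbits", "Órbita")],
)
for span, target in placed:
    where = text[span[0]:span[1]] if span else "(not found)"
    print(target, "->", where)
```

In this toy run the first link is located, while the second is lost because the word was translated differently in isolation than in the full sentence. This is the kind of mismatch that makes a manual check of links and references worthwhile.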

Enforcing the review of machine translation

Several automatic mechanisms exist to enforce the review of the initial contents. In this way, the tool makes sure that the initial automatic translation has been sufficiently reviewed before the contents are published.