Content translation/Machine Translation

From MediaWiki.org

Content Translation is a tool for creating articles by translating them from other languages. The translation must be written by an editor, but for some languages machine translation is available as an additional tool.

Machine translation is not developed by the developers of Content Translation. All the machine translation engines are developed by other companies or organizations, and they are connected to Content Translation as external tools.

Machine translation can be disabled and enabled by each translator according to their preference. If machine translation is used, it is the translator's responsibility to edit the text and to make it grammatically and factually correct.

Machine Translation systems in use

  • LingoCloud - LingoCloud is provided by Beijing ColorfulClouds Technology Co., Ltd, an internet services company from China. LingoCloud can be used within Content Translation to translate Wikipedia articles via a publicly available API, without compromising Wikipedia's policies on attribution of rights, the privacy of our users, or brand representation. Read more to know about this service.
  • Matxin - Matxin is provided by a team from the University of the Basque Country, in collaboration with the Elhuyar Foundation. The team created this open source machine translation system primarily for their own use in writing Wikipedia articles in Basque and Spanish. Matxin can be used within Content Translation to translate Wikipedia articles via a publicly available API, without compromising Wikipedia's policies on attribution of rights, the privacy of our users, or brand representation. Read more to know about this service.
  • Yandex - Yandex.Translate is provided by Yandex, a Russian internet services company. Yandex.Translate can be used within Content Translation to translate Wikipedia articles via a publicly available API, without compromising Wikipedia's policies on attribution of rights, the privacy of our users, or brand representation. Read more to know about this service.
  • Youdao - Youdao is provided by NetEase, an internet services company from China. Youdao can be used within Content Translation to translate Wikipedia articles via a publicly available API, without compromising Wikipedia's policies on attribution of rights, the privacy of our users, or brand representation. Read more to know about this service.

Extending machine translation support in Content Translation

Machine translation support is an important part of the workflow of the Content Translation tool. Even with limited language coverage, machine translation has been widely adopted, and on several occasions users have reported increased efficiency. Initially, Apertium was the only machine translation (MT) system available in Content Translation, serving more than 30 languages. From 4 November 2015, the Yandex machine translation system was also made available, initially for users of the Russian Wikipedia and later extended to Armenian, Albanian, Bashkir, Persian, Polish, and Uzbek.

Both Apertium and Yandex have been steadily adding support for more languages, which increases the efficiency gains that the Content Translation tool can offer. In turn, Content Translation incorporates these changes so that more languages can benefit from the integrated machine translation services. The following sections outline the general process that the WMF Language team follows to update Content Translation settings and provide machine translation for languages as they become available. This is an ongoing process and is open to change according to the individual needs of each wiki where the tool is used.

Multi-step enablement and feedback process

Machine Translation is an optional feature in Content Translation. Users can choose between the available machine translation systems, or disable machine translation entirely, using the dropdown selection box in Content Translation.

Figure: Machine translation selection dropdown menu inside Content Translation

  • Enabling machine translation for a language: The machine translation system available for a particular language or language pair is enabled as a non-default option. Users can choose it through the drop-down menu on the interface (see image).
  • Notification: Content Translation users are notified about newly added machine translation support for the language they are translating to. They can choose to try it using the drop-down menu. As part of the regular updates from the development team, this change is also communicated to the local village pump of the wiki for that language.
  • Feedback and monitoring: After machine translation is enabled, the development team gathers feedback via the usual communication channels (talk page, Phabricator, etc.) and collects relevant data to better assess users' needs and identify any anomalies or bugs.
  • Advancement or reassessment: Depending on usage trends, benefits, bugs, and other issues or concerns, the next step is to enable the MT system as the default option for a language, so that users do not have to take an extra step to select the MT system every time they start translating. The option not to use MT remains available to users. However, if there are deficiencies or bugs that prevent normal usage of Content Translation, the service is re-examined or suspended according to the individual needs of the wiki.
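
The stages above can be pictured as a small state model. The following Python sketch is purely illustrative and is not how Content Translation is implemented or configured: the class, method, and stage names are hypothetical, and the language pair in the usage note is only an example. It shows an engine being offered as a non-default option, promoted to default, or suspended, with the translator's own choice (including turning MT off) always taking precedence.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Hypothetical rollout stages for one engine on one language pair."""
    OPTIONAL = "optional"    # enabled as a non-default option
    DEFAULT = "default"      # pre-selected when a translation starts
    SUSPENDED = "suspended"  # withdrawn after reassessment


@dataclass
class MTRegistry:
    # Maps (source language, target language, engine name) to a Stage.
    stages: dict = field(default_factory=dict)

    def enable(self, source, target, engine):
        """Offer the engine as a non-default choice for the pair."""
        self.stages[(source, target, engine)] = Stage.OPTIONAL

    def promote(self, source, target, engine):
        """Make an already optional engine the default for the pair."""
        key = (source, target, engine)
        if self.stages.get(key) is not Stage.OPTIONAL:
            raise ValueError("only an optional engine can be promoted")
        # Keep at most one default per pair: demote any existing default.
        for other, stage in list(self.stages.items()):
            if other[:2] == (source, target) and stage is Stage.DEFAULT:
                self.stages[other] = Stage.OPTIONAL
        self.stages[key] = Stage.DEFAULT

    def suspend(self, source, target, engine):
        """Withdraw the engine for the pair, e.g. after reported bugs."""
        self.stages[(source, target, engine)] = Stage.SUSPENDED

    def choices(self, source, target):
        """Engines a translator could pick for the pair, sorted by name."""
        return sorted(
            engine
            for (s, t, engine), stage in self.stages.items()
            if (s, t) == (source, target)
            and stage in (Stage.OPTIONAL, Stage.DEFAULT)
        )

    def initial_selection(self, source, target,
                          user_choice=None, mt_disabled=False):
        """Engine used when a translation starts; None means no MT."""
        if mt_disabled:
            return None  # the translator has turned MT off
        if user_choice in self.choices(source, target):
            return user_choice  # an explicit choice beats the default
        for (s, t, engine), stage in self.stages.items():
            if (s, t) == (source, target) and stage is Stage.DEFAULT:
                return engine
        return None
```

For example, after `reg = MTRegistry()` and `reg.enable("en", "hy", "Yandex")`, the engine appears in `reg.choices("en", "hy")` but `reg.initial_selection("en", "hy")` is still `None` until `reg.promote("en", "hy", "Yandex")` is called. Passing `mt_disabled=True` returns `None` at every stage, mirroring the rule that translators can always opt out.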

More information