Content translation/Machine Translation/Google Translate/ru

Machine translation support for Content Translation has now been extended further. In addition to platforms such as Apertium, LingoCloud, Matxin, Yandex and Youdao, we are adding Google Translate to the list of machine translation systems available to users for translating text. This adds machine translation support for more than 100 languages, including new languages that were previously unavailable in the existing translation systems.

Google Translate is developed by Google, a multinational company headquartered in the United States. Several teams at the Wikimedia Foundation and Google worked together to draft an agreement that allows Google Translate to be used without compromising Wikimedia's principles on attribution of rights, user data privacy, and brand representation. ''You can review the details of the agreement. We welcome any questions you may have about this service.''

Key features

 * No personal data will be available to Google Translate. Access to the machine translation system takes place through Google's cloud services. Article content (openly licensed) is sent from Wikimedia servers to Google servers. The user has no direct contact with third-party services, and personal information (IP address, username) is not sent to Google servers. The mechanism used to connect to Google's servers is publicly available. You can view the code here. No part of the Google service or its code will become part of Wikimedia's infrastructure or the Content Translation codebase. For more detail, see the technical architecture diagram at the end of the section.
 * Content received from Google Translate is covered by a free license. When Google Translate is used, the translated version of Wikimedia article content is covered by a free license. Users can modify and publish it as part of Wikipedia without conflicting with the site's terms of use. Content obtained through Google Translate, along with the user's edits, will be available under the same license that applies to the other articles on Wikipedia.
 * It benefits the wider translation community. Translations obtained through Google Translate, along with the versions edited by users, will be publicly available. Machine translations edited by humans are of particular interest to the translation research community. The community can use this resource to build new translation services that support languages for which open source machine translation is not yet available. This will help developers create and improve machine translation systems.
 * Users can disable it. Automatic translation is an optional tool in Content Translation. Users can disable it if they do not find it useful. Although many Content Translation users have requested this translation service, each individual user decides whether to use it.

Google's obligations

 * To provide use of their Translation API key at no cost to the Wikimedia Foundation, so that volunteers on Wikimedia sites can translate articles, for as many characters as needed.

Wikimedia Foundation's obligations

 * To provide the volunteer-edited versions of the text translated by the translation tool so that Google can improve their tool
 * No personal data of translators will be shared.
 * Only the original content to translate, its language, and the target language of the translation will be sent in the request to Google.
 * The translations published by translators, with or without the help of machine translation services, will be provided in the form of parallel corpora by the Content Translation APIs. These APIs will be developed incrementally and results will be freely available for everyone, not just Google.
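
As an illustration of the commitments above, the following Python sketch builds an outgoing request that carries only the content, its language, and the target language. The names `TranslationRequest` and `build_outgoing_payload`, the field names, and the `user_context` argument are hypothetical and are not taken from the Content Translation code; the sketch only shows the principle that user details never enter the payload.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class TranslationRequest:
    """Hypothetical request shape: only the three fields named in the agreement summary."""
    content: str
    source_language: str
    target_language: str


def build_outgoing_payload(
    section_html: str,
    source_language: str,
    target_language: str,
    user_context: Optional[dict] = None,
) -> dict:
    """Return the data that would leave the server.

    `user_context` (assumed to hold things like a username or IP address)
    is accepted only to show that it is deliberately never copied into
    the payload.
    """
    request = TranslationRequest(
        content=section_html,
        source_language=source_language,
        target_language=target_language,
    )
    return asdict(request)


payload = build_outgoing_payload(
    "<p>Hello, world.</p>",
    "en",
    "ru",
    user_context={"username": "ExampleUser", "ip": "192.0.2.1"},
)
```

Because the payload is built from a fixed, closed set of fields, adding new user-related data to the calling code cannot leak it into the request by accident.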

Important notes

 * All content will remain licensed under CC BY-SA 3.0
 * Google is not requiring any "branding" on Wikimedia Sites outside of listing Google Translate as a translation tool option in the translation interface drop-down menu
 * There is no exchange of personal information of volunteers
 * The agreement is limited to 1 year, at which time we can reevaluate our needs
 * We are free to terminate the agreement for any reason, at any time
 * Agreement is governed by U.S. law

Questions about this service
We have addressed some immediate questions about Google in this section. These answers are also available on the Content Translation FAQ page.

What languages are being handled by Google Translate? Are there plans to add more?
Google Translate can be used to translate into all available languages except English, for which machine translation of any kind is currently not enabled.

How is using Google Translate different from using Apertium?
As a user of Content Translation, you will not notice any difference in the translation interface, because Google Translate displays the translated content in the same way Apertium currently does for its supported language pairs.

How is machine translation being done if I choose Google Translate?
Google Translate provides an API key that allows websites and other services to use its translation system. Content Translation uses such an API key to access Google Cloud services for Google Translate. When a user starts translating an article, the HTML content of each section of the source article is sent to Google Translate, and the translated version is obtained and displayed in the corresponding translation column of Content Translation. Links and references are adapted as usual, and users can modify the content as required.

This process continues for all the sections of the article being translated. For better performance, the translations for consecutive sections are pre-fetched. The user can save the unpublished translation (to work on it again at a later time) or publish the article in the usual manner. The article is published on Wikipedia like any other normal article with appropriate attribution and licenses.
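
The section-by-section flow with pre-fetching described above can be sketched roughly as follows. This is a minimal illustration, not the actual Content Translation implementation: `SectionTranslator`, its `lookahead` parameter, and the `translate` callback are hypothetical names, the callback stands in for the server-side call to the translation service, and the pre-fetch is done synchronously for simplicity where a real system would likely do it in the background.

```python
from typing import Callable, Dict, List


class SectionTranslator:
    """Translate article sections on demand and pre-fetch the next ones."""

    def __init__(
        self,
        sections: List[str],
        translate: Callable[[str], str],
        lookahead: int = 1,
    ) -> None:
        self._sections = sections
        self._translate = translate  # hypothetical stand-in for the MT service call
        self._lookahead = lookahead
        self._cache: Dict[int, str] = {}

    def _fetch(self, index: int) -> None:
        # Translate a section once and remember the result.
        if 0 <= index < len(self._sections) and index not in self._cache:
            self._cache[index] = self._translate(self._sections[index])

    def get(self, index: int) -> str:
        """Return the translation of one section, pre-fetching the following ones."""
        if not 0 <= index < len(self._sections):
            raise IndexError("no such section")
        self._fetch(index)
        for offset in range(1, self._lookahead + 1):
            self._fetch(index + offset)
        return self._cache[index]


# Usage with a fake translator that records which sections were requested.
requested: List[str] = []


def fake_translate(section_html: str) -> str:
    requested.append(section_html)
    return section_html.upper()


translator = SectionTranslator(["<p>a</p>", "<p>b</p>", "<p>c</p>"], fake_translate)
first = translator.get(0)   # translates section 0 and pre-fetches section 1
second = translator.get(1)  # served from the cache; pre-fetches section 2
```

The cache means that by the time the user moves on to the next section, its machine translation has already been requested.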

Here’s a diagram of the process.

Google Translate is not based on open source software. Why are we using it?
Content Translation evolved from a long-standing need to bridge the gap in the amount of content between Wikipedias in different languages. Like all other software used on Wikimedia sites, Content Translation is also open source. In this particular case as well, we are using an open source client to interact with the external service and import freely licensed content in order to help users expand our free knowledge.

To use Google Translate we are not adding any proprietary software in the Content Translation code, or on the Wikimedia websites and servers. The service is free of charge as part of Google’s offering to the Wikimedia Foundation.

Only the freely available Wikipedia article content (in segments) is sent to Google Translate, and the translated content obtained is freely usable on Wikipedia pages. The translated content can be modified by users, and this data is also publicly available under a free license through the Content Translation API. This is a valuable resource that the community can use to develop open source translation services for languages where they do not exist yet.

After studying the implications carefully, we found that the fact that the content was previously stored in a closed source service does not limit the freedom of our knowledge or our software, now or in the future. We have taken special care to ensure that the content provided is freely licensed and complies with Wikipedia policies. This involved a long process of legal and technical evaluation and compliance. The summary of our agreement is also available above.

From user feedback, we have seen that machine translation support is really helpful for users, and we want to support all languages in the best way. Guided by the principles of the Wikimedia Foundation's resolution to support free and open source software, we will prioritize the integration of open source services whenever they are available for a language. Apertium has been a critical part of Content Translation since its inception, but currently it provides machine translation for only about 30 of the numerous possible language combinations that Wikipedia can support.

Should I be worried about my personal information when using Google Translate?
Irrespective of the service being used, you can be sure that only Wikipedia content from existing articles is sent and only freely licensed content is added back to the translation. No personal information is sent, and communication with those services happens on the server side, so they are isolated from the user's device. Please refer to this diagram for more details.

What if Google Translate is the only machine translation tool available and I don't want to use it?
Machine Translation is an optional feature in Content Translation that you can easily disable at will. If more machine translation systems are added for your languages, you can choose to enable MT again and select the MT service of your choice.

Will the content translated by Google Translate be free for use in Wikipedia?
Yes. The content received from Google Translate is otherwise freely available on the web translation platform. Content Translation receives it via an API key to make it seamlessly available on the translation interface. This content can be modified by the users (if necessary) and used in Wikipedia articles under free licenses.

Can this content be used for improving machine translation systems in general?
Yes. Translations made in Content Translation are saved in our database. This information will be made publicly available for anyone to use as translation examples to improve their translation services (from university research groups and open source projects to commercial companies, anyone!). The content can be accessed via the Content Translation API. Please note that only information related to the translated text is publicly available. This includes the source and translated text, the source and target language information, and an identifier for the segment of text.
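
To make the shape of this public data concrete, here is a small Python sketch of one parallel-corpus entry holding the fields listed above. The class name `ParallelSegment`, the field names, the sample values, and the JSON serialization are assumptions made for illustration; the actual format returned by the Content Translation API may differ.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ParallelSegment:
    """Hypothetical record for one translated segment; no user data is included."""
    segment_id: str
    source_language: str
    target_language: str
    source_text: str
    translated_text: str


def to_json_line(segment: ParallelSegment) -> str:
    # One JSON object per line is a common layout for parallel corpora.
    return json.dumps(asdict(segment), ensure_ascii=False, sort_keys=True)


example = ParallelSegment(
    segment_id="example-1",  # made-up identifier
    source_language="en",
    target_language="ru",
    source_text="The Moon orbits the Earth.",
    translated_text="Луна обращается вокруг Земли.",
)
line = to_json_line(example)
```

A collection of such lines pairs each source segment with its human-reviewed translation, which is the form of data that machine translation systems are typically trained and evaluated on.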