Content translation/Machine Translation/MinT/ja

MinT (Machine in Translation) is a translation service based on open source neural machine translation models. The service is hosted in the Wikimedia Foundation infrastructure and will be part of the list of machine translation (MT) systems available for users of Content Translation and other Wikimedia projects. The translations provided are based on NLLB-200 and OPUS translation models, which have been optimized for performance using the OpenNMT CTranslate2 library in order to avoid the need for GPU acceleration. For more details, you can check the source code, the API spec, and a test instance.



Key features

 * No nonpublic personal information of users is sent to MinT. The MT system is accessed via an API. Only article content (freely licensed) is sent to the MinT server: there is no direct communication between the user and external services, and no nonpublic personal information of users (IP, username) reaches the MinT service. The client contacting MinT is open source and you can check it here. Although the MinT service is hosted in Wikimedia infrastructure, the integration follows the same pattern as other external services (please also see a diagram of this technical setup at the end of the section).
 * Any copyrightable information is returned from MinT under a free license. When MinT is used, a translated version of Wikipedia content is obtained. The copyrightability of such machine-generated content is an open legal question. To the extent that MinT translations are copyrightable, these translations are available under the same free license as the Wikipedia content being translated. Users can modify it and publish it as part of Wikipedia without conflicts with existing policies. The resulting content translated by MinT and the user modifications will be available under the same license that is used for the rest of the articles in Wikipedia.
 * Benefits the wider open source translation community. Translations obtained from MinT and user modifications will be publicly available. The post-edited translations are of special interest for the translation research community who can use this resource to create new translation services to support languages for which open source machine translation is not available yet. This will help developers create and improve machine translation systems.
 * Users can disable it. Automatic translation is an optional tool in Content Translation. Users can disable it if they don't find it useful. Although many Content Translation users have requested translation services, each individual user ultimately decides whether to use them or not.

[Diagram of the technical setup]



Questions about the service
We have addressed some immediate questions about MinT in this section. This is also available in the Content Translation FAQ page.



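Which languages can be used with MinT, and are more planned?
MinT is designed to expose multiple translation models, each of which supports a different set of languages. The list of available machine translation (MT) systems carries the up-to-date language coverage.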
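
Because each model covers its own set of languages, the overall coverage is the union of what the individual models support, and a given language pair may be served by more than one model. The following is a minimal sketch of that idea, not MinT's actual implementation: the model names come from this page, while the language pairs, the priority order, and the function names are placeholders assumed for illustration.

```python
from typing import Optional, Set, Tuple

LanguagePair = Tuple[str, str]

# Hypothetical model table. The language pairs are placeholders and do not
# reflect the real coverage of either model.
MODEL_LANGUAGE_PAIRS = {
    "nllb-200": {("en", "ig"), ("en", "is"), ("fr", "oc")},
    "opus": {("en", "is"), ("en", "bcl")},
}

# Assumed preference order, used when several models cover the same pair.
MODEL_PRIORITY = ["opus", "nllb-200"]


def select_model(source: str, target: str) -> Optional[str]:
    """Return the first preferred model that supports source -> target."""
    for model in MODEL_PRIORITY:
        if (source, target) in MODEL_LANGUAGE_PAIRS.get(model, set()):
            return model
    return None


def supported_pairs() -> Set[LanguagePair]:
    """Return the union of the language pairs covered by any model."""
    pairs: Set[LanguagePair] = set()
    for model_pairs in MODEL_LANGUAGE_PAIRS.values():
        pairs |= model_pairs
    return pairs
```

With this placeholder table, `select_model("en", "is")` picks the higher-priority model that covers the pair, and a pair that no model covers yields `None`.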



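How is using MinT different from other machine translation systems?
As a Content Translation user, you may not notice much difference in the translation interface: for each supported language pair, MinT presents translated content in the same way as Apertium and other services. Translation quality differs between services depending on the languages involved and the nature of the content. Users are therefore encouraged to switch between the available services to find the one that produces the best translation for a particular paragraph.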



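How does the machine translation process work with MinT?
When a user starts translating an article, the HTML content of each section of the source article is passed to MinT. The MinT service processes the request and selects one of its translation models based on the supported languages and the configuration. Once the translated version is obtained, it is displayed in the translation column of Content Translation. Links and references are adapted as usual, and users modify the content as needed.

This process continues for all sections of the article being translated. To improve performance, consecutive sections are pre-fetched in advance. Users can publish the article as usual or save the unpublished translation (to continue later). The article is published on Wikipedia with the appropriate attribution and license, like any other article.

Please refer to this diagram that explains the process.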
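
The section-by-section flow with pre-fetching described above can be sketched as follows. This is only an illustration under stated assumptions: `fake_translate` is a stand-in for the real MT request, and `SectionTranslator`, `PREFETCH_AHEAD`, and the other names are hypothetical rather than taken from the Content Translation code.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List

PREFETCH_AHEAD = 2  # assumed number of sections requested ahead of the user


def fake_translate(section_html: str) -> str:
    """Stand-in for the MT request; a real client would call the service."""
    return "<!-- mt -->" + section_html


class SectionTranslator:
    """Translate article sections one by one, pre-fetching the next ones."""

    def __init__(self, sections: List[str],
                 translate: Callable[[str], str] = fake_translate) -> None:
        self.sections = sections
        self.translate = translate
        self.pool = ThreadPoolExecutor(max_workers=PREFETCH_AHEAD + 1)
        self.pending: Dict[int, Future] = {}

    def _request(self, index: int) -> None:
        # Submit a section only once, and only if it exists.
        if 0 <= index < len(self.sections) and index not in self.pending:
            self.pending[index] = self.pool.submit(
                self.translate, self.sections[index])

    def get(self, index: int) -> str:
        """Return one translated section and start fetching the next ones."""
        if not 0 <= index < len(self.sections):
            raise IndexError("no such section: %d" % index)
        for i in range(index, index + PREFETCH_AHEAD + 1):
            self._request(i)
        return self.pending[index].result()

    def close(self) -> None:
        """Wait for outstanding requests and release the worker threads."""
        self.pool.shutdown(wait=True)
```

Calling `get(0)` on a four-section article returns the first translated section and leaves requests for sections 1 and 2 already in flight, so they are likely to be ready by the time the user reaches them.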



Is MinT based on open source?
The MinT service is open source, and it integrates the following models, which are also available as open source.


 * The AI research team at Meta released the translation models used by NLLB-200 with an open source license and the dataset used for training as part of the No Language Left Behind project.
 * The OPUS project provides pre-trained neural translation models trained on OPUS data with an open source license.

These models are optimized for performance using the OpenNMT CTranslate2 library, which is also open source.

Content Translation evolved from a long-standing need to bridge the gap in the amount of content between Wikipedias in different languages. Like all other software used on Wikimedia sites, Content Translation is also open source. In this particular case as well, we are using an open source client to interact with the external service and import freely licensed content in order to help users expand our free knowledge. To use MinT we are not adding any proprietary software in the Content Translation code, or on the Wikimedia websites and servers.



Is there any risk to my personal information when using MinT?
Irrespective of the service being used, you can be sure that only Wikipedia content from existing articles is sent and only freely licensed content will be added back to the translation. Communication with those services happens at the server side, so they are isolated from the user device and they have no access to nonpublic personal information of users. Please refer to this diagram for more details.
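
One way to picture this server-side isolation is an allow-list: the server builds the request for the MT service from a fixed set of content fields and drops everything else. The sketch below is a hypothetical illustration of that pattern, not the actual client code; all field and function names are assumptions.

```python
from typing import Dict

# Assumed set of fields the MT service needs. Anything else (for example a
# username or an IP address) is never copied into the outgoing request.
ALLOWED_FIELDS = ("source_language", "target_language", "html")


def build_mt_request(client_request: Dict[str, str]) -> Dict[str, str]:
    """Build the payload for the MT service from allow-listed fields only."""
    missing = [name for name in ALLOWED_FIELDS if name not in client_request]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return {name: client_request[name] for name in ALLOWED_FIELDS}
```

Because the payload is built from the allow-list rather than by deleting known sensitive fields, a new field added to the client request is excluded by default.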



What if MinT is the only machine translation tool available and I don't want to use it?
Machine Translation is an optional feature in Content Translation that you can easily disable at will. If more machine translation systems are added for your languages, you can choose to enable MT again and select the MT service of your choice.



Is it free to use MinT machine translation on Wikipedia?
Yes. The content received from MinT is otherwise freely available on the web translation platform. For ease of use Content Translation receives it via an API to make it seamlessly available on the translation interface. This content can be modified by the users (if necessary) and used in Wikipedia articles under free licenses.



Can this content be used to improve machine translation in general?
Yes. Translations made in Content Translation are saved in our database. This information will be made publicly available for anyone to use as translation examples to improve their translation services (university research groups, open source projects, commercial companies, anyone!). The content can be accessed via the Content Translation API. Please note that only information related to translated text is publicly available: the source and translated text, the source and target language, and an identifier for the segment of text.
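
As a rough illustration of how such data could be consumed, the sketch below models one public record with the kinds of information listed above and extracts sentence pairs for one language pair. The class, field, and function names are hypothetical and do not describe the actual response format of the Content Translation API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class TranslationSegment:
    """One public record; field names are assumed for illustration."""
    segment_id: str
    source_language: str
    target_language: str
    source_text: str
    translated_text: str


def to_public_json(segment: TranslationSegment) -> str:
    """Serialize a record, keeping non-ASCII text readable."""
    return json.dumps(asdict(segment), ensure_ascii=False, sort_keys=True)


def training_pairs(segments: Iterable[TranslationSegment],
                   source: str, target: str) -> List[Tuple[str, str]]:
    """Collect (source text, translated text) pairs for one language pair."""
    return [
        (s.source_text, s.translated_text)
        for s in segments
        if s.source_language == source and s.target_language == target
    ]
```

A research group could, for example, filter the records down to one language pair with `training_pairs` and use the result as parallel examples for training or evaluation.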