Help:Content translation/Translating/Translation quality/ja

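Before publishing a translation, it is essential to review its content. You need to make sure that the meaning of the original has not been misunderstood and that the text reads naturally in the target language. Machine translation provides an initial translation that helps speed up the work, but users of the tool are expected to review and edit that content thoroughly.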

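Several mechanisms are in place to make sure translators edit the initial translation adequately. The translation editor tracks how much users modify the initial translation and, based on a set of limits, either prevents publishing or shows a warning that encourages users to review the content further.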

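In this way, the tool can offer machine translation to users who make good use of it, while preventing the creation of low-quality translations that have not been reviewed enough. Details on how these limits work, how to adjust them to the needs of each language, and how to evaluate the quality of content created with the tool are given below.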

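Limits to encourage review of the initial translation
Content translation measures the percentage of modifications that users make to the initial translation provided by machine translation. To do so, the system tracks how many characters of the initial translation were added, removed, or rewritten. These measurements are taken at two levels: the individual paragraph and the whole translation. Different limits apply at each level, as detailed below.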
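
The measurement described above can be illustrated with a minimal sketch. It is not the algorithm Content translation actually uses: the function names are hypothetical, and the choice of `difflib.SequenceMatcher` and of dividing by the longer of the two texts are assumptions made only to show how added, removed, and rewritten characters can all lower the share of unmodified text.

```python
from difflib import SequenceMatcher


def kept_characters(initial: str, edited: str) -> int:
    """Count characters shared, in order, by the initial and the edited text."""
    matcher = SequenceMatcher(None, initial, edited, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def unmodified_ratio(initial: str, edited: str) -> float:
    """Share of unmodified text for one paragraph (0.0 to 1.0).

    Assumption: dividing by the longer text means that both additions
    and removals reduce the ratio.
    """
    longest = max(len(initial), len(edited))
    if longest == 0:
        return 0.0
    return kept_characters(initial, edited) / longest


def translation_unmodified_ratio(pairs: list[tuple[str, str]]) -> float:
    """Share of unmodified text across all (initial, edited) paragraph pairs."""
    kept = sum(kept_characters(initial, edited) for initial, edited in pairs)
    longest = sum(max(len(initial), len(edited)) for initial, edited in pairs)
    return kept / longest if longest else 0.0
```

The per-paragraph function corresponds to the paragraph-level measurement, and the aggregate function to the whole-translation measurement.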

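Limits for the whole translation
Publishing is prevented when more than 99% of the whole translation is unmodified machine translation. This limit is meant to block obvious abuse: it stops users who simply add content to the translation and try to publish it without editing the initial translation at all. As described below, this limit can be adjusted for each language edition.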

Limits for each paragraph
The percentage of modifications made to the initial translation is also measured for each paragraph. A paragraph is considered problematic when it contains more than 80% of the initial machine translation (or, when the content was copied from the source document, more than 60% of unmodified content).

The translation editor will show a warning for each paragraph that is considered problematic, encouraging the user to edit it further. In some cases users are still able to publish, but the resulting page may get added to a tracking category of potentially unreviewed translations for the community to review. In other cases, users may not be allowed to publish.
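
As a minimal sketch of the paragraph check, assuming the share of unmodified text has already been computed as a number between 0 and 1: the constant names, the function name, and the `"mt"` / `"source"` labels are hypothetical, and only the 80% and 60% values come from the text above.

```python
# Thresholds quoted in the text; the identifiers are illustrative only.
MT_LIMIT = 0.80      # paragraph started from machine translation
SOURCE_LIMIT = 0.60  # paragraph started from a copy of the source content


def is_problematic(unmodified_ratio: float, started_from: str) -> bool:
    """Return True when a paragraph keeps more unmodified text than allowed.

    started_from is "mt" or "source" (hypothetical labels).
    """
    limits = {"mt": MT_LIMIT, "source": SOURCE_LIMIT}
    return unmodified_ratio > limits[started_from]
```

A paragraph that keeps 65% of its starting text would therefore trigger a warning if it began as a copy of the source, but not if it began as machine translation.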

These are some of the factors considered for determining whether to allow the user to publish or not (some of these are still in development):


 * The number of problematic paragraphs. Users are prevented from publishing translations with 50 or more problematic paragraphs. Users can still publish translations with fewer than 50 problematic paragraphs, but translations with 10 to 49 problematic paragraphs will be added to a tracking category of potentially unreviewed translations for the community to review.
 * Previous deleted translations. For users who had translations deleted in the last 30 days, the limits are much stricter to prevent recurring problems. In those cases, translations with 10 or more problematic paragraphs will be prevented from publishing, while those with 9 or fewer problematic paragraphs will be added to a tracking category of potentially unreviewed translations for the community to review.
 * User confirmation. When the unmodified content warning was shown for a paragraph but the user marked it as resolved, this is taken as a signal that the user reviewed and confirmed the translation, and a less strict threshold applies (accepting up to 95% of machine translation or 75% of source content). This accommodates cases where the automatic translation was exceptionally good while still avoiding potential abuse of the feature (that is, the user confirmation is not followed blindly).
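
The factors listed above can be sketched as a small decision function. This is a sketch under stated assumptions rather than the tool's implementation: the function names and return labels are hypothetical, and since the text notes that some of these factors are still in development, the real behavior may differ. One further assumption is labeled in the code: a translation with no problematic paragraphs is never tracked.

```python
# Counts and percentages come from the text; all identifiers are illustrative.
BLOCK_AT = 50                  # regular users
BLOCK_AT_AFTER_DELETIONS = 10  # users with translations deleted in the last 30 days
TRACK_FROM = 10                # regular users: 10 to 49 problematic paragraphs


def publish_decision(problematic_count: int, recent_deletions: bool = False) -> str:
    """Return "blocked", "tracked" or "published" for a translation."""
    if recent_deletions:
        if problematic_count >= BLOCK_AT_AFTER_DELETIONS:
            return "blocked"
        # Assumption: at least one problematic paragraph is needed for tracking.
        return "tracked" if problematic_count > 0 else "published"
    if problematic_count >= BLOCK_AT:
        return "blocked"
    if problematic_count >= TRACK_FROM:
        return "tracked"
    return "published"


def paragraph_limit(started_from: str, marked_resolved: bool) -> float:
    """Allowed share of unmodified text; relaxed once the user confirms."""
    strict = {"mt": 0.80, "source": 0.60}
    relaxed = {"mt": 0.95, "source": 0.75}
    return (relaxed if marked_resolved else strict)[started_from]
```

For example, `publish_decision(12)` yields `"tracked"` for a regular user, while `publish_decision(12, recent_deletions=True)` yields `"blocked"`.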

Contents not affected by the limits
Some contents are not expected to be edited significantly, and they are not considered when applying the limits described above. Very short section titles, citations, or the list of references are excluded from the checks. Otherwise, users may get misleading warnings because of contents such as the book titles in their references that they were not expected to translate.

Adjusting the limits
The limits described above provide a set of general mechanisms, but they may need adjustments to the particular needs of each wiki. Based on initial evaluations, the amount of modifications needed to the initial machine translation can range from 10% to 70% depending on the language pair. On some wikis the default limits may be too strict, generating unnecessary noise or preventing perfectly valid translations from publishing. On other wikis, the limits may not be strict enough, allowing the publication of translations that are not edited enough.

Adjusting the different thresholds makes these limits more or less strict according to the needs of each wiki. Feedback from native speakers is essential to adjust the limits properly. If the current limits don't seem to work well based on your experience creating or reviewing translations, please share your feedback and we can explore how to better adjust them.

When providing feedback about adjusting the thresholds we recommend trying to create several example translations (make sure to check the publishing options if your test is not intended to be published as regular content). When testing how the limits work for your language, it is useful to keep in mind the following:


 * Check for both cases. Make sure to check how the limits work for translations where the content has not been edited enough and also for those where the initial translation has been edited enough. In this way you can more easily find the right balance for the limits. Checking only one type of problem can lead you to suggest moving the thresholds too far in the opposite direction.
 * Check different contents. Content in a wiki is very diverse, and machine translation may work much better for some cases compared to others. For example, content full of numeric data or technical names may require less editing by users than content with more descriptive text. Make sure to test with different articles, making translations of different lengths and involving different types of content.
 * Prepare to iterate. Adjusting the thresholds is an iterative process. It may require a custom adjustment to the thresholds or an improvement to the general approach. In any case, further testing may be needed after each change to verify the improvements.

Adjusting the limits in collaboration with editors has proven to be effective. For example, initial results show that the Indonesian community has significantly reduced the number of problematic translations by restricting the publication of translations with more than 40% of unmodified machine translation. No automatic tool is infallible, and these limits are no exception.

The process of content review by the community is still essential, but these limits provide communities with tools to reduce the number of translations they have to focus on, making the review process much more effective. Please share your feedback and we can explore how to better adjust them.

Tracking potentially unreviewed translations
A tracking category with the name "cx-unreviewed-translation-category" is provided for the community to easily find the articles that were published with some content exceeding the recommended limits.

You can find this category in the list of tracking categories on each wiki. It lists the articles that passed the limits that prevent publishing but still had some paragraphs that were edited less than expected. For example, the Indonesian category includes articles that have less than 40% of machine translation overall but have some paragraphs with more than 80% of unmodified machine translation.
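
The Indonesian example can be sketched as follows. The per-wiki table and the function are hypothetical and do not reflect the real configuration format. The 99% default, the 40% Indonesian limit, and the 80% paragraph limit come from the text. For simplicity, this sketch ignores the number of problematic paragraphs and tracks a translation as soon as one paragraph exceeds the paragraph limit.

```python
# Illustrative per-wiki overrides; not the real configuration format.
DEFAULT_OVERALL_LIMIT = 0.99
OVERALL_LIMIT_BY_WIKI = {"id": 0.40}  # Indonesian, as quoted in the text
PARAGRAPH_LIMIT = 0.80


def classify(wiki: str, overall_ratio: float, paragraph_ratios: list[float]) -> str:
    """Return "blocked", "tracked" or "published" for one translation."""
    overall_limit = OVERALL_LIMIT_BY_WIKI.get(wiki, DEFAULT_OVERALL_LIMIT)
    if overall_ratio > overall_limit:
        return "blocked"
    if any(ratio > PARAGRAPH_LIMIT for ratio in paragraph_ratios):
        return "tracked"
    return "published"
```

With these values, an Indonesian translation that keeps 35% of the machine translation overall but has one paragraph at 90% is published and added to the tracking category, while one that keeps 45% overall cannot be published.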

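Measuring translation quality
Evaluating content quality automatically is not easy. Deletion rates are a useful proxy: content that the editing community has not deleted can be assumed to meet a reasonable level of quality. Analysis of deletion rates shows that articles created as translations are deleted less often than articles written from scratch. This suggests that it makes little sense to restrict translation more heavily than other ways of creating articles.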

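Translations published with Content translation are marked with the contenttranslation edit tag, so communities can use Recent changes and other tools to focus on translations made with the tool. In addition, data about published translations and statistics on machine translation usage are publicly available for anyone to analyze.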
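
As one way to follow tagged translations programmatically, the sketch below builds a MediaWiki Action API request that lists recent changes carrying the contenttranslation tag. The function name is hypothetical, and the hostname pattern assumes the wiki is a Wikipedia language edition. The code only builds the URL and does not send the request.

```python
from urllib.parse import urlencode


def recent_translations_url(language: str, limit: int = 10) -> str:
    """Build a recent-changes query filtered by the contenttranslation tag."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": "contenttranslation",  # edit tag named in the text
        "rcprop": "title|timestamp|user",
        "rclimit": limit,
        "format": "json",
    }
    # Assumption: the wiki is reachable at <language>.wikipedia.org.
    return f"https://{language}.wikipedia.org/w/api.php?{urlencode(params)}"
```

Fetching the resulting URL with any HTTP client returns a JSON list of tagged changes that reviewers can work through.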

Other limits based on user experience


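To reduce the creation of low-quality translations, some wikis apply additional restrictions on translation based on user rights. For example, the English Wikipedia restricts the tool to extended confirmed users: users with fewer than 500 edits on the English Wikipedia are not allowed to publish translations as articles. Newer editors can still publish their translations outside the article namespace and move the page to the article namespace afterwards.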

These restrictions predate the limits described on this page, and they are not the recommended way to encourage high-quality translations.

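Before setting limits that do not take into account the content being created, consider limiting unedited content as described above. Those limits can be made as strict as needed to prevent low-quality translations while still leaving a way for editors to create and publish good ones.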