Mimihitam (talkcontribs)

Hello, I would like to request the removal of the machine translation feature in Content Translation, only for the Indonesian Wikipedia. I've complained here twice before about the horrendous quality of Yandex Translate. Since then, nothing has changed, and more and more people are creating articles with Yandex Translate without making any adjustments. As a result, I am deleting dozens of articles, which is not good for our community. Look at this: As I predicted, someone will eventually come and say "then why did you provide for such feature in Content Translation?" Please, please, please, please, please, we can't put up with this Yandex nonsense anymore, so could you please kindly remove or disable it from the Indonesian Wikipedia? Thank you.

Pginer-WMF (talkcontribs)

Thanks for sharing your feedback, @Mimihitam.

I'm sorry to hear that the machine translation support provided is causing additional clean-up work for the Indonesian community. We have been making improvements in different areas that are relevant for this. Although those improvements are not visible at the moment, they will become visible soon.

We are currently working on a new version of Content translation. The new version provides new mechanisms and more control to prevent machine translation from being left unreviewed: a warning when a paragraph contains too much unmodified text, so editors know which parts need review; a category for keeping track of articles published despite the warning (this is the tracking category for Indonesian); and a block on publishing when the amount of unmodified machine translation is really high.

We think that these mechanisms are a better option than disabling machine translation, since they keep the tools available to users who use them properly while preventing others from creating low-quality translations. We can adjust the thresholds, which set the percentage of unreviewed machine translation allowed in each case, but for that we need the community to be able to use the services and provide feedback.
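To make the threshold idea concrete, here is a rough sketch of how a paragraph-level warning and an article-level publishing block could work together. It is only an illustration: the threshold values, the function names, and the character-matching approach via Python's `difflib` are all assumptions, not how Content translation actually measures unmodified text.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, NOT the values Content translation actually uses.
WARN_THRESHOLD = 0.85   # paragraph level: warn above this share of unmodified MT
BLOCK_THRESHOLD = 0.95  # article level: refuse to publish above this share


def unmodified_share(machine_text: str, edited_text: str) -> float:
    """Approximate share of the machine translation left unchanged (0.0 to 1.0)."""
    if not machine_text:
        return 0.0
    matcher = SequenceMatcher(None, machine_text, edited_text, autojunk=False)
    # Sum the lengths of all character runs that survive in the edited text.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(machine_text)


def review(paragraphs):
    """paragraphs: list of (machine_text, edited_text) pairs.

    Returns (indexes of paragraphs that need a warning, whether publishing is blocked).
    """
    warnings = [
        i for i, (mt, edited) in enumerate(paragraphs)
        if unmodified_share(mt, edited) > WARN_THRESHOLD
    ]
    total_mt = sum(len(mt) for mt, _ in paragraphs)
    total_unmodified = sum(
        unmodified_share(mt, edited) * len(mt) for mt, edited in paragraphs
    )
    overall = total_unmodified / total_mt if total_mt else 0.0
    return warnings, overall > BLOCK_THRESHOLD
```

With a setup like this, raising or lowering the two constants is the kind of per-wiki adjustment described above: a stricter wiki would use lower values so that less unreviewed machine translation gets through.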

The new version of Content Translation will become the default at the end of December. As next steps, I'd propose evaluating whether the problems persist with the new version, and increasing the thresholds if they do. Meanwhile, we'll keep an eye on the Content translation statistics for Indonesian to track the number of deleted translations. For example, last week 3 of the 119 articles created with Content translation were deleted (about 2.5%). This week the deletion rate seems to have increased so far to about 8% (90 published, 8 deleted), which we want to improve but which does not look alarming.
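For anyone who wants to check these figures, the deletion rate is simply deleted translations divided by translations created. The short sketch below reproduces the arithmetic; the function name is made up for illustration, and because it is not stated whether "90 published" includes the 8 deleted articles, both readings are computed.

```python
def deletion_rate(deleted: int, total: int) -> float:
    """Return the percentage of translations that were deleted."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * deleted / total


# Last week: 3 deleted out of 119 created.
last_week = deletion_rate(3, 119)

# This week, reading 1: the 90 published articles include the 8 deleted ones.
this_week_inclusive = deletion_rate(8, 90)

# This week, reading 2: 90 articles remain published and 8 more were deleted.
this_week_exclusive = deletion_rate(8, 90 + 8)

print(f"last week: {last_week:.1f}%")
print(f"this week (8 of 90): {this_week_inclusive:.1f}%")
print(f"this week (8 of 98): {this_week_exclusive:.1f}%")
```

Either reading gives a figure between 8% and 9% for this week, compared with roughly 2.5% last week.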

In addition, we are always exploring ways to improve the machine translation support by integrating better services, and your feedback about the translation quality is very helpful for us. Apart from adding new services, using the existing ones also helps them to improve since user corrections made to the initial machine translation are exposed through a public API where anyone (including Yandex) can access them to improve their services or create new ones.


