Talk:Content translation

About this board

Please provide feedback about the Content translation tool on this page.

We suggest checking the Frequently Asked Questions page first.

When reporting a bug, it helps a lot if you indicate the following:

  • Which article were you translating and to which language
  • Which browser did you use (Firefox, Chrome, Microsoft Internet Explorer, Safari, Opera, etc.)
  • If you're getting any errors, please provide the log from the browser console if possible. To open the browser console, press Ctrl+Shift+J (or Cmd+Alt+J on a Mac).

If you are familiar with Phabricator, please consider reporting the bug there.


Big issue from lij.wiki

N.Longo (talkcontribs)

Hi! I want to report a major problem on the Ligurian-language Wikipedia (I think it may be common to other wikis as well) that makes machine translation useless, with little possibility of improvement. On lij.wiki we want to reflect the great variability of our language: we currently have pages in 19 macro-varieties (plus a number of sub-varieties) and, since Ligurian does not have a common writing standard, in at least 25 different orthographies. This, in addition to the very poor quality of the translated text, makes the tool completely useless: the AI produces a weird mixture of all the varieties and spellings, and it takes far longer to correct the machine-translated text than to write it manually.

If the translator could be differentiated by variety and orthography, perhaps it could be of some use (though I doubt it). As things stand, it is particularly harmful for an endangered language such as ours: since it is the only form of automatic translation available online, there is a serious risk that it will be used to translate texts into Ligurian (with the results that can be seen).

For these reasons, and considering that lij.wiki and lij.wikisource are the largest collections of Ligurian-language texts available online (so I suspect the AI itself relies heavily on them to produce texts), I strongly believe that it is much better to deactivate this function on our wiki. Best regards.

Amire80 (talkcontribs)

Hi,

Thanks for the question.

Were any articles published with bad translation?

N.Longo (talkcontribs)

@Amire80 Hi, and thanks for your quick answer. I noticed these problems while testing the new feature after our Wikipedia was informed of it. I have not published any translations, for the reasons above.

If you are interested in a specific example of this wrong logic: the AI uses the preterite, which has not been used in the Ligurian language for at least a century, probably because of the older texts on our Wikisource (written in the orthographies of that time, which differ from those we use!).

(hope I have made myself clear, I'm not so fluent in English)

Amire80 (talkcontribs)

Thanks for the clarification. I just wanted to check whether the complaint is only about the output of the machine translation or also about bad published articles.

Can this machine translation output be at least somewhat useful to some people as a first draft of a translation that can be corrected?

Luensu1959 (talkcontribs)

Basically, I share the opinion of my fellow admin. However, as we check our Ligurian pages daily and discourage people who do not speak the language from creating pages by writing randomly, I would leave the system active for some time and see whether it can be improved somehow. In the meantime, we'll ask other people in our community.

Amire80 (talkcontribs)

Thanks for the positivity :)

N.Longo (talkcontribs)

@Amire80 I personally think not, since, considering all the changes you have to make to the text, it is basically the same as using the translator from before the AI. Anyway, this is just my opinion, so I'll open a local discussion about this topic to get more feedback.

The main problem is that there are few contributors on lij.wiki. If a user who does not know the language tries to create a page with automatic translation, we will likely have to delete it, which may lead to some unpleasant dynamics.


Visual editing style is too limiting and confusing

Primium (talkcontribs)

I'm trying to translate a page from French to English, but the visual editing interface is way too limiting and confusing to use for anything more complex than editing a sentence. If there's any way to change to a "source editing" style, I can't seem to find it.

UOzurumba (WMF) (talkcontribs)

Hello Primium,

Unfortunately, you can't change to the "source editing" style when using the Content translation tool. You can only use it after you have published your translation, just like editing any article. Thank you!

LWChris (talkcontribs)

I would like to second this: the translation tool should definitely get an option to edit in source mode. I do not trust the visual editor, and while it has some useful features, it feels cumbersome to edit visually if you know the markup by heart.

Edward-Woodrow (talkcontribs)

Please, please, add a source mode. As @Primium said, the visual editor is quite clunky (how do you even add headers?), and some templates are quite hard to manage. A source mode would make this so much easier, as it allows users to fine-tune formatting, etc. Edward-Woodrow (talk) 22:53, 5 June 2023 (UTC)


Add Santali language to FLORES Supported languages

Rocky 734 (talkcontribs)

Is it possible to use machine translation for Santali Wikipedia in Content Translation using NLLB-200? I have read their blog and found Santali listed as a supported language among the 204 languages (https://github.com/facebookresearch/flores/blob/main/flores200/README.md).

Even one machine translation tool offering only experimental or intermediate-quality suggestions would be a great help, as until now there has been no machine translation tool for the Santali language. It would help us translate more articles easily with the Content Translation tool.

There is no objection in our community to adding machine translation to the Content Translation tool.

Pginer-WMF (talkcontribs)

Thanks for the feedback, @Rocky 734.

Santali is not supported right now, but we are exploring how to support more languages (especially those lacking any machine translation support).

Not all languages supported by NLLB-200 are available right now in Content Translation. The researchers developing NLLB-200 created an API to make their research available for the translation of Wikipedia articles in a small set of languages for evaluation purposes (more details in this page).

Now that the translation models have been released under an open license, we are exploring ways to expand support to more languages in Wikipedia translation tools. Hearing about the need and interest from the Santali community is really useful for us in planning the next steps.

Thanks!

Rocky 734 (talkcontribs)

Thanks for the reply, @Pginer-WMF. It's clear to me now that not all languages are available in Content Translation. Playing with this new website to translate text from eng_latin to sat_Beng (sat_Beng is the wrong script code; it should be sat_Olck, see https://github.com/facebookresearch/fairseq/pull/4576) increased my confidence that the model is quite accurate and fair.


I also remember another project: as a non-programmer, I tried to improve the eng-sat pair in Apertium (https://beta.apertium.org/index.sat.html#?dir=eng-sat&q=blue%20house.%0A). It is still in beta; I hope it may be integrated with the CT tool in the future. :-)


Website for testing NLLB: https://huggingface.co/spaces/Narrativaai/NLLB-Translator

Screenshot for NLLB: https://snipboard.io/5HLE60.jpg

Rocky 734 (talkcontribs)
Pginer-WMF (talkcontribs)

Hi @Rocky 734,

The Wikimedia Machine Learning team is exploring how to create an instance running the NLLB-200 models. This would make it possible to support more of the languages that the model supports but that are not yet available through the current API, such as Santali. You can check this ticket for more details and to track progress. I mentioned the case of Santali to make sure it is captured in the ticket.


Thanks!

Rocky 734 (talkcontribs)

I apologize for the delay in my response. Thank you so much for the update.🙏🙏


Please do not overwrite existing articles

Matthias M. (talkcontribs)

The feature seems to encourage authors to overwrite existing articles. See de:WP:RC#Estetrol for an example, which got reverted.


Change the needed percentage to allow publication

Klein Muçi (talkcontribs)

Hello!


In SqWiki we use the CX tool a lot to create new articles. However, the percentage of changed text required to publish an article has made it very hard for us to use lately. Google Translate has improved a lot recently, and the translation it provides is most of the time very close to perfect. We are often literally forced to remove parts of sentences just to get past the change requirement for publication, and those parts are reintroduced once the article is published. Is there a way to change this?

UOzurumba (WMF) (talkcontribs)

Hello Klein Muçi,

If the machine translation limit is too strict for Albanian Wikipedia, you can discuss it with the community at the Village pump. Present different samples of the automatic translation as case studies, and agree on a suitable threshold by suggesting translation limits based on the accuracy of the automatic translation and on how much translators edit the machine translation.

Once there is a consensus on the percentage of adjustment (e.g. 90% automatic translation and 10% adjustment), you can create a Phabricator ticket like this one and ping me so I can inform the WMF Language team. I hope my reply helps you.

Thank you!
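
To make the "90% automatic translation and 10% adjustment" example above more concrete, here is a minimal Python sketch of what such a threshold means. It is only an illustration under stated assumptions, not how Content Translation actually measures edits: the function names `unmodified_share` and `may_publish`, the word-level comparison, and the 0.90 default are all hypothetical.

```python
from difflib import SequenceMatcher


def unmodified_share(machine_text: str, published_text: str) -> float:
    """Return the fraction of machine-translated words kept unchanged.

    Hypothetical helper: compares the two texts word by word.
    """
    machine_words = machine_text.split()
    published_words = published_text.split()
    if not machine_words:
        return 0.0
    matcher = SequenceMatcher(a=machine_words, b=published_words, autojunk=False)
    # Sum the sizes of all matching runs of words (the final dummy block has size 0).
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(machine_words)


def may_publish(machine_text: str, published_text: str,
                max_unmodified: float = 0.90) -> bool:
    """Allow publication if at most `max_unmodified` of the machine output is kept.

    The 0.90 default mirrors the example figure in the discussion; a real
    threshold would be whatever the community agrees on.
    """
    return unmodified_share(machine_text, published_text) <= max_unmodified
```

With this sketch, a ten-word machine translation published with one word changed keeps 90% of the machine output and would pass a 90% limit, while an unedited copy would not.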

Klein Muçi (talkcontribs)

Okay, thanks a lot! I'll go through that process and notify you back on Phabricator.

Novice22 (talkcontribs)

Certain lists and disambiguation pages have little content other than references and brief descriptions. Machine translation is usually sufficient for such pages. However, this produces an error saying that the article is too short and the translation too close to the machine translation, which prevents publication. Please provide a means to publish such articles, or clear and well-defined ways to bring them up to your publication standards.

Additionally, errors and warnings cannot be seen, as clicking on the list of errors immediately expands and collapses the dropdown.


Published translation

Edouard Valensi (talkcontribs)

I just published a French translation of a biography of a notable person, translated from the English version on Wikipedia. I received a notification congratulating me on my contribution. However, I cannot find it when I search for the exact title on French Wikipedia. How long does it take for it to appear on French Wikipedia?

Thank you in advance for your reply.


Call for contributions to translate an article

Achille Watergutt (talkcontribs)

Could someone who is a veteran Wikipedian enable the translation into English of the article "Mathilde Delattre"? Thanks in advance! eric


Translation for Santali Language

RIT RAJARSHI (talkcontribs)

On Santali Wikipedia (https://sat.wikipedia.org), there currently seems to be no machine translation feature: the tool just adds the original paragraphs without any kind of translation. Unfortunately, Google does not provide any translation for this language. It would be great if an automatic translation feature existed for this language.

See also: https://sat.wikipedia.org/w/index.php?title=%E1%B1%9F%E1%B1%A5%E1%B1%9A%E1%B1%A0%E1%B1%9F%E1%B1%AD:ContentTranslation

Thanks in advance

RIT RAJARSHI (talk) 22:20, 12 September 2020 (UTC)

R Ashwani Banjan Murmu (talkcontribs)

Thanks, @RIT RAJARSHI, for your suggestion. We will initiate this process and will definitely work on it. We will be in touch with you and look forward to your support.

RIT RAJARSHI (talkcontribs)
Amire80 (talkcontribs)

@RIT RAJARSHI, there is no machine translation for Santali anywhere on the web, unfortunately. This is not something that the developers of Content Translation can do all by themselves.

If anyone ever develops a machine translation engine for Santali, we may try to integrate it here.

Until this happens, Content Translation is usable without machine translation and provides useful features: display of the source and the translation side-by-side, one-click image and link adaptation, some template adaptation, and more. Indeed, Content Translation was used to translate more than 90 articles into Santali, and I hope that you use it to translate more, even if machine translation is not available now :)

See these pages for further information:

RIT RAJARSHI (talkcontribs)
Reply to "Translation for Santali Language"
Qazrix99 (talkcontribs)

I am trying to translate into Spanish, and no matter how much I condense the text, I can't get the program to accept it. @Mentxuwiki


Reply to "Traduccion"
86.105.140.237 (talkcontribs)

Hello, it gives me an incorrect error when publishing the translation

UOzurumba (WMF) (talkcontribs)

Hello 86.105.140.237,

It will help a lot if you indicate the following things:

  • Which article were you translating, and to which language
  • Which browser did you use (Firefox, Chrome, Microsoft Internet Explorer, Safari, Opera, etc.)
  • If you're getting any errors, please provide the log from the browser console.

Providing the above information will help us understand and fix the problem.

Thank you!

Reply to "Error"