Talk:Content translation


About this board

Provide feedback about the Content translation tool on this page.

We suggest checking the Frequently Asked Questions page first.

When reporting a bug, it helps us a lot if you indicate the following:

  • Which article were you translating and to which language
  • Which browser did you use (Firefox, Chrome, Microsoft Internet Explorer, Safari, Opera, etc.)

If you are familiar with Phabricator, please consider reporting the bug there.

Juandev (talkcontribs)

Why doesn't the translation from Czech to English work? I have Czech in both columns.

KartikMistry (talkcontribs)

Thanks for using Content Translation! Machine translation to English is not available. But you can still use Content Translation to make translating easier with the available tools, such as reference, image and link adaptation.

Reply to "cs-en doesnt work"
Andrei Stroe (talkcontribs)

At this one: https://ro.wikipedia.org/wiki/Special:ContentTranslation?title=Special:ContentTranslation&campaign=contributionsmenu&to=ro&page=Georg+Cantor&from=en&targettitle=Georg+Cantor

It cannot load the translation, and a few paragraphs I had translated during the previous run are no longer there. The connection between source and translated paragraphs is also completely lost.

Sthelen.aqua (talkcontribs)

I had the same problem with a German-to-Ukrainian translation of de:Borkum: all my corrected paragraphs were lost and the initial machine translation appeared instead. I got mad and deleted the translation entirely.

Reply to "Data loss, again"
Triplecaña (talkcontribs)

I have a locked translation. It says another user is translating the article. The problem is that this user has only one edit and has been inactive since March 2016, so there is no way to contact him. I guess I could delete the ongoing translation, since I can't translate at all, but the delete option is not working, so that ongoing translation will always be stuck there in the list. Could you please provide a workaround for this? For example, a user who has been inactive for 3 months could lose all his or her ongoing translations. Or at least offer a hard-delete option.

Reply to "Locked ongoing translation"
Triplecaña (talkcontribs)

From en-es

Bour studied at both the University and the Conservatoire of Strasbourg.

Bour estudió en la Universidad y el Conservatorio de Estrasburgo

When I tap "Mark as missing", the target of the red wikilink is not translated: the link still points to Conservatoire of Strasbourg but displays Conservatorio de Estrasburgo. You have to mark the link as missing, then remove it, and then add the internal link manually. This works great for people's names that need no translation, but in this case it's tiresome. Please fix.
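The manual workaround above amounts to replacing a piped link with a plain link to the translated title. Here is a minimal sketch of that rewrite, assuming the tool emits wikitext of the form `[[Source title|Translated label]]` for links marked as missing (that output form is an assumption, and `retarget_missing_link` is a hypothetical helper, not part of Content Translation):

```python
import re


def retarget_missing_link(wikitext: str, source_title: str) -> str:
    """Turn [[source_title|Label]] into [[Label]] for one known-missing link.

    Only links whose target equals source_title are touched, so working
    links elsewhere in the text are left alone.
    """
    pattern = re.compile(r"\[\[" + re.escape(source_title) + r"\|([^\]|]+)\]\]")
    return pattern.sub(lambda m: "[[" + m.group(1) + "]]", wikitext)


# Assumed wikitext for the sentence quoted above.
sentence = (
    "Bour estudió en la Universidad y el "
    "[[Conservatoire of Strasbourg|Conservatorio de Estrasburgo]]"
)
fixed = retarget_missing_link(sentence, "Conservatoire of Strasbourg")
print(fixed)
```

The function is restricted to a single source title on purpose: rewriting every piped link would also break links that already point to an existing article under a different name.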

Reply to "Internal links"

Need a way to edit in source code mode before publishing

Fireattack (talkcontribs)

As the title says. The visual editor is fine for most things, but some can only be done, or are much easier to do, in source code mode.

Sometimes errors in templates even prevent me from publishing my translation, and without a proper source code mode it's extremely hard to fix these errors, so I can't even "publish first, then edit in source code mode later".

Halibutt (talkcontribs)

A workaround: translate only the text, publish to your draft space, say User:Fireattack/Article title. Then open in source code, clean up and move. But other than that you're right, some easier debugging of the text would be great.

RammGmbH (talkcontribs)

Please, this would be a great addition. The visual editor should have some way to add code to the translation.

Reply to "Need a way to edit in source code mode before publishing"
Joutbis (talkcontribs)

Hey,

For the last few weeks there has been a debate in the Catalan Wikipedia regarding the use of this translator. Right now, there is a bug in the generation of references (or templates) which nullifies the only "working feature" the tool had left. All the "cite web" references become the string "error in title or URL". Nothing remains of the original template. You can see an example in w:ca:Usuari:Oriololmo/Crisi_del_PSOE_de_2016.

Lately we have also been seeing problems with the translation of internal links: the link correctly points to an existing page, but the displayed text remains in the original language. Sometimes the link even points to the original Wikipedia.

The quality of the language generated is certainly awful. It's hard to convey that in another language, but I will give you some examples:

  • "ARE mRNA" becomes "PLOUGH mRNA"
  • "Your NO counts" becomes "Your NO explains"
  • "This article is a stub [the term for short articles]" becomes "This article is a stub [like in the cigarettes]"
  • "The New York Times" becomes "The New York You swindle"
  • "The Time magazine" becomes "The he swindles magazine"
  • "Van Morrison" becomes "Van [as in the car type] Morrison"
  • "More Dirty Debutantes" becomes "Live in Dirty Debutantes"
  • "Who You Really Are" becomes "Who You Really let him Plough"

The English-Catalan translator is just painfully wrong, and many people don't bother to correct the errors. Catalan is quite similar to Spanish, so many people just stick to translating from this language. But the thing is, most of the examples above are from Spanish-Catalan translations, which people just don't double-check due to the mentioned similarities.

Sure, it is just a tool, and people are supposed to proofread the translations, but when this kind of nonsense is in many translations, you begin to wonder about its purpose. Make no mistake, I wish we had an automatic translator which worked 100% of the time, and I get it, this is a tool which is being developed, but this kind of stuff just can't make it into Wikipedia in its current state. And again, some users *do* bother to correct those, mainly experienced ones. But we've also had way too many 'one-hit wonders' just come in, click on the paragraphs, publish the translation and leave Wikipedia. Heck, we've even had non-Catalan speakers translate into Catalan!

So, there has been an effort to move these kinds of translations to the user namespace. But the time we have to devote to this is time taken away from other work.

A user has come up with a proposal to temporarily disable the translator until the references/template bug is fixed, and, once it is solved, to limit the usage of this translator to certain users. These could be either Autopatrolled users or a list of users who have proved they can properly use this tool, in a similar fashion to Auto Wiki Browser.

Joutbis (talkcontribs)

OK, I reported two bugs and made a request:

  • references get lost
  • internal links are messed up and not translated
  • please let us restrict the tool somehow.

And not even an answer? Please, do something about it.

Amire80 (talkcontribs)

I actually know Catalan, so you can write in Catalan :)

Machine translation is never perfect. It must always be fixed. Users are not supposed to publish articles with machine translation without correction. If people do it, it's the same as vandalism, and can be deleted if needed. This is especially true for "one-hit wonders".

For problems with citations and templates—can you please give me examples of particular articles where this happened?

Townie (talkcontribs)

Sure, here you go: 624 articles to choose from.

Joutbis (talkcontribs)

As for the link problem, see w:Ca:Volcà de Colima and look for "Pompeya" and "Herculano". The link has the Catalan word and links to the Catalan entry, but the spelling is Spanish. Some administrator may move it to user-space soon, so watch out. You also have the article on top of this thread, with over 200 references, and almost all of them wrong.

If you understand Catalan, please check out the original discussion in the Catalan wikipedia, to see what's worrying us.

Mind you, it's not a matter of one-hit wonders and vandals. Brilliant Wikipedians, with several featured articles under their belt, make mediocre articles when using the content translator. The overall quality inevitably goes south.

Endo999 (talkcontribs)

Google Translation is actually getting better for many of the main language pairs now, due to its shift to a deep learning (neural net) paradigm. I can say that since they did this they are getting better grammar in the French to English translations.

However, no machine translation can stand by itself. It needs a person fluent in the destination language to massage it into correct grammar and meaning in that language.

I have made a suggestion before, that people take out their own translation API keys and upload them to their preferences in their Wikipedia accounts. Thereupon, Wikipedia uses the translation engine of choice for the translator, if they wish to use machine translation. Since Google is probably the best service for many language pairs, this would allow Wikipedia to have the translator pay for this pay-for-use service. This would get around the strict Open Source policy of Wikipedia. Apertium is a noble attempt at Open Source translation, but most people would say it's not as good as Google translation, and not likely to be in the future either.

I think that attempts to unduly limit the use of machine translation in articles are actually attempts to slow the rate of translation between wikis. Already, translations into the enwiki are 1/10 that of translations into the cawiki. Wikipedia is about the increase of knowledge sharing, not limiting it.

Joutbis (talkcontribs)

I agree that Wikipedia is about knowledge sharing, and I have done myself quite a few translations from English and French into Catalan. But I don't see what the incomprehensible babble of Catalan words that the content translator is generating right now is doing for knowledge sharing. On the contrary, it tends to create large articles that no one can understand. If non-wikipedians see one of these articles, they will reach the conclusion that Wikipedia is no use, and they will be less likely to try it again. So wikipedians have to either chase and delete these articles, or try to fix them (going back to the original source to try to make sense, and spending a long time). I think our efforts would be better invested in creating better articles.

Perhaps translations into English are really getting better. Good for you. Into Catalan, they are still awful. Google, and non-Google.

This is not about limiting knowledge sharing. No way. It's about a computer application that's not fit for production use: the language is incomprehensible, and there are at least two important format errors. Other wikipedias may not have these problems, and that's great for them. But we want to have the choice and decide when and how we deploy this tool.

Pesky Catalans demanding to vote. Damn, it's becoming a pattern! :-)

Joutbis (talkcontribs)

Really, I don't see the point in trying Google Translate. If it's what you get in the interactive version at translate.google.com, it's not really worth it. It may be better than Apertium, if you say so, but it's still very far from acceptable.

I have seen a new development in the behavior of the content translator: please see the last reference in w:Ca:Eli Lieb. There's a whole blob of HTML code (four nested divs!!!), and I don't know what it really means, but it does look like someone is testing software in a production environment.

There are probably a few more examples, like w:Ca:España (diari), although in this one the editor had the presence of mind to edit all the junk out. He is in a select minority, mind you.

The thing is: given the irregular status of machine translation across different languages (some may have acceptable quality translators, some definitely don't), can you please give the individual wikipedias the choice whether to incorporate content translation or not?

Halibutt (talkcontribs)

@Townie, judging by the first article that appeared there (Abadia territorial de Santa Maria di Grottaferrata), the error is quite simple to fix: the correct template is there in the code (Ref-Web), but the names of the fields are left in the original language ("titolo=" instead of "títol=", and so on).
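The field-name fix described above can be sketched as a simple parameter rename. This is only an illustration, not how Content Translation maps templates: `rename_params` is a hypothetical helper, the sample template call is made up, and the mapping contains only the one pair mentioned in this thread ("titolo" to "títol"); any other pairs would have to be taken from the actual template documentation.

```python
import re

# Only the pair named in the thread; extend from the real template docs.
FIELD_MAP = {"titolo": "títol"}


def rename_params(template_text: str, mapping: dict) -> str:
    """Rename named parameters (|name=value) in a template call.

    Parameter names not present in the mapping are left unchanged, and
    positional parameters (no "=") are never matched.
    """
    def repl(match):
        name = match.group(2)
        return match.group(1) + mapping.get(name, name) + match.group(3)

    return re.sub(r"(\|\s*)([^=|{}]+?)(\s*=)", repl, template_text)


# Made-up template call for illustration.
before = "{{Ref-web|titolo=Esempio|url=http://example.org}}"
after = rename_params(before, FIELD_MAP)
print(after)
```

A regex is enough for flat template calls like this one; nested templates inside parameter values would need a real wikitext parser.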

Joutbis (talkcontribs)

This particular article was translated in 2015, and yes, it has a problem with the template. But this is not what this thread is about. The current problem is that the tool outputs just the string "error in title or URL", not the original template, not a translated template. And it has been going on for months. Please check the examples.

There are other errors reported in this thread, if anybody cares to look into them. But what we ask is for a way to limit the use of the content translation, because it's not ready for production, at least in Catalan.

Reply to "Limiting the use of the translator"

Error converting HTML to wikitext: docserver-http: HTTP 400

Sthelen.aqua (talkcontribs)

Hi!

When trying to publish a Ukrainian translation of an English Wikipedia article, I keep getting this error: "Error converting HTML to wikitext: docserver-http: HTTP 400"

Please help!

Amire80 (talkcontribs)

In which article did this happen?

Sthelen.aqua (talkcontribs)

It happened with En:Jodrell Bank Observatory. I'm sure I put the link to the article in my message, and I'm shocked to see it is not there. Just now, after saving, I saw that the link was automatically removed from my post again.

Amire80 (talkcontribs)

The problem with the link is another bug: https://phabricator.wikimedia.org/T63725

Thanks for the article name, we'll check it soon.

Halibutt (talkcontribs)

Sorry to grave-dig, but the very same error happens when I try to publish my translation of Battle of Radzymin to my Polish draft space. No idea what to do about it. Confirmed it on both Firefox and Edge. Some 12 GB of RAM left, so it's not that problem either.

Sthelen.aqua (talkcontribs)

This error has been a frequent visitor on this board lately, but almost everyone creates a separate post for it. And it still has not been resolved.

I think the root of the problem might be the automatic translation of the template/infobox at the top of the article. Other articles (where I did not try to auto-translate the infobox and did not touch it) were published just fine.

Reply to "Error converting HTML to wikitext: docserver-http: HTTP 400"
Halibutt (talkcontribs)

I have been translating my article on the Battle of Radzymin (1920) from English to Polish for quite some time now, returning to it every once in a while. Some time ago I clicked on the infobox to copy it to the translated side and check whether translating infoboxes finally works. The infobox didn't get copied over; instead, the three dots at the top keep blinking but nothing happens. In addition, a grey overlay appeared over the entire translated side. Yesterday I could still click it and type my translation there (though clicking on templates or references didn't work), but later that night something happened and I can't click the translated side any more (or rather I can, but there's no cursor and I can't type anything).

Seems somehow related to the infobox, but frankly, I have no idea. Is there a debug log I could post? Here's the screenies:

Gouzoup (talkcontribs)

Yes, I had the same problem: infoboxes are never translated correctly and cause bugs.

Reply to "The engine breaks on infobox"

Bug with Content Translation tool - copying text to wrong section

Stinglehammer (talkcontribs)

Hi, User:EmilieKi has been translating Portrait_of_a_Lady_(van_der_Weyden) from English Wikipedia to German Wikipedia as part of a Translation Studies MSc assignment.

As the attached PDF file she sent me demonstrates, the 'Notes' section on page 1 was translated across successfully, but the 'Bibliography' section on page 2 was not translated at all. Instead, the 'Notes' section was translated again in its place. This could not be rectified, so the translation had two 'Notes' sections and no 'Bibliography' section.

Reply to "Bug with Content Translation tool - copying text to wrong section"
Fringio (talkcontribs)

I don't really know if this is a bug or a feature, but why does ContentTranslator add non-breaking spaces to the translated text? I mean, if I have this original text

"Batman is a superhero"

and I translate it to

"Batman è un supereroe"

then, since I have translated only "is a superhero", ContentTranslator adds a non-breaking space between "Batman" and "è un supereroe", because I haven't touched that space during the translation.
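For anyone who wants to check a published translation for this, here is a minimal sketch that finds non-breaking spaces (U+00A0) and replaces them with ordinary spaces. It assumes you have the article text as a plain string; `find_nbsp` and `normalize_spaces` are hypothetical helper names, not part of the tool.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE


def find_nbsp(text: str) -> list:
    """Return the index of every non-breaking space in the text."""
    return [i for i, ch in enumerate(text) if ch == NBSP]


def normalize_spaces(text: str) -> str:
    """Replace every non-breaking space with a regular space."""
    return text.replace(NBSP, " ")


# The example from this post, with a non-breaking space after "Batman".
translated = "Batman\u00a0è un supereroe"
positions = find_nbsp(translated)
cleaned = normalize_spaces(translated)
print(positions, cleaned)
```

Note that a blanket replacement also removes non-breaking spaces that an editor inserted on purpose, so it is safer to review the reported positions before cleaning.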

Amire80 (talkcontribs)

This is a known bug: https://phabricator.wikimedia.org/T119379

We'll try to take care of it.

Reply to "Non-breaking space"