Help talk:Extension:Translate/Page translation administration

From mediawiki.org

Example templates made by the community:

--Nemo 19:35, 22 March 2013 (UTC)

Lists

Short messages make it more likely to get a hit from the translation cache. That breaking a sentence is not okay is easy to understand, but why this example?

<translate>
* General principles
* Headings
* Images
* Tables
* Categories
</translate><translate>
* Links
* Templates
</translate>

Why not make each point a single translation unit, so that the translator cannot forget to include the bullet point? The bullet is markup, IMHO, and therefore should not be part of a translation unit:

* <translate>General principles</translate>
* <translate>Headings</translate>
* <translate>Images</translate>
* <translate>Tables</translate>
* <translate>Categories</translate>
* <translate>Links</translate>
* <translate>Templates</translate>

The reason you are using bullet lists is that each point usually does not rely on another one. They can be freely moved around. That's why the HTML output is a <ul> element (an unordered list). -- Rillke (talk) 15:25, 30 October 2013 (UTC)
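The per-item markup suggested above can be produced mechanically. The following is only a minimal sketch in Python, not part of the Translate extension; the function name `wrap_list_items` is made up for illustration, and it assumes simple one-line list items.

```python
import re

# A wikitext list line: one or more list markers, optional spaces, then the text.
LIST_LINE = re.compile(r"^([*#;:]+)\s*(.*)$")

def wrap_list_items(wikitext):
    """Wrap the text of each list item in its own <translate> pair,
    leaving the list marker (which is markup) outside the unit."""
    out = []
    for line in wikitext.splitlines():
        m = LIST_LINE.match(line)
        if m and m.group(2):
            out.append(f"{m.group(1)} <translate>{m.group(2)}</translate>")
        else:
            out.append(line)  # non-list lines pass through unchanged
    return "\n".join(out)

print(wrap_list_items("* Links\n* Templates"))
```

With this layout a translator only ever sees the item text, so the bullet cannot be dropped by accident.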

It seems to me that this is exactly what the page currently says. :) It can only afford one example so it makes a middle way/generic one. Sometimes the whole list has to be in one unit, sometimes it's better to leave the bullet inside the unit for clarity, etc. --Nemo 15:33, 30 October 2013 (UTC)
I still prefer the first markup, which is much simpler (only one pair of translate tags) and preserves the fact that they are list items.
However, when marking the page, the Translate tool should isolate each item in a separate unit like this:
<translate>
* General principles<!--T:1-->
* Headings<!--T:2-->
* Images<!--T:3-->
* Tables<!--T:4-->
* Categories<!--T:5-->
</translate><translate>
* Links<!--T:6-->
* Templates<!--T:7-->
</translate>
i.e. treat list items (either bulleted and unordered, or numbered and ordered) as well as definition lists (and indented blocks), based on the prefixes "*", "#", ";" and ":", as separate paragraphs that each make a separate translation unit. Treat them exactly like "==Headings==". Make sure that these separate blocks are kept in separate translation units, as they are never the same paragraph (though they may still be phrases in the same full sequence, even when their final punctuation is often dropped when it is an implied full stop for non-verbal sentences).
This would greatly facilitate the translation of long lists (without having to scroll vertically in the translate tool while looking at the source language at the top but filling in the translation at the bottom), and would allow the source to insert or remove some items or reorder them and still benefit from the translation memory.
Verdy p (talk) 02:29, 24 January 2014 (UTC)
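To make the proposal concrete, here is a rough Python sketch of the splitting rule described above. It is an illustration only and not the extension's code; the names `split_into_units` and `number_units` are hypothetical, and grouping non-list lines by blank-line-separated paragraph is an assumption.

```python
import re

# Lines starting with one of these characters are list or definition items.
LIST_PREFIX = re.compile(r"^[*#;:]")

def split_into_units(block):
    """Split the body of one <translate> block into units:
    every list line is its own unit, other lines are grouped by paragraph."""
    units, paragraph = [], []

    def flush():
        if paragraph:
            units.append("\n".join(paragraph))
            paragraph.clear()

    for line in block.splitlines():
        if LIST_PREFIX.match(line):
            flush()
            units.append(line)
        elif line.strip() == "":
            flush()
        else:
            paragraph.append(line)
    flush()
    return units

def number_units(block, first_id=1):
    """Pair each unit with a T:id-like number, starting at first_id."""
    return list(enumerate(split_into_units(block), start=first_id))

print(number_units("* Links\n* Templates", first_id=6))
```

Because each item keeps its own identifier, reordering or removing an item in the source would leave the other items' translations untouched.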
The code to modify to support lists (as well as every wiki syntax that depends on specific characters at the beginning of lines) is in
[mediawiki/extensions/Translate/tag/TPSection.php] with the regexp defined in line 52,
which only matches the equal sign "=" but forgets to parse lines starting with any of "*#;:", plus lines starting with spaces (preformatted text), lines starting with "{|" to locate tables, and lines within tables starting with "|" or "!".
These lines, when they are found within a translate section, must not be broken without care: newlines found in replaceable translated items will break the generated code, so the content of the translate section should be split. (Exception: within tables, only the first line must be split; the rest is the content of a cell, which may be full paragraphs, or another embedded table or list.)
Also the parser should detect the presence of "noinclude" sections and HTML comments, which should be treated like "tvars" (their content should also be parsed as separate units, taking into account the fact that they start on a special line so their first line is also unbreakable, with the exception of HTML comments; if there are tags, a tag may span several lines for its attributes).
Also the parser should detect the presence of double opening braces and should then start counting them because they will also allow breaks in their content. In fact it's not easy to manage all the combinations: it requires a full wiki parser to detect the boundaries. But basic lists should still work.
The code also does not behave correctly if "==headings==" embed newlines (for example with the presence of a span element or any other inline HTML element, or a template transclusion, defining for example an id attribute, or HTML comments, as this inserted code may also be internally on multiple lines): such inserted code should be handled like a "tvar" (the name of the tvar should be generated, and could be $1, $2, ... or _1, _2, ...).
Once they are split, each of these lines should get its own separate "T:id" by becoming a separate unit. In summary, a single line of wiki code would be split into multiple units where it embeds newlines internally. For now this parser code is too optimistic about the expected valid contents of a translate pseudo-element. But at least, it should be able to handle the simple cases of lists and preformatted text (possibly also the <poem>...</poem> elements which also preserve newlines in their content).
Verdy p (talk) 13:21, 24 January 2014 (UTC)
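As an illustration of the line-start cases listed above, here is a small Python sketch that classifies a wikitext line by its first characters. It is not the regexp from TPSection.php (which is written in PHP); the group names and the `classify` helper are assumptions made for this example.

```python
import re

# One alternative per line-start-sensitive syntax mentioned in the comment above.
LINE_START = re.compile(
    r"""^(?:
        (?P<heading>=+.*=+\s*$)   # ==Heading==
      | (?P<list>[*#;:])          # list, definition or indented line
      | (?P<preformatted>[ ])     # leading space: preformatted text
      | (?P<table_start>\{\|)     # opening of a table
      | (?P<table_row>[|!])       # row, cell or header line inside a table
    )""",
    re.VERBOSE,
)

def classify(line):
    """Return the kind of line-start markup, or "paragraph" if there is none."""
    m = LINE_START.match(line)
    return m.lastgroup if m else "paragraph"

for sample in ["==Headings==", "* Links", " code", "{| class=\"wikitable\"", "! Header", "Plain text"]:
    print(repr(sample), "->", classify(sample))
```

A splitter could then refuse to merge any line whose class is not "paragraph" with its neighbours. Nested constructs (templates, tables inside cells) are out of reach of such a line-based check, as noted above.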

Proposal: adding T:ins and T:del pseudo-elements

Something missing: I'd like to have the following syntaxes supported and recognized by the tool when marking pages:

<!--T:ins>some wiki text to insert into translated subpages, but ignored when rendering the base source page</-->
<T:del>some wiki text to remove in translated subpages, but rendered in the base source page</>

These two kinds of pseudo-elements could be outside of translate tag pairs, but if they are, the "T:ins" would be visible as "$placeholders" in translation units, and the "T:del" would be invisible (they are only in the source, never in translations).

Notes
  • Insertions are placed within special HTML comments in the source, but not the deletions; see below why.
  • But if these pseudo-tags are integrated in the MediaWiki parser as regular tags, the tags and contents of "T:ins" will be silently discarded, whereas for "T:del" only the open and close tags will be discarded, preserving their content.
  • It will no longer be necessary to mask the "T:ins" into pseudo HTML comments so the "T:ins" pseudo-element would become simply <T:ins>...</> (if using the short close tag) or <T:ins>...</T:ins> (more HTML- and XML-friendly, and allows easier parsing without counting open tags, notably if there are other competing MediaWiki extensions needing their own tags as well).
  • At the same time the special HTML comments for "T:id" used in source pages should become an "id=" attribute of the open "translate" tag (this would avoid cluttering the translation source pages with many line breaks and the many T:id special comments).
  • The usage of abbreviated close tags (already used for "tvar" pseudo-elements) should be deprecated using regular </tag> instead of just </>, because they may be ambiguous or they may need to be counted if "tvar"s are mixed within the contents of "T:ins" or "T:del" pseudo-elements.
  • The "translate" and "tvar" tags should also be integrated as regular tags, so that they are silently discarded when transcluding a page.
  • Later MediaWiki could interpret all these tags directly to perform what {{TNT}} currently performs (i.e. locating a translation subpage during template transclusion, and resolving language fallbacks): MediaWiki should also integrate the "Special:MyLanguage/" prefix so that it becomes directly usable instead of using the utility {{TNT}}; or this utility template would no longer need a Lua module but would simply add this special page prefix to the invoked template.

Commented example of usage of T:ins and T:del pseudo-elements

The purpose of this is to allow using for example a "TNT|" prefix in a source translation which would be removed from all generated subpages named with the /code suffix.

Let's say we want to make the following template "MyTemplate" translatable using a "/layout" subtemplate (we want to create "MyTemplate/en" or "/fr" and so on from the "MyTemplate" used as the source):

{{
  MyTemplate/layout
  | param1 = some text...
}}<noinclude>{{documentation}}</noinclude>

We would first adapt this source page for translatability as:

{{
  <T:del>TNT|</>MyTemplate<!--T:ins>/layout<-->
  | param1 = <translate>some text...</translate>
}}<noinclude>{{documentation|Template:MyTemplate/doc}}</noinclude>

This means:

  • delete "TNT|" before the base template name
  • insert "/layout" after the base template name
  • translate "some text..." in the Translate tool interface used by translators, and insert its replacement in the generated subpages.
  • the template documentation (templates should have one) is shared between all its translated versions, by passing its doc name explicitly (the documentation may be translated separately using the same techniques as for the base template!); this doc sharing is also useful if you want to create a "/sandbox" version of the base template, where it will display some basic autotests of rendering within the documentation itself (this shared documentation may also be transcluded in the "/testcases" subpage using the same transclusion syntax as for the doc page, to exhibit the syntax to use in test cases).

When marking the source page for translation it becomes:

{{
  <T:del>TNT|</>MyTemplate<!--T:ins>/layout<-->
  | param1 = <translate><!--T:1-->
some text...</translate>
}}<noinclude>{{documentation|Template:MyTemplate/doc}}</noinclude>

Or it could also be:

{{
  <T:del>TNT|</>MyTemplate<!--T:ins:1>/layout<-->
  | param1 = <translate><!--T:2-->
some text...</translate>
}}<noinclude>{{documentation|Template:MyTemplate/doc}}</noinclude>

(so that if the "T:ins" is in the middle of a translate element, it will have an identifier usable in placeholders for editing translation units; here it would be the $1 placeholder)

MediaWiki still interprets this base source page without the full "T:ins" tag and its content, as it is within HTML comments, and will drop the leading "T:del" tag and its ending short tag (just like it drops the leading and final "translate" tags but keeps their contents), but not its content. So the base source page will render as if it were still:

{{
  TNT|MyTemplate
  | param1 = some text...
}}<noinclude>{{documentation|Template:MyTemplate/doc}}</noinclude>

When saving the generated "/en" subpage for English, it becomes exactly like the original untagged and untranslated version:

{{
  MyTemplate/layout
  | param1 = some text...
}}<noinclude>{{documentation|Template:MyTemplate/doc}}</noinclude>

When we translate the English unit "some text..." into French "un peu de texte...", the generated "/fr" subpage also becomes:

{{
  MyTemplate/layout
  | param1 = un peu de texte...
}}<noinclude>{{documentation|Template:MyTemplate/doc}}</noinclude>
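The two renderings walked through above can be simulated with a few regular expressions. This is only a sketch in Python of the proposed behaviour: neither "T:ins" nor "T:del" exists in the Translate extension, the delimiters follow the example above (`<!--T:ins>…<-->` and `<T:del>…</>`), and the function names are hypothetical.

```python
import re

# Proposed (not implemented) pseudo-elements, written as in the example above.
T_INS = re.compile(r"<!--T:ins(?::\d+)?>(.*?)<-->", re.DOTALL)
T_DEL = re.compile(r"<T:del>(.*?)</>", re.DOTALL)
# A translate pair, optionally already marked with a <!--T:n--> unit id.
TRANSLATE = re.compile(r"<translate>(?:<!--T:\d+-->\n?)?(.*?)</translate>", re.DOTALL)

def render_source(text):
    """Base page: T:ins disappears with its content, T:del keeps its content."""
    text = T_INS.sub("", text)
    text = T_DEL.sub(lambda m: m.group(1), text)
    return TRANSLATE.sub(lambda m: m.group(1), text)

def generate_subpage(text, translations):
    """Translated subpage: T:ins content is inserted, T:del content is removed,
    and each unit is replaced by its translation (or left as the source text)."""
    text = T_INS.sub(lambda m: m.group(1), text)
    text = T_DEL.sub("", text)
    return TRANSLATE.sub(lambda m: translations.get(m.group(1), m.group(1)), text)

SOURCE = """{{
  <T:del>TNT|</>MyTemplate<!--T:ins>/layout<-->
  | param1 = <translate>some text...</translate>
}}"""

print(render_source(SOURCE))
print(generate_subpage(SOURCE, {"some text...": "un peu de texte..."}))
```

The first call prints the "TNT|MyTemplate" form used by the base page, the second the "MyTemplate/layout" form stored in the "/fr" subpage.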

Note that with such a design, we don't have to clutter the code of the "/layout" template:

  • the "/layout" template contains NO translate tags at all, it is never the base for translations;
  • the "/layout" template remains fully editable but contains parameters surrounding default values in English; like {{{param1|some text...}}} here;
  • the coders of the "/layout" template don't have to worry about whether other translations exist, as these are created independently; the layout may then be blocked without blocking the Translate tool for the base template (if needed, the base template could be blocked as well, but not in cascade, as that would forbid adding or modifying all translations);
  • and we can also still edit the "/layout" template as well as the base template to add new translatable items with defaults in English at any time. A translate admin can then mark only the base templates without having to look deeply into the possibly complex "/layout" template.
  • we benefit from the separation of work between wiki template designers/coders, and the more competent translators in various languages. Translate admins don't have to inspect everything and don't need to be experts in wiki syntax: we can have translate admins for more languages.

Then we can invoke {{MyTemplate}} directly. Note that due to the fact that "TNT" is no longer referenced in the subpages, there will never be any self-recursion if we invoke {{MyTemplate/en}} to use the English version directly.

It then becomes possible to design the base template so that it will automatically use {{TNT}} where needed, without the risk that it self-recurses within the translated subpages that TNT will attempt to locate and load, because the invocation of TNT is deleted in subpages!

Verdy p (talk) 03:29, 24 January 2014 (UTC)

Why we need these new pseudo-elements?

On Meta-Wiki, where these possibilities have been tried out successfully, the absence of support for these pseudo-elements means that templates using the above techniques to be fully autotranslatable sometimes offer absolutely no way to determine what code will also be generated in subpages. We could rename the base template and recreate another base template calling TNT with the renamed template as its parameter.

However, this multiplies the number of subpages, and it would still break the existing code, which currently uses, at random, either:

  • {{MyTemplate}} or {{TNT|MyTemplate}} (both versions are working now interchangeably with the local version of the Lua module used on Meta-Wiki by m:Template:TNT)
  • or {{MyTemplate/en}} directly.

But in the latter case,

  • when transcluding directly the "/en" version, it contains the same code as what is found in the base template,
  • so the "/en" template version will still attempt to use "TNT" on the base template name,
  • then TNT will resolve the template to use, possibly as the English version, so it will attempt to reinclude the "/en" version: we get a self-recursion, and a script error.

The self-recursion of the same template (forbidden in MediaWiki) occurs when trying to use {{MyTemplate/en}} directly in a viewed page that is also in English. TNT has in this case absolutely no way to determine which other template to try, if it detects the error.

The direct transclusion of the "/en" version in a (e.g.) German page will also use TNT, which will resolve the translated template to use as the "/de" version; there will be no self-recursion, but the template will appear in German even though the English version was requested!

The direct transclusion of the "/de" version in the same German page attempts to transclude directly the "/de" version of the template (it will also use TNT that will attempt to reinclude the same "/de" template), and will cause the self-recursion error, so this is absolutely not specific to English.

In addition, TNT is designed to render its given template preferably in the same language as the content language of the currently viewed page, not preferably the user language. Fallbacks will still apply to every preferred language specified or implied. So this is a generic problem.

We'll get the same problems if, instead of using TNT with its default resolution, we pass it another preferred language. The effective target template (to be expanded by TNT) has no way of determining by itself, in MediaWiki syntax, that it is being transcluded conditionally by TNT, possibly in a context where the same template is itself calling TNT, simply because it cannot trace back its invocation chain before using TNT. MediaWiki does not even allow the template to test this itself: the second invocation of the template will be blocked before its expansion, even if the parameters are different. MediaWiki forbids all self-recursions when expanding templates and blocks them early (including tail recursions, which could produce infinite loops that are very costly in server resources, even if they are given limited time and memory to complete or to be interrupted abruptly).

With "T:ins" and "T:del", it is simple to avoid making this test in the base template (the same test would also necessarily be performed in the translated versions, needlessly). The translated versions ("/en" included) no longer need to know whether we must use TNT or not: the translated template will no longer call TNT, because this use of TNT has been removed from all the generated translated versions!

Alternative if we don't have these new pseudo-elements?

Most often, transcluding {{MyTemplate/en}} directly is wrong when it should be {{MyTemplate}} only, so that English pages will get the English version of the template, German pages will get the German version...

This is easy to fix, but sometimes it may be desirable to force the language to render, using another syntax that will allow TNT to look for this specific language version (before trying the fallbacks of this specified language): in that case the template should be invoked like {{MyTemplate|uselang=de}} with the language code we really want to render, but this requires modification of templates.

Forcing the language to transclude may be wrong as well for another reason: the translation may have been deleted or still not created, and language fallbacks should still be used to find the most relevant translation. Using TNT will still be useful in that case.

In summary, pages should never directly transclude any language-specific version of a translatable template, and should only use the base template, passing it a parameter for the preferred language. The same is true when creating links to other content pages which will be translated (links should use "Special:MyLanguage/" to resolve the target according to the preferred user language).

Verdy p (talk) 06:29, 24 January 2014 (UTC)
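The fallback behaviour argued for above can be sketched as a simple lookup. This Python snippet is not how {{TNT}} or its Lua module is implemented; the function name, the page set and the fallback table are illustrative assumptions, and English is assumed to be the last resort.

```python
def resolve_translation(base, preferred_lang, existing_pages, fallbacks):
    """Return the title of the best existing translation subpage of `base`:
    the preferred language first, then its fallbacks, then English,
    and finally the base page itself if nothing else exists."""
    chain = [preferred_lang, *fallbacks.get(preferred_lang, []), "en"]
    for lang in chain:
        title = f"{base}/{lang}"
        if title in existing_pages:
            return title
    return base

# Illustrative data only.
existing = {"Template:MyTemplate/en", "Template:MyTemplate/de"}
fallbacks = {"de-at": ["de"]}

print(resolve_translation("Template:MyTemplate", "de-at", existing, fallbacks))
print(resolve_translation("Template:MyTemplate", "fr", existing, fallbacks))
```

A page that hard-codes {{MyTemplate/fr}} gets none of this: if the "/fr" subpage is missing, nothing falls back to another language.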

Bad section heading recommendations

The recommendation for sections is clearly wrong, recommending that a <translate>...</translate> span across multiple sections. This makes section editing a bit confusing in the source editor, in addition to making it much harder to drill down into editable bits in visual editor. --brion (talk) 17:05, 1 April 2016 (UTC)

The first "wrong" example appears to work just fine. If there's no objection I'll switch the recommendations. --brion (talk) 17:30, 1 April 2016 (UTC)
Per phab:T131516 the obviously-correct markup breaks because of the "markup for translate" step inserting a newline into the section heading. (It's fine until you run that.) It's not possible to manually remove the newlines because the extension demands the newlines be there or it rejects your edit. This is clearly incorrect and should be fixed in the extension. --brion (talk) 17:40, 1 April 2016 (UTC)
Quite a bold statement. This is not the only case of newlines being meaningful in wikitext, the most important example being lists. See the discussion I started a while ago, [Wikitext-l] Have multiline tags not terminate the list item until the tag is terminated. Nemo 08:32, 2 April 2016 (UTC)

In any case, I believe the text and the example on the page say 2 different things. --Elitre (WMF) (talk) 09:17, 30 June 2016 (UTC)

I don't know where the doubt about the recommended tagging for section headers comes from, as I have explained it in multiple places why it is the only way that works well. I have clarified the wording based on your feedback. --Nikerabbit (talk) 08:29, 5 October 2016 (UTC)
It still conflicts with "Headers can in principle be tied to the following paragraph, but it is better to have them separated. This way someone can quickly translate the table of contents before going into the contents." Also, it would entail forcing people to come up with a different translation for the header every time, when in several cases (like Tech News, newsletters etc.) there are consolidated machine memory translations which are usable with just one click. --Elitre (WMF) (talk) 13:26, 11 October 2016 (UTC)
You must have been reading an old version of the page. The new version says "Headers can in principle be tied to the following paragraph, but it is better to have them separated with an empty line." I just re-marked the page for translation. --Nikerabbit (talk) 14:27, 11 October 2016 (UTC)
Yes. The given example though still includes extra text, so my remark stands :) I also believe that section editing (with the visual editor) works with at least one of the "wrong" examples. --Elitre (WMF) (talk) 14:33, 11 October 2016 (UTC)
Empty line is a translation unit separator, so it would in fact create two different translation units. Of course there are known issues with parsoid&VE which makes the whole tag thing ugly there and inadvertently section editing might also work due to that. --Nikerabbit (talk) 14:50, 11 October 2016 (UTC)
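The effect of the empty line can be shown with a toy splitter. This is a simplified Python sketch based only on the statement above that an empty line separates translation units; the extension's actual parsing may differ, and the function name is made up.

```python
import re

def split_on_blank_lines(translate_body):
    """Split the body of a <translate> block into units at empty lines."""
    parts = re.split(r"\n\s*\n", translate_body.strip())
    return [p for p in parts if p.strip()]

# Heading separated from its paragraph by an empty line: two units.
print(split_on_blank_lines("== Heading ==\n\nFirst paragraph."))
# Heading tied to the following paragraph: one unit.
print(split_on_blank_lines("== Heading ==\nFirst paragraph."))
```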

About marking outdated translations

With the new version, the phrase "...and outdated translations will be replaced temporarily with the original source text." is obsolete. Currently, outdated translations are not replaced, but highlighted. --Kaganer (talk) 16:44, 16 January 2017 (UTC)

I've attempted to explain "Invalidation" in the page. (diff) It was outdated in that section, and unexplained in general. Please sanity-check (and improve or fix errors), and then mark for translation once it's good to go. Thanks! Quiddity (WMF) (talk) 23:29, 21 March 2017 (UTC)

Is there any way to convey additional information to the translator?

For example of what I mean, see this edit on Commons. Andrybak (talk) 11:52, 11 September 2018 (UTC)

Change content of the relevant /qqq page (ie. c:Translations:Template:Move/i18n/14/qqq). Matěj Suchánek (talk) 14:20, 11 September 2018 (UTC)
Matěj Suchánek, thank you! Andrybak (talk) 02:14, 13 September 2018 (UTC)
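Going by the example title above (without the "c:" interwiki prefix), the documentation page for a unit follows a predictable pattern. A tiny Python sketch, with a hypothetical helper name, that builds such a title from the translatable page and the unit number:

```python
def documentation_page(translatable_page, unit_id):
    """Build the title of the /qqq (message documentation) page for one unit,
    following the pattern of the example title above."""
    return f"Translations:{translatable_page}/{unit_id}/qqq"

print(documentation_page("Template:Move/i18n", 14))
```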

Beginner and lost - help?

Hello,

The translation extension is great, but the doc page is very hard for me to read. I was not able to find even some core information. So, I am asking for help with 2 things:

  • What should I do after adding translate tags to a page? Do I have to directly contact a translation admin of my choosing? Currently, I am trying to mark d:Wikidata:WikiProject_COVID-19 for translation.
  • I started a beginners manual to the extension:translate, to make it easier for newcomers. If you want to help writing it, it would be super welcome: User:TiagoLubiana/Beginners_guide_to_translation. It is still largely a stub.

Best, TiagoLubiana (talk) 19:35, 16 April 2020 (UTC)

Is this really correct?

I believe the segment about images is wrong. I think one should never include a full image in a translation unit, as that would only create more work for the translators. What would ever be the point of that? Either remove the image from the translation unit or only mark the description for translation. --Sabelöga (talk) 11:22, 9 November 2020 (UTC)

Some images may be localized (e.g. screenshots). --Pols12 (talk) 21:05, 26 January 2021 (UTC)