Talk:Requests for comment/Scoped language converter



What if rules conflict

If a page imports a number of category rules, the rules are imported category by category. Correct me if I got it wrong.

But there may be rules that conflict with each other. Is there any meta-rule about import order or ruleset priority?

-- Preceding unsigned comment added by Fantasticfears (talk • contribs)

Don't worry, we can contact the Chinese Wikipedia (from here) and ask for help. --Great Brightstar (talk) 16:57, 24 September 2013 (UTC)

To be discussed on April 2nd

We're going to be talking about this RfC in an architecture review meeting on 2 April, on IRC. Schedule and link coming soon. Sharihareswara (WMF) (talk) 22:53, 19 March 2014 (UTC)


  • "If multiple rules apply to the same text": longest-key-preferred is a must, instead of "the last-specified one is applied".
    • When there are multiple keys with the same length, using the last one as a tie-breaker is fine.
  • Definition of text source
    • "Foo {{see also|Bar}}" => "Foo See Also: Bar": Foo and Bar should be in the same context (scope), but See Also shouldn't.
  • Definition of position of conversion markup
    • Is markup combined from different wikitext sources not allowed anymore? => -{ {{echo|blah}} }-
    • An evaluation of affected templates is needed, especially {{地区用词}}, {{cite *}}, etc.

Liangent (talk) 08:28, 3 April 2014 (UTC)
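
For illustration, the longest-key-preferred behaviour asked for above could be sketched roughly as follows. This is only a minimal Python sketch, not the converter's actual code: the function name `convert` and the representation of rules as a list of (key, replacement) pairs are assumptions made for the example. It scans left to right, so two keys of equal length that match at the same position are necessarily the same key, and the tie-breaker reduces to letting the later rule overwrite the earlier one.

```python
def convert(text, rules):
    """Apply rules to text, preferring the longest matching key.

    `rules` is a list of (key, replacement) pairs in the order they
    were specified (a hypothetical representation for this sketch).
    """
    # Tie-breaker: a later rule with the same key overwrites an earlier one.
    table = {}
    for key, replacement in rules:
        if key:  # ignore empty keys, which could never advance the scan
            table[key] = replacement

    max_len = max((len(key) for key in table), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in table:
                out.append(table[candidate])
                i += length
                break
        else:
            # No rule matches at this position: copy one character through.
            out.append(text[i])
            i += 1
    return "".join(out)


rules = [("ab", "X"), ("abc", "Y"), ("ab", "Z")]
print(convert("abcab", rules))  # "abc" beats "ab"; the later "ab" rule wins
```

With "the last-specified one is applied" instead, the first three characters of "abcab" would be handled by the "ab" rule and the longer "abc" rule would never fire, which is the situation the comment above wants to avoid.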

Additional comments, from Liangent, pasted from email with her permission:
Not sure about the Urdu / Hindi case, but how did they get different ISO 639-1 codes if they're the same language? And what's the situation with North / South Korean?
* Goal of this RFC
I'm feeling a little confused now. I think that what I discussed with gwicke at the last Wikimania was a plan to describe the current syntax in the Parsoid DOM, so VE can read all current pages, optionally play with them, and finally put them back when saving (that is, have wikitext with conversion syntax generated so it works again in the PHP parser, in the current phase). However, cscott pointed out that creating a sane editor UI in VE with the current design is difficult, and that once Parsoid becomes the main parser, conversion itself would become difficult or impossible(?) too. So this RFC got drafted, and the new infrastructure and rules under this (or one of these) scheme are intended to be a *replacement* for all the current stuff. Is this correct?
* Migration path
The proposed path should be: (1) Modify the PHP parser to understand the new syntax. (2) Make Parsoid understand the new syntax. (3) Fix all on-wiki content to use the new syntax; at this point, all content in VE would appear correctly. (4) Optional: disable handling of the old syntax in the PHP parser, to verify that everything works fine and no old syntax remains. (5) Wait for full migration to Parsoid in the future (will it ever happen?).
I think step 1 would be very difficult and would need a heavy rewrite of the parser and/or converter (it doesn't have any concept of "scope"). I'm not sure whether it is worth it if (5) happens. Maybe another option is to skip step 1 by making Parsoid understand the new syntax as well as the old syntax, which doesn't need to have its semantics fully implemented, e.g. just treat -{A| }- as a global rule, with gwicke's design. Details are not considered yet.
* And details of the new syntax
To be discussed later. I left some ideas on
Sharihareswara (WMF) (talk) 17:35, 7 April 2014 (UTC)[reply]

Asking for comment

Requested comment on the mediawiki-i18n list. Sharihareswara (WMF) (talk) 20:21, 20 May 2014 (UTC)

Issue tracking

I still have difficulty grasping the extent of the issues and where they are. Well-organised bugzilla reports would help a great deal; we currently seem to have a huge moloch, "support LC in parsoid", which doesn't help anything. See also bugzilla:41716#c47. --Nemo 15:32, 17 October 2014 (UTC)

Content Model for Rule Definition

For a while now, MediaWiki has supported "content models". A content model is a "kind" of content, e.g. Wikitext, JavaScript, CSV, SVG, etc. Whatever syntax is chosen to represent translation/transliteration rules, it should be backed by an appropriate ContentHandler for that model, which would implement rendering, validation, indexing, etc. -- Daniel Kinzler (WMDE) (talk) 21:04, 22 October 2014 (UTC)
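
To make the suggestion above a bit more concrete, here is a rough sketch of the three responsibilities mentioned (validation, rendering, indexing) for a rule-definition page. It is written in Python purely for illustration and is not MediaWiki's PHP ContentHandler API. The class name `RuleSetContent`, its method names, and the JSON rule format (a list of objects mapping a variant code to text) are all assumptions, since no syntax has been chosen yet.

```python
import json


class RuleSetContent:
    """Hypothetical stand-in for a rule-definition content model."""

    def __init__(self, raw_text):
        self.raw_text = raw_text

    def validate(self):
        """Return a list of error messages; an empty list means valid."""
        try:
            data = json.loads(self.raw_text)
        except json.JSONDecodeError as exc:
            return [f"not valid JSON: {exc.msg}"]
        if not isinstance(data, list):
            return ["top level must be a list of rules"]
        errors = []
        for index, rule in enumerate(data):
            if not isinstance(rule, dict) or not rule:
                errors.append(f"rule {index}: must be a non-empty object")
            elif not all(isinstance(text, str) for text in rule.values()):
                errors.append(f"rule {index}: every variant must map to text")
        return errors

    def render(self):
        """Render valid rules as one human-readable line per rule."""
        if self.validate():
            raise ValueError("cannot render invalid rule set")
        rules = json.loads(self.raw_text)
        return "\n".join(
            "; ".join(f"{variant}: {text}" for variant, text in rule.items())
            for rule in rules
        )

    def index_text(self):
        """Return plain text that a search index could store for the page."""
        if self.validate():
            return ""
        rules = json.loads(self.raw_text)
        return " ".join(text for rule in rules for text in rule.values())


page = RuleSetContent('[{"zh-hans": "计算机", "zh-hant": "電腦"}]')
print(page.validate())
print(page.render())
```

The point of the sketch is only that a dedicated model can reject a malformed rule page at save time, rather than leaving the error to be discovered when the rules are applied.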